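// Package s3 implements a restic backend for Amazon S3 and S3-compatible
// object stores.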
package s3

import (
"net/url"
"os"
"path"
"strings"
"time"
2016-08-21 17:46:23 +02:00
2023-10-01 11:40:12 +02:00
"github.com/restic/restic/internal/backend"
	"github.com/restic/restic/internal/errors"
	"github.com/restic/restic/internal/options"
)

// Config contains all configuration necessary to connect to an s3 compatible
// server.
type Config struct {
	Endpoint string
	UseHTTP  bool
	KeyID    string
	Secret   options.SecretString
	Bucket   string
	Prefix   string

	Layout       string `option:"layout" help:"use this backend layout (default: auto-detect) (deprecated)"`
	StorageClass string `option:"storage-class" help:"set S3 storage class (STANDARD, STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING or REDUCED_REDUNDANCY)"`

	// Warmup support (#5173): when EnableRestore is set (and the alpha
	// "s3-restore" feature flag is enabled), pack files stored in cold
	// storage classes are transitioned to hot storage before restores and
	// repacks (prune/copy), so downloads do not stall on per-pack restore
	// delays. See https://github.com/restic/restic/issues/3202.
	EnableRestore  bool          `option:"enable-restore" help:"restore objects from GLACIER or DEEP_ARCHIVE storage classes (default: false, requires \"s3-restore\" feature flag)"`
	RestoreDays    int           `option:"restore-days" help:"lifetime in days of restored object (default: 7)"`
	RestoreTimeout time.Duration `option:"restore-timeout" help:"maximum time to wait for objects transition (default: 1d)"`
	RestoreTier    string        `option:"restore-tier" help:"retrieval tier at which the restore will be processed (Standard, Bulk or Expedited) (default: Standard)"`
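
	// A sketch of how these options can be supplied on the command line; the
	// invocation below is illustrative only, and assumes the alpha feature
	// flag is enabled via the RESTIC_FEATURES environment variable:
	//
	//	RESTIC_FEATURES=s3-restore restic -o s3.enable-restore=true \
	//		-o s3.restore-tier=Bulk -o s3.restore-days=3 restore latest --target /restore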

	Connections         uint   `option:"connections" help:"set a limit for the number of concurrent connections (default: 5)"`
	MaxRetries          uint   `option:"retries" help:"set the number of retries attempted"`
	Region              string `option:"region" help:"set region"`
	BucketLookup        string `option:"bucket-lookup" help:"bucket lookup style: 'auto', 'dns', or 'path'"`
	ListObjectsV1       bool   `option:"list-objects-v1" help:"use deprecated V1 api for ListObjects calls"`
	UnsafeAnonymousAuth bool   `option:"unsafe-anonymous-auth" help:"use anonymous authentication"`
}

// NewConfig returns a new Config with the default values filled in.
func NewConfig() Config {
	return Config{
		Connections:    5,
		ListObjectsV1:  false,
		EnableRestore:  false,
		RestoreDays:    7,
		RestoreTimeout: 24 * time.Hour,
		RestoreTier:    "Standard",
}
}

func init() {
	options.Register("s3", Config{})
}

// ParseConfig parses the string s and extracts the s3 config. The two
// supported configuration formats are s3://host/bucketname/prefix and
// s3:host/bucketname/prefix. The host can also be a valid s3 region
// name. If no prefix is given the prefix "restic" will be used.
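//
// For example (illustrative mappings, derived from the parsing rules below):
//
//	s3:s3.amazonaws.com/bucket/prefix  ->  Endpoint "s3.amazonaws.com", Bucket "bucket", Prefix "prefix"
//	s3://server:9000/bucket            ->  Endpoint "server:9000", Bucket "bucket", Prefix ""
//	s3:http://server:9000/bucket/sub   ->  Endpoint "server:9000", Bucket "bucket", Prefix "sub", UseHTTP true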
func ParseConfig(s string) (*Config, error) {
	switch {
	case strings.HasPrefix(s, "s3:http"):
		// assume that a URL has been specified, parse it and
		// use the host as the endpoint and the path as the
		// bucket name and prefix
		url, err := url.Parse(s[3:])
if err != nil {
			return nil, errors.WithStack(err)
}

		if url.Path == "" {
			return nil, errors.New("s3: bucket name not found")
}

		bucket, path, _ := strings.Cut(url.Path[1:], "/")
		return createConfig(url.Host, bucket, path, url.Scheme == "http")
	case strings.HasPrefix(s, "s3://"):
		s = s[5:]
	case strings.HasPrefix(s, "s3:"):
		s = s[3:]
	default:
		return nil, errors.New("s3: invalid format")
}

	// use the first entry of the path as the endpoint and the
	// remainder as bucket name and prefix
	endpoint, rest, _ := strings.Cut(s, "/")
	bucket, prefix, _ := strings.Cut(rest, "/")
	return createConfig(endpoint, bucket, prefix, false)
}
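
// createConfig validates the endpoint, cleans the prefix with path.Clean, and
// fills the remaining fields with the defaults from NewConfig.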
func createConfig(endpoint, bucket, prefix string, useHTTP bool) (*Config, error) {
	if endpoint == "" {
		return nil, errors.New("s3: invalid format, host/region or bucket name not found")
}

	if prefix != "" {
		prefix = path.Clean(prefix)
}

	cfg := NewConfig()
	cfg.Endpoint = endpoint
	cfg.UseHTTP = useHTTP
	cfg.Bucket = bucket
	cfg.Prefix = prefix

	return &cfg, nil
}

var _ backend.ApplyEnvironmenter = &Config{}

// ApplyEnvironment saves values from the environment to the config.
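// The prefix parameter namespaces the variables consulted: a hypothetical
// call such as cfg.ApplyEnvironment("MYAPP_") would read
// MYAPP_AWS_ACCESS_KEY_ID, MYAPP_AWS_SECRET_ACCESS_KEY and
// MYAPP_AWS_DEFAULT_REGION, while an empty prefix reads the standard AWS
// variables. Values already set on the config are never overwritten.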
func (cfg *Config) ApplyEnvironment(prefix string) {
	if cfg.KeyID == "" {
		cfg.KeyID = os.Getenv(prefix + "AWS_ACCESS_KEY_ID")
	}
	if cfg.Secret.String() == "" {
		cfg.Secret = options.NewSecretString(os.Getenv(prefix + "AWS_SECRET_ACCESS_KEY"))
	}
	if cfg.Region == "" {
		cfg.Region = os.Getenv(prefix + "AWS_DEFAULT_REGION")
}
}