restic

mirror of https://github.com/restic/restic.git synced 2025-03-30 00:00:14 +01:00

Author	SHA1	Message	Date
Gilbert Gilb's	536ebefff4	feat(backends/s3): add warmup support before repacks and restores (#5173) * feat(backends/s3): add warmup support before repacks and restores This commit introduces basic support for transitioning pack files stored in cold storage to hot storage on S3 and S3-compatible providers. To prevent unexpected behavior for existing users, the feature is gated behind new flags: - `s3.enable-restore`: opt-in flag (defaults to false) - `s3.restore-days`: number of days for the restored objects to remain in hot storage (defaults to `7`) - `s3.restore-timeout`: maximum time to wait for a single restoration (defaults to `1 day`) - `s3.restore-tier`: retrieval tier at which the restore will be processed (defaults to `Standard`) As restoration times can be lengthy, this implementation preemptively restores selected packs to prevent incessant restore delays during downloads. This is slightly sub-optimal as we could process packs out-of-order (as soon as they're transitioned), but this would really add too much complexity for a marginal gain in speed. To maintain simplicity and prevent resource exhaustion with lots of packs, no new concurrency mechanisms or goroutines were added. This just hooks gracefully into the existing routines. Limitations: - Tests against the backend were not written due to the lack of cold storage class support in MinIO. Testing was done manually on Scaleway's S3-compatible object storage. If necessary, we could explore testing with LocalStack or mocks, though this requires further discussion. - Currently, this feature only warms up before restores and repacks (prune/copy), as those are the two main use-cases I came across. Support for other commands may be added in future iterations, as long as affected packs can be calculated in advance. - The feature is gated behind a new alpha `s3-restore` feature flag to make it explicit that the feature is still wet behind the ears. 
- There is no explicit user notification for ongoing pack restorations. While I think it is not necessary because of the opt-in flag, showing some notice may improve usability (but would probably require major refactoring in the progress bar which I didn't want to start). Another possibility would be to add a flag to send restore requests and fail early. See https://github.com/restic/restic/issues/3202 * ui: warn user when files are warming up from cold storage * refactor: remove the PacksWarmer struct It's easier to handle multiple handles in the backend directly, and it may open the door to reducing the number of requests made to the backend in the future.	2025-02-01 18:26:27 +00:00
Michael Eischer	97f696b937	backend: remove dead code	2024-08-31 17:25:24 +02:00
Michael Eischer	af989aab4e	backend/layout: unexport fields and simplify rest layout	2024-08-31 17:25:24 +02:00
greatroar	4a874000b7	gs: Replace some errors.Wrap calls The first one in Create is already a WithStack error. The rest were referencing code that hasn't existed for quite some time. Note that errors from Google SDKs tend to start with "google:" or "googleapi:". Also, use restic/internal/errors.	2024-06-01 15:11:06 +02:00
Michael Eischer	0c1ba6d95d	backend: remove unused Location method	2024-05-18 21:38:31 +02:00
Michael Eischer	d40f23e716	azure/b2/gs/s3/swift: adapt cloud backend	2024-05-18 19:54:51 +02:00
Michael Eischer	1b8a67fe76	move Backend interface to backend package	2023-10-25 23:00:18 +02:00
Michael Eischer	7881309d63	backend: move backend implementation helpers to util package This removes code that is only used within a backend implementation from the backend package. The latter now only contains code that also has external users.	2023-10-25 22:54:07 +02:00
Michael Eischer	50e0d5e6b5	backend: Hardcode backend scheme in Factory Our ParseConfig implementations always expect a specific scheme, thus no other scheme would work.	2023-06-17 15:15:58 +02:00
Michael Eischer	7d12c29286	backend: Unify backend construction using factory and registry This unified construction removes most backend-specific code from global.go. The backend registry will also enable integration tests to use custom backends if necessary.	2023-06-17 15:15:57 +02:00
Michael Eischer	56836364a4	backend: pass context into every backend constructor	2023-06-17 15:15:57 +02:00
sohalt	ed5b2c2c9b	gs: support other regions than us	2023-05-26 19:54:42 +02:00
Michael Eischer	c934c99d41	gs: replace usage of context.Background()	2023-04-14 22:32:15 +02:00
Michael Eischer	616926d2c1	gs: use IsNotExist to check error	2023-04-14 22:32:15 +02:00
Michael Eischer	05abc6d6f5	backend: deduplicate implementation of Delete() method	2023-04-14 22:32:15 +02:00
Michael Eischer	8e1e3844aa	backend: factor out connection limiting and parameter validation The SemaphoreBackend now uniformly enforces the limit of concurrent backend operations. In addition, it unifies the parameter validation. The List() methods no longer use a semaphore. Restic already never runs multiple list operations in parallel. By managing the semaphore in a wrapper backend, the sections that hold a semaphore token grow slightly. However, the main bottleneck is IO, so this shouldn't make much of a difference. The key insight that enables the SemaphoreBackend is that all of the complex semaphore handling in `openReader()` still happens within the original call to `Load()`. Thus, getting and releasing the semaphore tokens can be refactored to happen directly in `Load()`. This eliminates the need for wrapping the reader in `openReader()` to release the token.	2023-04-14 22:32:15 +02:00
Michael Eischer	4703473ec5	backend: extract most debug logs into logger backend	2023-04-14 22:32:15 +02:00
Michael Eischer	8bfc2519d7	backend: Deduplicate sanity checks for parameters of Load() method The check is now handled by backend.DefaultLoad. This also guarantees consistent behavior across all backends.	2023-04-14 22:32:15 +02:00
Michael Eischer	2f934f5803	gs: check against the correct error in IsNotExist	2022-12-03 18:49:54 +01:00
Michael Eischer	04d101fa94	gs/s3: remove useless os.IsNotExist check	2022-12-03 18:49:54 +01:00
Michael Eischer	40ac678252	backend: remove Test method The Test method was only used in exactly one place, namely when trying to create a new repository it was used to check whether a config file already exists. Use a combination of Stat() and IsNotExist() instead.	2022-12-03 11:28:10 +01:00
Michael Eischer	4ccd5e806b	backend: split layout code into own subpackage	2022-10-21 21:36:05 +02:00
greatroar	07e5c38361	errors: Drop Cause in favor of Go 1.13 error handling The only use cases in the code were in errors.IsFatal, backend/b2, which needs a workaround, and backend.ParseLayout. The last of these requires all backends to implement error unwrapping in IsNotExist. All backends except gs already did that.	2022-10-08 13:08:08 +02:00
Michael Eischer	f414db987d	gofmt all files Apparently the rules for comment formatting have changed with go 1.19.	2022-08-19 19:12:26 +02:00
greatroar	910d917b71	backend: Move semaphores to a dedicated package ... called backend/sema. I resisted the temptation to call the main type sema.Phore. Also, semaphores are now passed by value to skip a level of indirection when using them.	2022-06-18 10:01:58 +02:00
Michael Eischer	e36a40db10	upgrade_repo_v2: Use atomic replace for supported backends	2022-05-09 22:31:30 +02:00
Michael Eischer	4f97492d28	Backend: Expose connections parameter	2022-04-23 11:13:08 +02:00
Michael Eischer	0b258cc054	backends: clean reader closing	2022-04-09 12:21:38 +02:00
Michael Eischer	a009b39e4c	gs/swift: calculate md5 content hash for upload	2021-08-04 22:17:46 +02:00
Michael Eischer	9aa2eff384	Add plumbing to calculate backend specific file hash for upload This enables the backends to request the calculation of a backend-specific hash. For the currently supported backends this will always be MD5. The hash calculation happens as early as possible, for pack files this is during assembly of the pack file. That way the hash would even capture corruptions of the temporary pack file on disk.	2021-08-04 22:17:46 +02:00
Michael Eischer	8a486eafed	gs: Don't drop error when finishing upload The error returned when finishing the upload of an object was dropped. This could cause silent upload failures and thus data loss in certain cases. When an MD5 hash for the uploaded blob is specified, a wrong hash or damaged upload would return its error via Close(), whose error was dropped.	2021-01-30 13:31:32 +01:00
Michael Eischer	c73316a111	backends: add sanity check for the uploaded file size Bugs in the error handling while uploading a file to the backend could cause incomplete files, e.g. https://github.com/golang/go/issues/42400 which could affect the local backend. Proactively add sanity checks which will treat an upload as failed if the reported upload size does not match the actual file size.	2021-01-29 13:51:51 +01:00
eleith	a24e986b2b	do not require gs bucket permissions to init repository A gs service account may only have object permissions on an existing bucket but no bucket create/get permissions. Such service accounts are currently blocked from initializing a restic repository because restic cannot determine whether the bucket exists. This PR updates the logic to assume the bucket exists when the bucket attribute request results in a permission denied error. This way, restic can still initialize a repository if the service account does have object permissions. Fixes: https://github.com/restic/restic/issues/3100	2020-11-18 06:14:11 -08:00
Ingo Gottwald	8b8e230771	Swap deprecated GCS lib with replacement	2020-10-03 18:55:56 +02:00
Ingo Gottwald	00cedd22aa	Replace deprecated method in gs backend	2020-10-01 10:02:42 +02:00
MichaelEischer	fd02407863	Merge pull request #2849 from classmarkets/gcs-access-token gs: support authentication with access token	2020-09-30 17:42:56 +02:00
aawsome	0fed6a8dfc	Use "pack file" instead of "data file" (#2885) - changed variable names, in particular DataFile into PackFile - changed some comments - always use "pack file" in the documentation	2020-08-16 11:16:38 +02:00
Peter Schultz	758b44b9c0	gs: support authentication with access token In the Google Cloud Storage backend, support specifying access tokens directly, as an alternative to a credentials file. This is useful when restic is used non-interactively by some other program that is already authenticated and eliminates the need to store long lived credentials. The access token is specified in the GOOGLE_ACCESS_TOKEN environment variable and takes precedence over GOOGLE_APPLICATION_CREDENTIALS.	2020-07-22 16:23:03 +02:00
Alexander Neumann	c9745cd47e	gs: Respect bandwidth limiting In `0dfdc11ed9`, we accidentally stopped using the provided http.RoundTripper; this commit adds it back. Closes #1989	2018-11-25 18:52:32 +01:00
Lawrence Jones	0dfdc11ed9	Automatically load Google auth This change removes the hardcoded Google auth mechanism for the GCS backend, instead using Google's provided client library to discover and generate credential material. Google recommend that client libraries use their common auth mechanism in order to authorise requests against Google services. Doing so means you automatically support various types of authentication, from the standard GOOGLE_APPLICATION_CREDENTIALS environment variable to making use of Google's metadata API if running within Google Container Engine.	2018-03-11 17:11:25 +00:00
Alexander Neumann	99f7fd74e3	backend: Improve Save() As mentioned in issue [#1560](https://github.com/restic/restic/pull/1560#issuecomment-364689346) this changes the signature for `backend.Save()`. It now takes a parameter of interface type `RewindReader`, so that the backend implementations or our `RetryBackend` middleware can reset the reader to the beginning and then retry an upload operation. The `RewindReader` interface also provides a `Length()` method, which is used in the backend to get the size of the data to be saved. This removes several ugly hacks we had to do to pull the size back out of the `io.Reader` passed to `Save()` before. In the `s3` and `rest` backend this is actively used.	2018-03-03 15:49:44 +01:00
Alexander Neumann	29da86b473	Merge pull request #1623 from restic/backend-relax-restrictions backend: Relax requirement for new files	2018-02-18 12:56:52 +01:00
Alexander Neumann	b5062959c8	backend: Relax requirement for new files Before, all backend implementations were required to return an error if the file that is to be written already exists in the backend. For most backends, that means making a request (e.g. via HTTP) and returning an error when the file already exists. This check is not reliable: the file could have been created between the HTTP request testing for it and the start of the write. In addition, apart from the `config` file in the repo, all other files have pseudo-random names with a very low probability of a collision. And even if a file name is written again, the way the restic repo is structured this just means that the same content is placed there again, which is not a problem, just not very efficient. So, this commit relaxes the requirement to return an error when the file in the backend already exists, which allows reducing the number of API requests and thereby the latency for remote backends.	2018-02-17 22:39:18 +01:00
Igor Fedorenko	d58ae43317	Reworked Backend.Load API to retry errors during ongoing download Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>	2018-02-16 21:12:14 -05:00
Alexander Neumann	5dc8d3588d	GS: Use generic http transport During the development of #1524 I discovered that the Google Cloud Storage backend did not yet use the HTTP transport, so things such as bandwidth limiting did not work. This commit does the necessary magic to make the GS library use our HTTP transport.	2018-01-27 20:12:34 +01:00
Alexander Neumann	e9ea268847	Change List() implementation for all backends	2018-01-21 21:15:09 +01:00
Alexander Neumann	7d8765a937	backend: Only return top-level files for most dirs Fixes #1478	2017-12-14 19:14:16 +01:00
George Armhold	d069ee31b2	GS backend: limit http concurrency in Save(), Stat(), Test(), Remove(), List() as per discussion in PR #1399	2017-10-31 08:01:43 -04:00
Michael Pratt	9fa4f5eb6b	gs: disable resumable uploads By default, the GCS Go packages have an internal "chunk size" of 8MB, used for blob uploads. Media().Do() will buffer a full 8MB from the io.Reader (or less if EOF is reached) then write that full 8MB to the network all at once. This behavior does not play nicely with --limit-upload, which only limits the Reader passed to Media. While the long-term average upload rate will be correctly limited, the actual network bandwidth will be very spiky. E.g., if an 8MB/s connection is limited to 1MB/s, Media().Do() will spend 8s reading from the rate-limited reader (performing no network requests), then 1s writing to the network at 8MB/s. This is bad for network connections hurt by full-speed uploads, particularly when writing 8MB will take several seconds. Disable resumable uploads entirely by setting the chunk size to zero. This causes the io.Reader to be passed further down the request stack, where there is less (but still some) buffering. My connection is around 1.5MB/s up, with nominal ~15ms ping times to 8.8.8.8. Without this change, --limit-upload 1024 results in several seconds of ~200ms ping times (uploading), followed by several seconds of ~15ms ping times (reading from rate-limited reader). A bandwidth monitor reports this as several seconds of ~1.5MB/s followed by several seconds of 0.0MB/s. With this change, --limit-upload 1024 results in ~20ms ping times and the bandwidth monitor reports a constant ~1MB/s. I've elected to make this change regardless of --limit-upload because the resumable uploads shouldn't be providing much benefit anyway, as restic already uploads mostly small blobs and already has a retry mechanism. --limit-download is not affected by this problem, as Get().Download() returns the real http.Response.Body without any internal buffering. Updates #1216	2017-10-17 21:12:04 -07:00
Michael Pratt	fa0be82da8	gs: allow backend creation without storage.buckets.get If the service account used with restic does not have the storage.buckets.get permission (in the "Storage Admin" role), Create cannot use Get to determine if the bucket is accessible. Rather than always trying to create the bucket on Get error, gracefully fall back to assuming the bucket is accessible. If it is, restic init will complete successfully. If it is not, it will fail on a later call. Here is what init looks like now in different cases. Service account without "Storage Admin": Bucket exists and is accessible (this is the case that didn't work before): $ ./restic init -r gs:this-bucket-does-exist:/ enter password for new backend: enter password again: created restic backend c02e2edb67 at gs:this-bucket-does-exist:/ Please note that knowledge of your password is required to access the repository. Losing your password means that your data is irrecoverably lost. Bucket exists but is not accessible: $ ./restic init -r gs:this-bucket-does-exist:/ enter password for new backend: enter password again: create key in backend at gs:this-bucket-does-exist:/ failed: service.Objects.Insert: googleapi: Error 403: my-service-account@myproject.iam.gserviceaccount.com does not have storage.objects.create access to object this-bucket-exists/keys/0fa714e695c8ecd58cb467cdeb04d36f3b710f883496a90f23cae0315daf0b93., forbidden Bucket does not exist: $ ./restic init -r gs:this-bucket-does-not-exist:/ create backend at gs:this-bucket-does-not-exist:/ failed: service.Buckets.Insert: googleapi: Error 403: my-service-account@myproject.iam.gserviceaccount.com does not have storage.buckets.create access to bucket this-bucket-does-not-exist., forbidden Service account with "Storage Admin": Bucket exists and is accessible: Same Bucket exists but is not accessible: Same. Previously this would fail when Create tried to create the bucket. Now it fails when trying to create the keys. 
Bucket does not exist: $ ./restic init -r gs:this-bucket-does-not-exist:/ enter password for new backend: enter password again: created restic backend c3c48b481d at gs:this-bucket-does-not-exist:/ Please note that knowledge of your password is required to access the repository. Losing your password means that your data is irrecoverably lost.	2017-09-25 22:25:51 -07:00

57 commits