We try to distinguish truly block-cloned (i.e. "reflinked") files from
files that we failed to clone and instead had to use `io.Copy()` on,
because this might be useful to the operator/administrator to gauge the
space savings or gain awareness of filesystem configuration problems.
Note that "cloning" means either true block cloning via `ioctl(FICLONE)`
or any kind of local copy in general via fallback to `io.Copy()`.
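As a rough illustration of the distinction above (not restic's actual code;
`cloneOrCopy` and its callers are hypothetical), the clone-or-copy step on
Linux can be sketched with `golang.org/x/sys/unix` as follows:

```go
package reflinksketch

import (
	"io"
	"os"

	"golang.org/x/sys/unix"
)

// cloneOrCopy tries a true block clone first and falls back to io.Copy().
// The returned bool tells the caller whether the file was actually
// block-cloned, so both outcomes can be counted separately.
func cloneOrCopy(srcPath, dstPath string) (cloned bool, err error) {
	src, err := os.Open(srcPath)
	if err != nil {
		return false, err
	}
	defer src.Close()

	dst, err := os.Create(dstPath)
	if err != nil {
		return false, err
	}
	defer dst.Close()

	// True block cloning ("reflink") via ioctl(FICLONE); this only succeeds
	// when both files live on the same filesystem and it supports cloning.
	if err := unix.IoctlFileClone(int(dst.Fd()), int(src.Fd())); err == nil {
		return true, nil
	}

	// Fallback: an ordinary byte-for-byte local copy.
	_, err = io.Copy(dst, src)
	return false, err
}
```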
TBD:
- fallback to restoring a file normally if it could not be cloned
- track all potential cloning sources for each file (every copy that was
restored rather than cloned) and try to clone from each copy in turn
(Why: imagine a set of duplicate files being restored N:M to a set of
distinct filesystems or subvolumes or datasets, such that some pairs
can be used as operands to a block cloning operation and some cannot)
- progress reporting (report how much space we have saved, or not)
Non-goals:
- cloning individual blobs via `ioctl(FICLONERANGE)`
(Why: this is not going to work very well, if at all, due to blobs not
being aligned to any kind of a fundamental block size, _and_ this
will impact cloning entire files unless whole-file cloning is
special-cased, which is exactly what this change does.)
In its minimal form, ReflinkIndex maps each unique file content (represented
as the sequence of blob IDs that make up that content) to the first file name
encountered with that content; those first files are treated as the
"originals" for subsequent cloning.
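A minimal sketch of that association (illustrative only; these are not
restic's actual types or names):

```go
package reflinksketch

import "strings"

// contentKey identifies a file's content by the sequence of its blob IDs.
type contentKey string

// reflinkIndex remembers, for each unique content, the first path restored
// with that content, i.e. the "original" to clone from later.
type reflinkIndex struct {
	originals map[contentKey]string
}

func newReflinkIndex() *reflinkIndex {
	return &reflinkIndex{originals: make(map[contentKey]string)}
}

// original returns the previously restored path with the same content, or
// records path as the original if this content has not been seen before.
func (ri *reflinkIndex) original(blobIDs []string, path string) (string, bool) {
	key := contentKey(strings.Join(blobIDs, "|"))
	if orig, ok := ri.originals[key]; ok {
		return orig, true // clone path from orig instead of writing it again
	}
	ri.originals[key] = path
	return "", false // path becomes the original for this content
}
```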
Passing `restic restore --reflinks` will cause restic to keep track of
duplicate files and locally clone subsequent duplicates from the first
extracted one.
Enhancement: add the ability to sort the output of `restic ls -l` by
name, size, atime, ctime, mtime, time (=mtime), X (=extension), or extension.
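A hypothetical sketch of how such sorting could look (types, field names,
and the sort-key strings are illustrative, not the actual `ls`
implementation):

```go
package lssortsketch

import (
	"path/filepath"
	"sort"
	"time"
)

// entry stands in for a listed node.
type entry struct {
	Name  string
	Size  uint64
	ATime time.Time
	CTime time.Time
	MTime time.Time
}

// sortEntries orders entries by the requested criterion; "time" is an alias
// for "mtime" and "X" is an alias for "extension".
func sortEntries(entries []entry, by string) {
	sort.SliceStable(entries, func(i, j int) bool {
		switch by {
		case "size":
			return entries[i].Size < entries[j].Size
		case "atime":
			return entries[i].ATime.Before(entries[j].ATime)
		case "ctime":
			return entries[i].CTime.Before(entries[j].CTime)
		case "mtime", "time":
			return entries[i].MTime.Before(entries[j].MTime)
		case "extension", "X":
			return filepath.Ext(entries[i].Name) < filepath.Ext(entries[j].Name)
		default: // "name"
			return entries[i].Name < entries[j].Name
		}
	})
}
```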
---------
Co-authored-by: Michael Eischer <michael.eischer@fau.de>
* Sometimes the report function may absorb the error and return nil; in those cases `bar.Add(1)` would still execute even if the file deletion had failed.
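An illustrative sketch of the corrected pattern (all names here are
placeholders, not the actual restic code): advance the progress bar based on
the deletion result itself rather than on the reporting callback's return
value.

```go
package progresssketch

type progressBar interface{ Add(int) }

// deleteFiles removes each file, lets the report callback decide whether an
// error is fatal, and advances the bar only for deletions that succeeded.
func deleteFiles(files []string, remove func(string) error,
	report func(string, error) error, bar progressBar) error {

	for _, f := range files {
		delErr := remove(f)
		if err := report(f, delErr); err != nil {
			return err // the callback did not absorb the error
		}
		if delErr == nil {
			bar.Add(1) // count only files that were actually deleted
		}
	}
	return nil
}
```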
* feat(backends/s3): add warmup support before repacks and restores
This commit introduces basic support for transitioning pack files stored
in cold storage to hot storage on S3 and S3-compatible providers.
To prevent unexpected behavior for existing users, the feature is gated
behind new flags:
- `s3.enable-restore`: opt-in flag (defaults to false)
- `s3.restore-days`: number of days for the restored objects to remain
in hot storage (defaults to `7`)
- `s3.restore-timeout`: maximum time to wait for a single restoration
(defaults to `1 day`)
- `s3.restore-tier`: retrieval tier at which the restore will be
processed (defaults to `Standard`)
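For context, a rough sketch of the kind of restore request these options map
to, assuming the `minio-go` v7 client that the S3 backend is built on (the
actual wiring in restic differs, and `requestRestore` is a made-up helper):

```go
package warmupsketch

import (
	"context"

	"github.com/minio/minio-go/v7"
)

// requestRestore asks S3 to transition one cold object back to hot storage
// for the given number of days at the given retrieval tier.
func requestRestore(ctx context.Context, client *minio.Client,
	bucket, key string, days int, tier minio.TierType) error {

	req := minio.RestoreRequest{}
	req.SetDays(days)
	req.SetGlacierJobParameters(minio.GlacierJobParameters{Tier: tier})

	// An empty version ID targets the current object version.
	return client.RestoreObject(ctx, bucket, key, "", req)
}
```

The transition completes asynchronously on the provider's side, which is why
a bounded wait (`s3.restore-timeout`) is still needed before the restored
packs can actually be downloaded.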
As restoration times can be lengthy, this implementation preemptively
restores the selected packs to avoid repeated restore delays during
downloads. This is slightly sub-optimal, as we could process packs
out-of-order (as soon as they're transitioned), but doing so would add
too much complexity for a marginal gain in speed.
To maintain simplicity and prevent resource exhaustion when many packs
are involved, no new concurrency mechanisms or goroutines were added;
the feature simply hooks into the existing routines.
**Limitations:**
- Tests against the backend were not written due to the lack of cold
storage class support in MinIO. Testing was done manually on
Scaleway's S3-compatible object storage. If necessary, we could
explore testing with LocalStack or mocks, though this requires further
discussion.
- Currently, this feature only warms up before restores and repacks
(prune/copy), as those are the two main use-cases I came across.
Support for other commands may be added in future iterations, as long
as affected packs can be calculated in advance.
- The feature is gated behind a new alpha `s3-restore` feature flag to
make it explicit that the feature is still wet behind the ears.
- There is no explicit user notification for ongoing pack restorations.
While I think it is not necessary because of the opt-in flag, showing
some notice may improve usability (but would probably require major
refactoring in the progress bar which I didn't want to start). Another
possibility would be to add a flag to send restore requests and fail
early.
See https://github.com/restic/restic/issues/3202
* ui: warn user when files are warming up from cold storage
* refactor: remove the PacksWarmer struct
It's easier to handle multiple handles in the backend directly, and it
may open the door to reducing the number of requests made to the backend
in the future.
The old sorting behaviour was to sort snapshots from oldest to newest.
The new sorting order is from newest to oldest. To revert to the old
behaviour, use the `--reverse` option.
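Hypothetical illustration of the new default ordering (the snapshot type and
field are made up for the example):

```go
package snapsortsketch

import (
	"sort"
	"time"
)

type snapshot struct {
	Time time.Time
}

// sortSnapshots orders snapshots newest-first by default; with reverse set
// (the --reverse option) it falls back to the old oldest-first order.
func sortSnapshots(snaps []snapshot, reverse bool) {
	sort.SliceStable(snaps, func(i, j int) bool {
		if reverse {
			return snaps[i].Time.Before(snaps[j].Time) // oldest to newest
		}
		return snaps[j].Time.Before(snaps[i].Time) // newest to oldest
	})
}
```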
---------
Co-authored-by: Michael Eischer <michael.eischer@fau.de>