Commit Graph

7947 Commits

Author SHA1 Message Date
Michael Eischer c0e1f36830 forget: refuse deleting the last snapshot in a snapshot group
`--keep-tag invalid-tag` was previously able to wipe all snapshots in a
repository. As a user specified a `--keep-*` option this is likely
unintentional. This forbid deleting all snapshot if a `--keep-*` option
was specified to prevent data loss. (Not specifying such an option
currently also causes the command to abort)
2024-05-24 20:45:33 +02:00
Michael Eischer d106ad6921 restic: regenerate snapshot keep policy golden test files 2024-05-24 20:45:33 +02:00
Michael Eischer 16ef4d515b
Merge pull request #4784 from MichaelEischer/rework-backend-retries
Rework backend retries
2024-05-24 20:29:54 +02:00
Michael Eischer e4a48085ae backend/retry: feature flag new retry behavior 2024-05-24 20:24:02 +02:00
Michael Eischer 723247c8e5 add changelog for longer retries 2024-05-24 20:24:02 +02:00
Michael Eischer b1266867d2 repository: wait max 1 minutes for lock removal if context is canceled
The toplevel context in restic only canceled if the user interrupts a
restic operation. If the network connection has failed this can require
waiting the full retry duration of 15 minutes which is a bad user
experience for interactive usage. Thus limit the delay to one minute in
this case.
2024-05-24 20:24:02 +02:00
Michael Eischer 98709a4372 retry: reduce total number of retries
Retries in restic try to solve two main problems:
- retry a temporarily failed operation
- tolerate temporary network interruptions

The first problem only requires a few retries, whereas the last one benefits
primarily from spreading the requests over a longer duration.

Increasing the default multiplier and the initial interval works for
both cases. The first few retries only take a few seconds, while later
retries quickly reach the maximum interval of one minute. This ensures
that the total number of retries issued by restic will remain at around
21 retries for a 15 minute period. As the concurrency in restic is
bounded, retries drastically reduce the number of requests sent to a
backend. This helps to prevent overloading the backend.
2024-05-24 20:24:02 +02:00
Michael Eischer 512cd6ef07 retry: ensure that there's always at least one retry
Previously, if an operation failed after 15 minutes, then it would never
be retried. This means that large backend requests are more unreliable
than smaller ones.
2024-05-24 20:24:02 +02:00
Michael Eischer a60ee9b764 retry: limit retries based on elapsed time not count
Depending on how long an operation takes to fail, the total retry
duration can currently vary between 1.5 and 15 minutes. In particular
for temporarily interrupted network connections, the former timeout is
too short. Thus always use a limit of 15 minutes.
2024-05-24 20:24:02 +02:00
Michael Eischer a3633cad9e retry: explicitly log failed requests
This simplifies finding the request in the log output that cause an
operation to fail.
2024-05-24 20:24:02 +02:00
Michael Eischer b9cbf623fa
Merge pull request #4814 from MichaelEischer/fix-tmp-docs
doc: fix tmpdir documentation for windows
2024-05-24 18:17:59 +02:00
Michael Eischer 4021e67d97 doc: fix tmpdir documentation for windows 2024-05-20 20:48:29 +02:00
Michael Eischer 8898f61717
Merge pull request #4809 from MichaelEischer/update-changelog
add retries for corrupted blobs to changelog
2024-05-18 23:04:13 +02:00
Michael Eischer 5f23baabcc add retries for corrupted blobs to changelog 2024-05-18 23:03:24 +02:00
Michael Eischer 9c5bac6f25
Merge pull request #4799 from letmaik/letmaik/azure-force-cli-credential
Azure: add option to force use of CLI credential
2024-05-18 20:22:15 +00:00
Michael Eischer c56ecec9aa azure: deduplicate cli and default credentials case 2024-05-18 22:15:54 +02:00
Maik Riechert 355f520936 Azure: add option to force use of CLI credential 2024-05-18 22:15:54 +02:00
Michael Eischer 1dfe1b8732
Merge pull request #4802 from MichaelEischer/backend-cleanups
Repository: Remove Backend() method
2024-05-18 22:02:45 +02:00
Michael Eischer 223aa22cb0 replace some uses of restic.Repository with finegrained interfaces 2024-05-18 21:42:51 +02:00
Michael Eischer 291c9677de restic/repository: remove Backend() method 2024-05-18 21:42:51 +02:00
Michael Eischer 673496b091 repository: clean cache between CheckPack retries
The cache cleanup pattern is also used in ListPack etc.
2024-05-18 21:42:51 +02:00
Michael Eischer 3d2410ed50 Replace some repo.RemoveUnpacked usages
These will eventually be blocked as they do not delete Snapshots.
2024-05-18 21:42:51 +02:00
Michael Eischer d2c26e33f3 repository: remove further usages of repo.Backend() 2024-05-18 21:42:51 +02:00
Michael Eischer 8a425c2f0a remove usages of repo.Backend() from tests 2024-05-18 21:42:51 +02:00
Michael Eischer aa4647f773 repository: unexport PackBlobIterator 2024-05-18 21:42:51 +02:00
Michael Eischer 94e863885c check: move verification of individual pack file to repository 2024-05-18 21:42:50 +02:00
Michael Eischer e40943a75d restic: remove backend usage from lock test 2024-05-18 21:38:31 +02:00
Michael Eischer 67e2ba0d40 repository: Lock requires *repository.Repository
This allows the Lock function to access the backend, even once the
Backend method is removed from the interface.
2024-05-18 21:38:31 +02:00
Michael Eischer d8b184b3d3 repository: convert test helper to return *repository.Repository 2024-05-18 21:38:31 +02:00
Michael Eischer a1ca5e15c4 migrations: add temporary hack for s3_layout
The migration will be removed after the next restic release anyways.
Thus, there's no need for a clean implementation.
2024-05-18 21:38:31 +02:00
Michael Eischer 34d90aecf9 migrations: move logic of upgrade_repo_v2 to repository package
The migration modifies repository internals and thus should live within
the repository package.
2024-05-18 21:38:31 +02:00
Michael Eischer ab9077bc13 replace usages of backend.Remove() with repository.RemoveUnpacked()
RemoveUnpacked will eventually block removal of all filetypes other than
snapshots. However, getting there requires a major refactor to provide
some components with privileged access.
2024-05-18 21:38:31 +02:00
Michael Eischer 8274f5b101 prune: remove Backend.IsNotExist()
Only handling one specific error is not particularly useful.
2024-05-18 21:38:31 +02:00
Michael Eischer 9795198189 debug: remove Backend.Stat() usage 2024-05-18 21:38:31 +02:00
Michael Eischer 0c1ba6d95d backend: remove unused Location method 2024-05-18 21:38:31 +02:00
Michael Eischer eb6c653f89
Merge pull request #4800 from MichaelEischer/cleanup-load
Retry loading of corrupted data from backend / cache
2024-05-18 21:34:54 +02:00
Michael Eischer 74d90653e0 check: use ReadFull to load pack header in checkPack
This ensures that the pack header is actually read completely.
Previously, for a truncated file it was possible to only read a part of
the header, as backend.Load(...) is not guaranteed to return as many
bytes as requested by the length parameter.
2024-05-18 21:28:54 +02:00
Michael Eischer 8f8d872a68 fix compatibility with go 1.19 2024-05-18 21:28:54 +02:00
Michael Eischer ff0744b3af check: test checkPack retries 2024-05-18 21:28:54 +02:00
Michael Eischer 987c3b250c repository: test retries of ListPack 2024-05-18 21:28:54 +02:00
Michael Eischer bf16096771 repository: test LoadBlob retries 2024-05-18 21:28:54 +02:00
Michael Eischer 4f45668b7c repository: rework and extend LoadRaw tests 2024-05-18 21:28:54 +02:00
Michael Eischer ac805d6838 cache: cleanup debug logs 2024-05-18 21:28:54 +02:00
Michael Eischer 5214af88e2 cache: test forget behavior 2024-05-18 21:28:54 +02:00
Michael Eischer 3ff063e913 check: verify pack a second time if broken 2024-05-18 21:28:54 +02:00
Michael Eischer 385cee09dc repository: fix caching of tree packs in LoadBlobsFromPack 2024-05-18 21:28:54 +02:00
Michael Eischer e734746f75 cache: forget cached file at most once
This is inspired by the circuit breaker pattern used for distributed
systems. If too many requests fails, then it is better to immediately
fail new requests for a limited time to give the backend time to
recover.

By only forgetting a file in the cache at most once, we can ensure that
a broken file is only retrieved once again from the backend. If the file
stored there is broken, previously it would be cached and deleted
continuously. Now, it is retrieved only once again, all later requests
just use the cached copy and either succeed or fail immediately.
2024-05-18 21:28:54 +02:00
Michael Eischer 97a307df1a cache: Always use cached file if it exists
A file is always cached whole. Thus, any out of bounds access will also
fail when directed at the backend. To handle case in which the cached
file is broken, then caller must call Cache.Forget(h) for the file in
question.
2024-05-18 21:28:54 +02:00
Michael Eischer 8cce06d915 repair packs: drop experimental warning
This warning should already have been removed once the feature flag was
dropped.
2024-05-18 21:28:54 +02:00
Michael Eischer 433a6aad29 repository: remove redundant blob loading fallback from RepairPacks
LoadBlobsFromPack already implements the same fallback behavior.
2024-05-18 21:28:54 +02:00