Commit Graph

234 Commits

Author SHA1 Message Date
Michael Eischer c1a8fa4290 repository: remove unused packIDToIndex field 2022-07-02 18:39:59 +02:00
Michael Eischer e68c3a4e62 repository: simplify CreateIndexFromPacks 2022-07-02 18:39:59 +02:00
Michael Eischer 1974ad7ce2 repository: hide MasterIndex.FinalizeFullIndexes / FinalizeNotFinalIndexes 2022-07-02 18:39:59 +02:00
Michael Eischer ef53ca4a5a repository: remove MasterIndex.All() 2022-07-02 18:39:59 +02:00
Michael Eischer bf81bf0795 repository: Properly set id for finalized index
As MergeFinalIndex and index uploads can occur concurrently, it is
necessary for MergeFinalIndex to check whether the IDs for an index were
already set before merging it. Otherwise, we'd loose the ID of an index
which is set _after_ uploading it.
2022-07-02 18:39:59 +02:00
Michael Eischer e0a7852b8b repository: remove unused (Master)Index.Count 2022-07-02 18:39:58 +02:00
Michael Eischer 8ef2968f28 repository: remove unused index.ListPack 2022-07-02 18:39:12 +02:00
Michael Eischer e4f20dea61 repository: inline index.encode 2022-07-02 18:39:12 +02:00
Michael Eischer fe5a8e137a repository: remove unused index.Store 2022-07-02 18:39:12 +02:00
Michael Eischer 628ae799ca repository: make flushPacks private 2022-07-02 18:39:12 +02:00
Michael Eischer ed8aa15376 repository: add Save method to MasterIndex interface 2022-07-02 18:38:56 +02:00
Michael Eischer a77d5c4d11 repository: index saving belongs into the MasterIndex 2022-07-02 18:38:56 +02:00
Jayson Wang f144920ed5 fix handling of maxKeys in SearchKey 2022-06-12 14:19:06 +02:00
greatroar c9557b2822 internal/repository: Fix LoadBlob + fuzz test
When given a buf that is big enough for a compressed blob but not its
decompressed contents, the copy at the end of LoadBlob would skip the
last part of the contents.

Fixes #3783.
2022-06-06 17:02:28 +02:00
MichaelEischer b2a2e5f727
Merge pull request #3753 from greatroar/indexmap-alloc
repository: Re-tune indexmap allocation strategy
2022-05-14 15:44:08 +02:00
greatroar 5141228e0c repository: Re-tune indexmap allocation strategy
fd05037e1a changed the allocation batch
size from 256 to 128 under the assumption that an indexEntry is 60 bytes
on amd64, but it's 64: structs are padded out to a multiple of 8 for
alignment reasons. That means we'd waste no space in malloc even without
the batch allocation, at least on 64-bit machines. While that strategy
cuts the overallocation down dramatically for many small indexes, it also
seems to slow allocation down (Go 1.18, Linux, amd64, -benchtime=2s):

    name                   old time/op    new time/op    delta
    DecodeIndex-8             4.67s ± 5%     4.60s ± 1%      ~     (p=0.953 n=10+5)
    DecodeIndexParallel-8     4.67s ± 3%     4.60s ± 1%      ~     (p=0.953 n=10+5)
    IndexHasUnknown-8        37.8ns ± 8%    36.5ns ±14%      ~     (p=0.841 n=5+5)
    IndexHasKnown-8          38.5ns ±12%    37.7ns ±10%      ~     (p=0.968 n=5+5)
    IndexAlloc-8              615ms ±18%     607ms ± 1%      ~     (p=1.000 n=10+5)
    IndexAllocParallel-8      245ms ±11%     285ms ± 6%   +16.40%  (p=0.001 n=10+5)
    MasterIndexAlloc-8        286ms ± 9%     275ms ± 2%      ~     (p=1.000 n=10+5)
    LoadIndex/v1-8           27.0ms ± 4%    26.8ms ± 1%      ~     (p=0.690 n=5+5)
    LoadIndex/v2-8           22.4ms ± 1%    22.8ms ± 2%    +1.48%  (p=0.016 n=5+5)

    name                   old alloc/op   new alloc/op   delta
    IndexAlloc-8              446MB ± 0%     446MB ± 0%    -0.00%  (p=0.000 n=8+4)
    IndexAllocParallel-8      446MB ± 0%     446MB ± 0%    -0.00%  (p=0.008 n=8+5)
    MasterIndexAlloc-8        213MB ± 0%     159MB ± 0%   -25.47%  (p=0.000 n=10+5)

    name                   old allocs/op  new allocs/op  delta
    IndexAlloc-8               913k ± 0%     2632k ± 0%  +188.19%  (p=0.008 n=5+5)
    IndexAllocParallel-8       913k ± 0%     2632k ± 0%  +188.21%  (p=0.008 n=5+5)
    MasterIndexAlloc-8         318k ± 0%     1172k ± 0%  +267.86%  (p=0.008 n=5+5)

Instead, this patch sets a batch size of 4, which means no space is
wasted by malloc on 64-bit and very little on 32-bit. It still gets very
close to the savings from not allocating in batches, without requiring
special code for bits.UintSize==64. Benchmark results, again for
Linux/amd64:

    name                   old time/op    new time/op    delta
    DecodeIndex-8             4.67s ± 5%     4.83s ± 9%     ~     (p=0.315 n=10+10)
    DecodeIndexParallel-8     4.67s ± 3%     4.68s ± 4%     ~     (p=0.315 n=10+10)
    IndexHasUnknown-8        37.8ns ± 8%    44.5ns ±19%     ~     (p=0.095 n=5+5)
    IndexHasKnown-8          38.5ns ±12%    36.9ns ± 8%     ~     (p=0.690 n=5+5)
    IndexAlloc-8              615ms ±18%     628ms ±18%     ~     (p=0.218 n=10+10)
    IndexAllocParallel-8      245ms ±11%     262ms ± 9%   +7.02%  (p=0.043 n=10+10)
    MasterIndexAlloc-8        286ms ± 9%     287ms ±13%     ~     (p=1.000 n=10+10)
    LoadIndex/v1-8           27.0ms ± 4%    26.8ms ± 0%     ~     (p=1.000 n=5+5)
    LoadIndex/v2-8           22.4ms ± 1%    22.5ms ± 0%     ~     (p=0.056 n=5+5)

    name                   old alloc/op   new alloc/op   delta
    IndexAlloc-8              446MB ± 0%     446MB ± 0%     ~     (p=1.000 n=8+10)
    IndexAllocParallel-8      446MB ± 0%     446MB ± 0%   -0.00%  (p=0.000 n=8+8)
    MasterIndexAlloc-8        213MB ± 0%     160MB ± 0%  -25.02%  (p=0.000 n=10+9)

    name                   old allocs/op  new allocs/op  delta
    IndexAlloc-8               913k ± 0%     1333k ± 0%  +45.94%  (p=0.000 n=8+10)
    IndexAllocParallel-8       913k ± 0%     1333k ± 0%  +45.94%  (p=0.000 n=8+8)
    MasterIndexAlloc-8         318k ± 0%      525k ± 0%  +64.99%  (p=0.000 n=10+10)

The allocation method indexmap.newEntry has also been rewritten in a
form that is a few instructions shorter.
2022-05-11 21:22:14 +02:00
MichaelEischer df554e5f69
Merge pull request #3748 from greatroar/runworkers
repository: Remove RunWorkers, report ctx.Err()
2022-05-11 19:38:46 +02:00
greatroar 2e0f1f5113 repository: Remove RunWorkers, report ctx.Err()
This removes RunWorkers, which had become mere overhead by successive
refactors. It also ensures that each former user of that function
returns any context error that occurs, so failure to complete an
operation is always reported as an error.
2022-05-10 22:26:00 +02:00
Michael Eischer ae7e51382a Fix error on temp file deletion on windows
Apparently it can take a moment between closing a tempfile marked as
DELETE_ON_CLOSE and it actually being deleted. During that time the file
is inaccessible. Thus just skip deleting the temp file on windows.
2022-05-09 22:43:26 +02:00
Michael Eischer cf5cb673fb repository: Use existing method to collect pack ids 2022-04-30 19:14:21 +02:00
Michael Eischer b335cb6285 repository: Refactor index IDs collection 2022-04-30 19:14:21 +02:00
Michael Eischer 4b01b06f2f repository: Test compressed blobs in StreamPack 2022-04-30 11:34:10 +02:00
Michael Eischer ec2b25565a repository: test uncompressedLength field and index example 2022-04-30 11:34:10 +02:00
Michael Eischer 9ffb8920f1 repository: run blackbox tests using old and new repo version 2022-04-30 11:34:10 +02:00
Michael Eischer abe5935693 repository: unify repository version-specific initialization
Mark the master index as compressed also when initializing a new
repository. This is only relevant for testing.
2022-04-30 11:34:10 +02:00
Alexander Neumann 8776031f96 Leave allocating slices to the decompress code 2022-04-30 11:34:10 +02:00
Alexander Neumann 5eb05a0afe Configure zstd encoder/decoder 2022-04-30 11:34:10 +02:00
Michael Eischer 2f36e044db Cleanup pack header check 2022-04-30 11:34:10 +02:00
Alexander Neumann 8b11b86383 Add option global --compression 2022-04-30 11:34:10 +02:00
Michael Eischer 7132df529e repository: Increase index size for repo version 2
A compressed index is only about one third the size of an uncompressed
one. Thus increase the number of entries in an index to avoid cluttering
the repository with small indexes.
2022-04-30 11:34:10 +02:00
Michael Eischer 66f9048bce repository: Alloc zstd encoder/decoder on demand 2022-04-30 11:34:10 +02:00
Michael Eischer fd05037e1a repository: recalibrate index batch allocation size 2022-04-30 11:34:10 +02:00
Michael Eischer 6fb408d90e repository: implement pack compression 2022-04-30 11:34:10 +02:00
Michael Eischer 362ab06023 init: Add flag to specify created repository version 2022-04-30 10:07:42 +02:00
Michael Eischer 4b957e7373 repository: Implement index/snapshot/lock compression
The config file is not compressed as it should remain readable by older
restic versions such that these can return a proper error.

As the old format for unpacked data does not include a version header,
make use of a trick: The old data is always encoded as JSON. Thus it can
only start with '{' or '['. For any other value the first byte indicates
a versioned format. The version is set to 2 for now. Then the zstd
compressed data follows.
2022-04-30 10:07:42 +02:00
Michael Eischer e597b99b55 repository: Reduce repack workers to prevent deadlock
As repack streams packs these occupy one backend connection. Uploading a
new pack also requires a backend connection. To prevent a deadlock
during repack when reaching the backend connections limit, simply limit
the repackWorker count to always leave one connection for uploading.
2022-04-23 11:28:18 +02:00
Alexander Neumann a059ef90f8
Merge pull request #3702 from MichaelEischer/extend-config-error
Print used key name if config fails to load
2022-04-10 20:25:24 +02:00
Michael Eischer c2aabb2686 Print used key name if config fails to load 2022-04-09 22:38:18 +02:00
Alexander Neumann 04e054465a
Merge pull request #3475 from MichaelEischer/local-sftp-conn-limit
Limit concurrent operations for local / sftp backend
2022-04-09 21:33:00 +02:00
Michael Eischer 7b9ae91e04 copy: Load snapshots before indexes 2022-04-09 12:27:25 +02:00
Michael Eischer cd783358d3 local: Limit concurrent backend operations
Use a limit of 2 similar to the filereader concurrency in the archiver.
2022-04-09 12:21:38 +02:00
Michael Eischer 6408686973 repository: Simplify Blob equality check 2022-03-28 22:09:49 +02:00
Michael Eischer 243698680a crypto: Use helpers for size calculations 2022-03-28 22:09:49 +02:00
Michael Eischer f78bd14e28 repository: Remove pack implementation details from MasterIndex 2022-03-28 22:09:49 +02:00
Michael Eischer dc3d77dacc repository: make saveAndEncrypt private 2022-03-28 22:09:49 +02:00
Michael Eischer 6877e7edbb repository: Rename LoadAndDecrypt to LoadUnpacked
The method is the complement for SaveUnpacked and not for
SaveAndEncrypt. The latter assembles blobs into pack files.
2022-03-28 22:09:49 +02:00
Michael Eischer 537b4c310a copy: Implement by reusing repack
The repack operation copies all selected blobs from a set of pack files
into new pack files. For prune the source and destination repositories
are identical. To implement copy, just use a different source and
destination repository.
2022-03-26 20:47:15 +01:00
Alexander Neumann e682f7c0d6 Add tests for StreamPack 2022-03-21 21:15:03 +01:00
Michael Eischer bba8ba7a5b repository: cancel streampack context after error 2022-02-12 20:18:25 +01:00
Michael Eischer 47554a3428 repository: Fix error handling in repack
When storing a blob fails, this is a fatal error which must not be
retried.
2022-02-12 20:18:25 +01:00