restic

Commit Graph

Author	SHA1	Message	Date
Michael Eischer	c46edcd9d6	error strings should not end with punctuation	2020-09-05 10:07:17 +02:00
Michael Eischer	d0329cf3eb	Adjust comments to match name of exported methods	2020-09-05 10:07:16 +02:00
Michael Eischer	d9a80e07b9	repository: Simplify index age calculation	2020-09-05 10:07:16 +02:00
Alexander Weiss	b112533812	Don't save exact duplicates when merging indexes	2020-08-05 06:32:02 +02:00
Alexander Weiss	9d1fb94c6c	make Lookup() return all blobs + simplify syntax	2020-07-25 21:18:34 +02:00
Alexander Weiss	e388d962a5	Merge final indexes together for faster index access	2020-07-22 21:54:02 +02:00
greatroar	7bda28f31f	Chaining hash table for repository.Index These are faster to construct but slower to access. The allocation rate is halved, the peak memory usage almost halved compared to standard map. Benchmark results on linux/amd64, -benchtime=3s -count=20: name old time/op new time/op delta PackerManager-8 178ms ± 0% 178ms ± 0% ~ (p=0.231 n=20+20) DecodeIndex-8 4.54s ± 0% 4.30s ± 0% -5.20% (p=0.000 n=18+17) DecodeIndexParallel-8 4.54s ± 0% 4.30s ± 0% -5.22% (p=0.000 n=19+18) IndexHasUnknown-8 44.4ns ± 5% 50.5ns ±11% +13.82% (p=0.000 n=19+17) IndexHasKnown-8 48.3ns ± 0% 51.5ns ±12% +6.68% (p=0.001 n=16+20) IndexAlloc-8 758ms ± 1% 616ms ± 1% -18.69% (p=0.000 n=19+19) IndexAllocParallel-8 234ms ± 3% 204ms ± 2% -12.60% (p=0.000 n=20+18) MasterIndexLookupSingleIndex-8 122ns ± 0% 145ns ± 9% +18.44% (p=0.000 n=14+20) MasterIndexLookupMultipleIndex-8 369ns ± 2% 429ns ± 8% +16.27% (p=0.000 n=20+20) MasterIndexLookupSingleIndexUnknown-8 68.4ns ± 5% 74.9ns ±13% +9.47% (p=0.000 n=20+20) MasterIndexLookupMultipleIndexUnknown-8 315ns ± 3% 369ns ±11% +17.14% (p=0.000 n=20+20) MasterIndexLookupParallel/known,indices=5-8 743ns ± 1% 816ns ± 2% +9.87% (p=0.000 n=17+17) MasterIndexLookupParallel/unknown,indices=5-8 238ns ± 1% 260ns ± 2% +9.14% (p=0.000 n=19+20) MasterIndexLookupParallel/known,indices=10-8 1.01µs ± 3% 1.11µs ± 2% +9.79% (p=0.000 n=19+20) MasterIndexLookupParallel/unknown,indices=10-8 222ns ± 0% 269ns ± 2% +20.83% (p=0.000 n=16+20) MasterIndexLookupParallel/known,indices=20-8 1.06µs ± 2% 1.19µs ± 2% +12.95% (p=0.000 n=19+18) MasterIndexLookupParallel/unknown,indices=20-8 413ns ± 1% 530ns ± 1% +28.19% (p=0.000 n=18+20) SaveAndEncrypt-8 30.2ms ± 1% 30.4ms ± 0% +0.71% (p=0.000 n=19+19) LoadTree-8 540µs ± 1% 576µs ± 1% +6.73% (p=0.000 n=20+20) LoadBlob-8 5.64ms ± 0% 5.64ms ± 0% ~ (p=0.883 n=18+17) LoadAndDecrypt-8 5.93ms ± 0% 5.95ms ± 1% ~ (p=0.247 n=20+19) LoadIndex-8 25.1ms ± 0% 24.5ms ± 1% -2.54% (p=0.000 n=18+17) name old speed new speed delta PackerManager-8 296MB/s ± 0% 296MB/s ± 0% ~ (p=0.229 n=20+20) SaveAndEncrypt-8 139MB/s ± 1% 138MB/s ± 0% -0.71% (p=0.000 n=19+19) LoadBlob-8 177MB/s ± 0% 177MB/s ± 0% ~ (p=0.890 n=18+17) LoadAndDecrypt-8 169MB/s ± 0% 168MB/s ± 1% ~ (p=0.227 n=20+19) name old alloc/op new alloc/op delta PackerManager-8 91.8kB ± 0% 91.8kB ± 0% ~ (p=0.772 n=12+19) IndexAlloc-8 786MB ± 0% 400MB ± 0% -49.04% (p=0.000 n=20+18) IndexAllocParallel-8 786MB ± 0% 401MB ± 0% -49.04% (p=0.000 n=19+15) SaveAndEncrypt-8 21.0MB ± 0% 21.0MB ± 0% +0.00% (p=0.000 n=19+19) name old allocs/op new allocs/op delta PackerManager-8 1.41k ± 0% 1.41k ± 0% ~ (all equal) IndexAlloc-8 977k ± 0% 907k ± 0% -7.18% (p=0.000 n=20+20) IndexAllocParallel-8 977k ± 0% 907k ± 0% -7.17% (p=0.000 n=19+15) SaveAndEncrypt-8 73.0 ± 0% 73.0 ± 0% ~ (all equal)	2020-07-19 13:58:22 +02:00
Alexander Weiss	1361341c58	don't save duplicate packIDs when using internal/repository/Index.Store	2020-06-14 07:56:24 +02:00
Alexander Weiss	ce4a2f4ca6	save packIDs and duplicates separately A side remark to the definition of Index.blob: Another possibility would have been to use: blob map[restic.BlobHandle]indexEntry This would have led to the following sizes: key: 32 + 1 = 33 bytes value: 8 bytes indexEntry: 8 + 4 + 4 = 16 bytes each packID: 32 bytes To save N index entries, we would therefore have needed: N OF * (33 + 8) bytes + N * 16 + N * 32 bytes / BP = N * 82 bytes More precicely, using a pointer instead of a direct entry is the better memory choice if: OF * 8 bytes + entrysize < OF * entrysize <=> entrysize > 8 bytes * OF/(OF-1) Under the assumption of OF=1.5, this means using pointers would have been the better choice if sizeof(indexEntry) > 24 bytes.	2020-06-14 07:56:21 +02:00
Alexander Weiss	cf979e2b81	make offset and length uint32	2020-06-14 07:50:19 +02:00
Michael Eischer	d92e2c5769	simplify index code	2020-06-14 07:50:19 +02:00
Alexander Weiss	7419844885	add changelog, benchmark, memory calculation	2020-06-14 07:50:15 +02:00
MichaelEischer	735a8074d5	Merge pull request #2773 from aawsome/index-uploads+knownblobs Fix non-intuitive repo behavior	2020-06-12 22:41:04 +02:00
Alexander Weiss	91906911b0	Fix non-intuitive repository behavior - The SaveBlob method now checks for duplicates. - Moves handling of pending blobs to MasterIndex. -> also cleans up pending index entries when they are saved in the index -> when using SaveBlob no need to care about index any longer - Always check for full index and save it when storing packs. -> removes the need of an index uploader -> also removes the verbose "uploaded intermediate index" messages - The Flush method now also saves the index - Fix race condition when checking and saving full/non-finalized indexes	2020-06-11 13:05:23 +02:00
Alexander Weiss	8c1261ff02	changed condition for full index	2020-06-07 22:00:49 +02:00
Alexander Neumann	66efa425bf	Reuse buffer in worker functions	2019-04-13 13:38:39 +02:00
Alexander Neumann	d51e9d1b98	Add []byte to repo.LoadAndDecrypt and utils.LoadAll This commit changes the signatures for repository.LoadAndDecrypt and utils.LoadAll to allow passing in a []byte as the buffer to use. This buffer is enlarged as needed, and returned back to the caller for further use. In later commits, this allows reducing allocations by reusing a buffer for multiple calls, e.g. in a worker function.	2019-04-13 13:38:39 +02:00
Alexander Neumann	663c57ab4d	debug: Remove manual Str() call Log()	2018-01-25 20:49:41 +01:00
Matthew Dawson	3789e55e20	repostiory/index: Remove logging from Lookup function. The logging in these functions double the time they take to execute. However, it is only really useful on failures, which are better reported by the calling functions. benchmark old ns/op new ns/op delta BenchmarkMasterIndexLookupSingleIndex-6 897 395 -55.96% BenchmarkMasterIndexLookupMultipleIndex-6 2001 1090 -45.53% BenchmarkMasterIndexLookupSingleIndexUnknown-6 492 215 -56.30% BenchmarkMasterIndexLookupMultipleIndexUnknown-6 1649 912 -44.69% benchmark old allocs new allocs delta BenchmarkMasterIndexLookupSingleIndex-6 9 1 -88.89% BenchmarkMasterIndexLookupMultipleIndex-6 19 1 -94.74% BenchmarkMasterIndexLookupSingleIndexUnknown-6 6 0 -100.00% BenchmarkMasterIndexLookupMultipleIndexUnknown-6 16 0 -100.00% benchmark old bytes new bytes delta BenchmarkMasterIndexLookupSingleIndex-6 160 96 -40.00% BenchmarkMasterIndexLookupMultipleIndex-6 240 96 -60.00% BenchmarkMasterIndexLookupSingleIndexUnknown-6 48 0 -100.00% BenchmarkMasterIndexLookupMultipleIndexUnknown-6 128 0 -100.00%	2018-01-23 22:28:38 -05:00
Matthew Dawson	df2c03a6a4	repository/master_index: Optimize Index.Lookup() When looking up a blob in the master index, with several indexes present in the master index, a significant amount of time is spent generating errors for each failed lookup. However, these errors are often used to check if a blob is present, but the contents are not inspected making the overhead of the error not useful. Instead, change Index.Lookup (and Index.LookupSize) to instead return a boolean denoting if the blob was found instead of an error. Also change all the calls to these functions to handle the new function signature. benchmark old ns/op new ns/op delta BenchmarkMasterIndexLookupSingleIndex-6 820 897 +9.39% BenchmarkMasterIndexLookupMultipleIndex-6 12821 2001 -84.39% BenchmarkMasterIndexLookupSingleIndexUnknown-6 5378 492 -90.85% BenchmarkMasterIndexLookupMultipleIndexUnknown-6 17026 1649 -90.31% benchmark old allocs new allocs delta BenchmarkMasterIndexLookupSingleIndex-6 9 9 +0.00% BenchmarkMasterIndexLookupMultipleIndex-6 59 19 -67.80% BenchmarkMasterIndexLookupSingleIndexUnknown-6 22 6 -72.73% BenchmarkMasterIndexLookupMultipleIndexUnknown-6 72 16 -77.78% benchmark old bytes new bytes delta BenchmarkMasterIndexLookupSingleIndex-6 160 160 +0.00% BenchmarkMasterIndexLookupMultipleIndex-6 3200 240 -92.50% BenchmarkMasterIndexLookupSingleIndexUnknown-6 1232 48 -96.10% BenchmarkMasterIndexLookupMultipleIndexUnknown-6 4272 128 -97.00%	2018-01-23 22:25:56 -05:00
Matthew Dawson	539599d1f1	repository/index: Optimize index.Has() When backing up several million files (>14M tested here) with few changes, a large amount of time is spent failing to find an id in an index and creating an error to signify this. Since this is checked using the Has method, which doesn't use this error, this time creating the error is wasted. Instead, directly check if the given id and type are present in the index. This also avoids reporting all the packs containing this blob, further reducing cpu usage.	2018-01-08 21:46:17 +01:00
Alexander Neumann	1eaad6cebb	index: Add TreePacks()	2017-09-24 21:54:53 +02:00
Alexander Neumann	23c903074c	Move restic package to internal/restic	2017-07-24 17:43:32 +02:00
Alexander Neumann	6caeff2408	Run goimports	2017-07-23 14:21:03 +02:00
Alexander Neumann	83d1a46526	Moves files	2017-07-23 14:19:13 +02:00

25 Commits