Index.Has() is a faster then Index.Lookup() for checking if a blob exists
in the index. As the returned data is never used, this avoids a ton
of allocations.
When looking up a blob in the master index, with several
indexes present in the master index, a significant amount of time
is spent generating errors for each failed lookup. However, these
errors are often used to check if a blob is present, but the contents
are not inspected making the overhead of the error not useful.
Instead, change Index.Lookup (and Index.LookupSize) to instead return
a boolean denoting if the blob was found instead of an error. Also change
all the calls to these functions to handle the new function signature.
benchmark old ns/op new ns/op delta
BenchmarkMasterIndexLookupSingleIndex-6 820 897 +9.39%
BenchmarkMasterIndexLookupMultipleIndex-6 12821 2001 -84.39%
BenchmarkMasterIndexLookupSingleIndexUnknown-6 5378 492 -90.85%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6 17026 1649 -90.31%
benchmark old allocs new allocs delta
BenchmarkMasterIndexLookupSingleIndex-6 9 9 +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6 59 19 -67.80%
BenchmarkMasterIndexLookupSingleIndexUnknown-6 22 6 -72.73%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6 72 16 -77.78%
benchmark old bytes new bytes delta
BenchmarkMasterIndexLookupSingleIndex-6 160 160 +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6 3200 240 -92.50%
BenchmarkMasterIndexLookupSingleIndexUnknown-6 1232 48 -96.10%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6 4272 128 -97.00%
At the moment when two items to be saved have the same directory name,
restic only saves the first one to the repo. Let's say we have a
structure like this:
dir1
└── subdir
└── file
dir2
└── subdir
└── file
When restic is run on `dir1/subdir` and `dir2/subdir`, it will only save
the first `subdir`:
$ restic backup dir1/subdir dir2/subdir
[...]
$ restic ls -l latest
drwxr-xr-x 1000 100 0 2017-08-27 20:56:39 /subdir
-rw-r--r-- 1000 100 17 2017-08-27 20:56:39 /subdir/file
That's obviously a bad thing, caused by an early decision to strip the
full path to the files/dirs to save and only leave the last directory.
This commit partly resolves this by handling colliding names and
resolving the conflicts. Restic will now append a counter to the file
(`-123`) until the conflict is resolved. So in the example above, we'll
end up with the following structure:
$ restic ls -l latest
drwxr-xr-x 1000 100 0 2017-08-27 20:56:39 /subdir
-rw-r--r-- 1000 100 17 2017-08-27 20:56:39 /subdir/file
drwxr-xr-x 1000 100 0 2017-08-27 20:56:46 /subdir-1
-rw-r--r-- 1000 100 17 2017-08-27 20:56:46 /subdir-1/file
This partly addresses #549 and closes#1179.
At first I thought that the obvious correction would be to archive the
full path. But it turns out that collisions may still occur: Suppose you
have a file named `foo` in the current directory, and the parent directory
also contains a file `foo`. Archiving these with restic also causes a
collision, since restic strips the `../` from the first file:
$ restic backup ../foo foo
This also happens with `tar`, which does not handle the collision and
will happily archive two files called `foo`.
So, the best way forward is to handle name collisions and archive the
whole path. The latter will be tackled in a separate PR.