Commit Graph

54 Commits

Author SHA1 Message Date
greatroar c4e2203e45 Check error in archiver before calling Select
The archiver first called the Select function for a path before checking
whether the Lstat on that path actually worked. The RejectFuncs in
exclude.go worked around this by checking whether they received a nil
os.FileInfo. Checking first is more obvious and requires less code.
2020-10-05 11:11:04 +02:00
Michael Eischer dc31529fc3 Unindent else block after if block ending with a return statement 2020-09-05 10:07:16 +02:00
Michael Eischer b25978a53c backup: Fix reporting of directory count in summary
Previously the directory stats were reported immediately after calling
`SaveDir`. However, as the latter method saves the tree asynchronously
the stats were still initialized to their nil value. The stats are now
reported via a callback similar to the one used for the fileSaver.
2020-08-27 22:43:51 +02:00
Alexander Weiss 9175795fdb Check contents in archiver
When backing up with a parent snapshot and the file is not changed, also
check if contents are still available in index.
2020-07-25 08:18:28 +02:00
Alexander Weiss 91906911b0 Fix non-intuitive repository behavior
- The SaveBlob method now checks for duplicates.
- Moves handling of pending blobs to MasterIndex.
  -> also cleans up pending index entries when they are saved in the index
  -> when using SaveBlob no need to care about index any longer
- Always check for full index and save it when storing packs.
  -> removes the need of an index uploader
  -> also removes the verbose "uploaded intermediate index" messages
- The Flush method now also saves the index
- Fix race condition when checking and saving full/non-finalized indexes
2020-06-11 13:05:23 +02:00
Martin Michlmayr 5cc1760fdf Fix typos 2020-05-16 14:05:26 +08:00
Alexander Neumann 95da6c1c1d Merge pull request #2589 from greatroar/no-stable-sort
Replace sort.Stable by sort.Strings
2020-03-01 19:40:28 +01:00
greatroar 79b882e901 Merge duplicated readdir functionality
internal/archiver.readdir and internal/fs.ReadDir were unused.

internal/fs.ReadDirNames and internal/archiver.readdirnames were doing
nearly the same thing, except one sorted its output and opened with
fs.O_NOFOLLOW. Both were only used in internal/archiver.
2020-02-26 11:05:38 +01:00
greatroar 3a6feb0596 Replace sort.Stable by sort.Strings
Calling the slow, O(n lg² n) sort.Stable is equivalent to sort.Strings
for a slice of unique strings.
2020-02-18 19:41:06 +01:00
greatroar 2f8aa2ce30 Remove unused fs.FS from archiver.FileSaver 2020-02-18 10:39:14 +01:00
rawtaz d8da9c4401 Merge pull request #2577 from alrs/fix-internal-errs
internal: Fix code and test dropped errors
2020-02-12 23:41:54 +01:00
Lars Lehtonen 72734d59b5 internal/archiver: fix dropped error 2020-02-12 13:37:37 -08:00
Michael Eischer a135699397 Close file if file type has changed after initial stat 2020-02-11 21:09:47 +01:00
Alexander Neumann 919dd2ac84 Merge pull request #2252 from restic/fix-2249
Read fresh metadata for unmodified files
2019-04-25 09:15:50 +02:00
Courtney Bane 35b7607802 Don't check ctime when ignoring inode. 2019-04-24 20:53:08 -05:00
Alexander Neumann 389067fb8b Only use list of blobs for old node
Closes #2249
2019-04-24 15:07:26 +02:00
Courtney Bane b8c2544dcb Examine file ctime when checking if files have changed. 2019-04-23 21:54:35 -05:00
Alexander Neumann 65b476ead9 Fix gofmt 2019-03-16 13:29:05 +01:00
Heiko Bornholdt db8f5864fc Add --ignore-inode option to backup cmd
revised version of https://github.com/restic/restic/pull/2047
2019-03-10 21:24:29 +01:00
Andreas Skielboe b07bb3d8c3 Reject files excluded by name before calling lstat to improve scan speed
Adds a SelectByName method to the archiver and scanner which only requires
the filename as input, and can thus be run before calling lstat on the
file. Can speed up scanning significantly if a lot of filename excludes
are used.
2018-08-12 17:51:12 +02:00
Alexander Neumann adb682bc43 archiver: Don't open files with O_NONBLOCK
This is not necessary any more: we're doing an lstat() before opening
an item, so we already know it's a file and not a pipe.
2018-05-20 16:11:51 +02:00
Alexander Neumann 1e9744c9a4 archiver: Refuse to save an empty snapshot 2018-05-20 16:11:51 +02:00
Alexander Neumann 347a645450 Fix double error message 2018-05-15 11:03:33 +02:00
Alexander Neumann 60ea2435be Improve error message for readdir/readdirnames
As mentioned in the forum[1], restic does not include the dir name when
readdir/readdirnames fails.

[1] https://forum.restic.net/t/readdirnames-readdirent-no-such-file-or-directory/653
2018-05-13 10:34:50 +02:00
Alexander Neumann e43c9202a6 archiver: Make sure backend error is passed up 2018-05-12 23:55:59 +02:00
Alexander Neumann c5e75d1c98 archiver: Add test for early abort on unhandled error 2018-05-12 23:55:59 +02:00
Alexander Neumann 526956af35 archiver: Read files/dirs in order 2018-05-12 23:55:54 +02:00
Alexander Neumann 256104111d archiver: Clarify names 2018-05-12 23:55:54 +02:00
Alexander Neumann 21c83b1725 archiver: Add high-level documentation 2018-05-12 23:55:54 +02:00
Alexander Neumann 581c62ee72 archiver: Improve error handling
This commit changes how the worker goroutines for saving e.g. blobs
interact. Before, it was possible to get stuck sending an instruction to
archive a file or dir when no worker goroutines were available any more.
This commit introduces a `done` channel for each of the worker pools,
which is set to the channel returned by `tomb.Dying()`, so it is closed
when the first worker returns an error.
2018-05-12 23:55:54 +02:00
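
The `done` channel pattern from this commit can be sketched without the tomb package. In this minimal example a plain channel stands in for the one returned by `tomb.Dying()`; `submit` and `errPoolClosed` are illustrative names, not restic identifiers.

```go
package main

import (
	"errors"
	"fmt"
)

var errPoolClosed = errors.New("worker pool is shutting down")

// submit hands a job to the worker pool. If done is closed (a worker has
// failed), it returns an error instead of blocking forever on the send.
func submit(jobs chan<- string, done <-chan struct{}, job string) error {
	select {
	case jobs <- job:
		return nil
	case <-done:
		return errPoolClosed
	}
}

func main() {
	jobs := make(chan string) // unbuffered, and no worker is receiving
	done := make(chan struct{})
	close(done) // simulates the first worker returning an error
	fmt.Println(submit(jobs, done, "file.txt"))
}
```

Without the `done` case, the send in `submit` would block indefinitely once all workers have exited.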
Alexander Neumann ca4af43c03 archiver: Return low-level errors
This commit changes the archiver so that low-level errors saving data to
the repo are returned to the caller (instead of being handled by the
error callback function). This correctly bubbles up errors like a full
temp file system, makes restic abort early, and makes all other worker
goroutines exit.
2018-05-10 21:30:09 +02:00
Alexander Neumann 2218ecd049 archiver: Use lstat before open/fstat
The previous code tried to be as efficient as possible and only do a
single open() on an item to save, and then fstat() on the fd to find out
what the item is (file, dir, other). For normal files, it would then
start reading the data without opening the file again, so it could not
be exchanged for e.g. a symlink.

This behavior starts the watchdog on my machine when /dev is saved
with restic, and after a few seconds, the machine reboots.

This commit reverts the behavior to the strategy the old archiver code
used: run lstat(), then decide what to do. For normal files, open the
file and then run fstat() on the fd to verify it's still a normal file,
then start reading the data.

The downside is that for normal files we now do two stat() calls
(lstat+fstat) instead of only one. On the upside, this does not start
the watchdog. :)
2018-05-01 23:05:50 +02:00
Alexander Neumann c83c03ed63 archiver: Fix blocking on pipes 2018-04-30 15:34:58 +02:00
Alexander Neumann 4e34325035 archiver: Process dirs concurrently 2018-04-30 15:13:28 +02:00
Alexander Neumann 78bd591c7c archiver: Improve buffer pool 2018-04-30 15:13:28 +02:00
Alexander Neumann 400730afca archiver: Improve memory usage, tune buffer pool 2018-04-30 14:19:07 +02:00
Alexander Neumann f279731168 Add new archiver code 2018-04-25 14:42:45 +02:00
Alexander Neumann fd12a3af20 Remove old archiver code 2018-04-23 21:40:33 +02:00
Alexander Neumann 4e0b2a8e3a snapshot: correct error handling for filepath.Abs 2018-04-22 11:37:05 +02:00
Alexander Neumann f99c95c766 archiver: Fix intermediate index upload
A user discovered[1] that when the backup finishes during the upload of
an intermediate index, the upload is cancelled and the index is never fully
saved, but the snapshot is saved and the backup finalizes without an
error. This led to a situation where a snapshot references data that is
contained in the repo, but not referenced in any index, leading to
strange error messages.

This commit uses a dedicated context to signal the intermediate index
uploading routine to terminate after the last index has been uploaded.
This way, an upload running when the backup finishes is completed before
the routine terminates and the snapshot is saved.

[1] https://forum.restic.net/t/error-loading-tree-check-prune-and-forget-gives-error-b2-backend/406
2018-01-26 22:01:07 +01:00
Alexander Neumann 663c57ab4d debug: Remove manual Str() call Log() 2018-01-25 20:49:41 +01:00
Matthew Dawson 3a16148447 archiver/archiver: Use Index.Has() instead of Index.Lookup() in isKnownBlob
Index.Has() is faster than Index.Lookup() for checking if a blob exists
in the index.  As the returned data is never used, this avoids a ton
of allocations.
2018-01-23 22:26:10 -05:00
Matthew Dawson df2c03a6a4 repository/master_index: Optimize Index.Lookup()
When looking up a blob in the master index, with several
indexes present in the master index, a significant amount of time
is spent generating errors for each failed lookup.  However, these
errors are often used only to check if a blob is present; their contents
are not inspected, so the overhead of creating the error is wasted.

Change Index.Lookup (and Index.LookupSize) to return a boolean denoting
whether the blob was found instead of an error.  Also change
all the calls to these functions to handle the new function signature.

benchmark                                            old ns/op     new ns/op     delta
BenchmarkMasterIndexLookupSingleIndex-6              820           897           +9.39%
BenchmarkMasterIndexLookupMultipleIndex-6            12821         2001          -84.39%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       5378          492           -90.85%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     17026         1649          -90.31%

benchmark                                            old allocs     new allocs     delta
BenchmarkMasterIndexLookupSingleIndex-6              9              9              +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6            59             19             -67.80%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       22             6              -72.73%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     72             16             -77.78%

benchmark                                            old bytes     new bytes     delta
BenchmarkMasterIndexLookupSingleIndex-6              160           160           +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6            3200          240           -92.50%
BenchmarkMasterIndexLookupSingleIndexUnknown-6       1232          48            -96.10%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6     4272          128           -97.00%
2018-01-23 22:25:56 -05:00
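
The signature change can be illustrated with simplified stand-in types. `blobID`, `index`, and `masterIndex` below are assumptions made for this sketch and do not match restic's real definitions.

```go
package main

import "fmt"

// blobID and index are simplified stand-ins, not restic's real types.
type blobID string

// index maps a blob ID to its size.
type index map[blobID]uint

// lookupSize reports a miss through the boolean result, so no error value
// has to be allocated for blobs that are simply not in this index.
func (idx index) lookupSize(id blobID) (size uint, found bool) {
	size, found = idx[id]
	return size, found
}

// masterIndex queries several indexes in turn.
type masterIndex []index

func (mi masterIndex) lookupSize(id blobID) (uint, bool) {
	for _, idx := range mi {
		if size, found := idx.lookupSize(id); found {
			return size, true
		}
	}
	return 0, false
}

func main() {
	mi := masterIndex{index{"a": 10}, index{"b": 20}}
	size, found := mi.lookupSize("b")
	fmt.Println(size, found)
	size, found = mi.lookupSize("missing")
	fmt.Println(size, found)
}
```

With an error-returning lookup, every miss in every sub-index builds an error that the caller discards, which is consistent with the multiple-index and unknown-blob benchmarks improving the most.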
George Armhold d886cb5c27 replace ad-hoc context.TODO() with gopts.ctx, so that cancellation
can properly trickle down from cmd_*.

gh-1434
2017-12-03 07:22:14 -05:00
Alexander Neumann eddb8549ef backup: By default, do not save the access time
This can be re-enabled with `--with-atime`.
2017-11-28 21:31:35 +01:00
Alexander Neumann ce180de9b8 Merge pull request #1243 from restic/improve-error-reporting
Improve error reporting
2017-09-16 14:54:30 +02:00
Alexander Neumann d4e994de7b Improve error reporting
This will print the error (including a stack trace) if available before
exiting.
2017-09-16 10:55:13 +02:00
Alexander Neumann a60e751217 Use .Equal() instead of == for time.Time
Closes #1238
2017-09-15 20:57:35 +02:00
Tobias Klein 43ff971dfd new sub-option for backup: time
New option to specify the timestamp for a backup
2017-09-09 13:26:35 +02:00
Alexander Neumann 83eb075e3a Resolve name collisions
At the moment when two items to be saved have the same directory name,
restic only saves the first one to the repo. Let's say we have a
structure like this:

    dir1
    └── subdir
        └── file
    dir2
    └── subdir
        └── file

When restic is run on `dir1/subdir` and `dir2/subdir`, it will only save
the first `subdir`:

    $ restic backup dir1/subdir dir2/subdir
    [...]

    $ restic ls -l latest
    drwxr-xr-x  1000   100      0 2017-08-27 20:56:39 /subdir
    -rw-r--r--  1000   100     17 2017-08-27 20:56:39 /subdir/file

That's obviously a bad thing, caused by an early decision to strip the
full path to the files/dirs to save and only leave the last directory.

This commit partly resolves this by handling colliding names and
resolving the conflicts. Restic will now append a counter to the file
(`-123`) until the conflict is resolved. So in the example above, we'll
end up with the following structure:

    $ restic ls -l latest
    drwxr-xr-x  1000   100      0 2017-08-27 20:56:39 /subdir
    -rw-r--r--  1000   100     17 2017-08-27 20:56:39 /subdir/file
    drwxr-xr-x  1000   100      0 2017-08-27 20:56:46 /subdir-1
    -rw-r--r--  1000   100     17 2017-08-27 20:56:46 /subdir-1/file

This partly addresses #549 and closes #1179.

At first I thought that the obvious correction would be to archive the
full path. But it turns out that collisions may still occur: Suppose you
have a file named `foo` in the current directory, and the parent directory
also contains a file `foo`. Archiving these with restic also causes a
collision, since restic strips the `../` from the first file:

    $ restic backup ../foo foo

This also happens with `tar`, which does not handle the collision and
will happily archive two files called `foo`.

So, the best way forward is to handle name collisions and archive the
whole path. The latter will be tackled in a separate PR.
2017-09-05 21:47:02 +02:00
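
The counter-based conflict resolution described in this commit could be sketched like this. `resolveNames` is a hypothetical helper, and the real implementation may choose names differently.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveNames maps each target to a unique top-level name. The base name
// is used when it is free; otherwise a counter is appended ("-1", "-2",
// ...) until the name no longer collides.
func resolveNames(targets []string) []string {
	used := make(map[string]bool)
	names := make([]string, 0, len(targets))
	for _, t := range targets {
		base := filepath.Base(t)
		name := base
		for i := 1; used[name]; i++ {
			name = fmt.Sprintf("%s-%d", base, i)
		}
		used[name] = true
		names = append(names, name)
	}
	return names
}

func main() {
	fmt.Println(resolveNames([]string{"dir1/subdir", "dir2/subdir"}))
	fmt.Println(resolveNames([]string{"../foo", "foo"}))
}
```

The second call covers the `../foo` and `foo` case from the message, where archiving the full path alone would not avoid the collision.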