docs: how to approach borg speed issues, fixes #5371

also: give a speed example for unchanged files

also: explicitly mention chmod/chown changing ctime

also: reformat one paragraph
Thomas Waldmann 2020-10-03 22:55:14 +02:00 committed by Milkey Mouse
parent 0e8e614ea9
commit 60ee199a0f
2 changed files with 62 additions and 0 deletions


@ -798,6 +798,67 @@ If you run into that, try this:
the parent directory (or even everything)
- mount the repo using FUSE and use some file manager
.. _expected_performance:
What's the expected backup performance?
---------------------------------------
A first backup will usually be somewhat "slow" because there is a lot of data
to process. Performance here depends on a lot of factors, so it is hard to
give specific numbers.
Subsequent backups are usually very fast if most files are unchanged and only
a few are new or modified. The high performance on unchanged files depends
primarily on just a few factors (such as filesystem recursion and metadata reading
performance, and the files cache working as expected) and much less on other factors.
E.g., for this setup:
- server-grade machine (4C/8T 2013 Xeon, 64GB RAM, 2x good 7200RPM disks)
- local zfs filesystem (mirrored) containing the backup source data
- repository is remote (does not matter much for unchanged files)
- backup job runs while machine is otherwise idle
The observed performance is that |project_name| can process about
**1 million unchanged files (and a few small changed ones) in 4 minutes!**
If you are seeing much lower throughput than that in similar circumstances, read
the next few FAQ entries below.
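To compare your own run against the figure above, it can help to express both as
files per second. The following is only a rough sketch: the function names are
hypothetical, and the reference values are simply the "1 million files in 4 minutes"
observation quoted above, not a guaranteed benchmark.

```python
# Reference observation from the FAQ entry above (not a guarantee).
REFERENCE_FILES = 1_000_000
REFERENCE_SECONDS = 4 * 60


def files_per_second(n_files, seconds):
    """Return the average number of files processed per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n_files / seconds


def relative_to_reference(n_files, seconds):
    """Return your throughput as a fraction of the reference throughput."""
    reference = files_per_second(REFERENCE_FILES, REFERENCE_SECONDS)
    return files_per_second(n_files, seconds) / reference


if __name__ == "__main__":
    # Hypothetical example: 300,000 files processed in 10 minutes.
    print(f"{files_per_second(300_000, 600):.0f} files/s")
    print(f"{relative_to_reference(300_000, 600):.2f} x reference")
```

A ratio far below 1.0 on comparable hardware suggests that many files are not being
treated as unchanged.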
.. _slow_backup:
Why is backup slow for me?
--------------------------
If you feel your |project_name| backup is too slow, the first step is to find out why.
The usual way to approach this is to add ``--list --filter=AME --stats`` to your
``borg create`` call to produce more log output, including a file list (with file status
characters) and also some statistics at the end of the backup.
Then run the backup and look at the log output:
- stats: Do you really have only a few changes, or are there more changes than you thought?
In the stats you can see the overall volume of changed data that needed to be
added to the repo. If that is a lot, that can be the reason why it is slow.
- ``A`` status ("added") in the file list:
If you see that often, you have a lot of new files (files that |project_name| did not find
in the files cache). If you think there is something wrong with that (the file was there
already in the previous backup), please read the FAQ entries below.
- ``M`` status ("modified") in the file list:
If you see that often, |project_name| thinks that a lot of your files might be modified
(|project_name| found them in the files cache, but the metadata read from the filesystem did
not match the metadata stored in the files cache).
In such a case, |project_name| will need to process the files' contents completely, which is
much slower than processing unmodified files (|project_name| does not read their contents!).
The metadata values used in this comparison are determined by the ``--files-cache`` option
and could be e.g. size, ctime and inode number (see the ``borg create`` docs for more
details and potential issues).
You can use the ``stat`` command on files to look at the fs metadata manually and check
whether an unexpected change is triggering the ``M`` status.
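The metadata comparison described above can be reproduced by hand. The sketch below is
not |project_name|'s implementation: ``cache_key``, ``changed_fields`` and ``chmod_demo``
are hypothetical helpers, and the (size, ctime, inode) triple is just the example
combination mentioned above. It records the triple, runs ``chmod`` on an otherwise
untouched file, and reports which fields differ afterwards.

```python
import os
import tempfile
import time


def cache_key(path):
    """Return the (size, ctime in ns, inode number) triple for path."""
    st = os.stat(path)
    return (st.st_size, st.st_ctime_ns, st.st_ino)


def changed_fields(before, after):
    """Return the names of the fields that differ between two triples."""
    names = ("size", "ctime", "inode")
    return [name for name, b, a in zip(names, before, after) if b != a]


def chmod_demo():
    """Create a temp file, chmod it, and report which fields changed."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"unchanged content")
        path = f.name
    try:
        before = cache_key(path)
        # Wait a little so a ctime update is visible despite coarse
        # timestamp granularity on some filesystems.
        time.sleep(0.05)
        os.chmod(path, 0o640)  # metadata-only change, contents untouched
        after = cache_key(path)
        return changed_fields(before, after)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(chmod_demo())
```

On a typical POSIX filesystem only the ctime is expected to differ, which would be enough
to cause an ``M`` status when ctime is part of the comparison. On platforms where
``st_ctime`` means creation time, the result will differ.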
See also the next few FAQ entries for more details.
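For long file lists, counting the status characters gives a quick overview before reading
individual lines. This is a minimal sketch with a hypothetical helper, ``count_statuses``.
It assumes each file line of the ``--list`` output starts with a single status letter
followed by a space and the path; any line that does not match that shape (such as the
statistics block) is ignored.

```python
from collections import Counter


def count_statuses(lines):
    """Count leading status letters in file-list lines.

    Assumed line format: one status letter, one space, then the path.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if len(line) > 2 and line[0].isalpha() and line[1] == " ":
            counts[line[0]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; in practice, read the saved log of a backup run.
    sample = [
        "A home/user/new.txt",
        "M home/user/edited.txt",
        "M home/user/app.log",
        "E home/user/locked.db",
    ]
    for status, n in sorted(count_statuses(sample).items()):
        print(status, n)
```

A high ``A`` or ``M`` count relative to the total number of files points to the files
cache not recognizing files as unchanged.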
.. _a_status_oddity:
I am seeing 'A' (added) status for an unchanged file!?


@ -3079,6 +3079,7 @@ class Archiver:
as it can not be set from userspace. But, a metadata-only change will already
update the ctime, so there might be some unnecessary chunking/hashing even
without content changes. Some filesystems do not support ctime (change time).
E.g. doing a chown or chmod to a file will change its ctime.
- mtime usually works and only updates if file contents were changed. But mtime
can be arbitrarily set from userspace, e.g. to set mtime back to the same value
it had before a content change happened. This can be used maliciously as well as