From dd1023df44db2f348f59991432e61ce4db507dbf Mon Sep 17 00:00:00 2001 From: Manu Date: Wed, 9 Mar 2022 14:06:40 +0400 Subject: [PATCH] Add discussion and note on file attrs. --- docs/faq.rst | 58 ++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 49 insertions(+), 9 deletions(-) diff --git a/docs/faq.rst b/docs/faq.rst index f678c3d83..020e11ffd 100644 --- a/docs/faq.rst +++ b/docs/faq.rst @@ -880,13 +880,30 @@ If you run into that, try this: What's the expected backup performance? --------------------------------------- -A first backup will usually be somehow "slow" because there is a lot of data -to process. Performance here depends on a lot of factors, so it is hard to -give specific numbers. +Compared to simply copying files (e.g. with ``rsync``), Borg has more work to do. +This can make creation of the first archive slower, but saves time +and disk space on subsequent runs. Here what Borg does when you run ``borg create``: + +- Borg chunks the file (using the relatively expensive buzhash algorithm) +- It then computes the "id" of the chunk (hmac-sha256 (often slow, except + if your CPU has sha256 acceleration) or blake2b (fast, in software)) +- Then it checks whether this chunk is already in the repo (local hashtable lookup, + fast). If so, the processing of the chunk is completed here. Otherwise it needs to + process the chunk: +- Compresses the chunks (the default lz4 is super fast) +- Encrypts (AES, usually fast if your CPU has AES acceleration as usual + since about 10y) +- Authenticates ("signs") using hmac-sha256 or blake2b (see above), +- Transmits to repo (remote repo), usually involving an SSH connection + (does its own encryption) +- Stores the chunk into a key/value store (the key is the chunk id, the value + is the data). While doing that, it computes a CRC32 of the data (repo low-level + checksum, used by borg check --repository) and also updates the repo index + (another hashtable). Subsequent backups are usually very fast if most files are unchanged and only a few are new or modified. The high performance on unchanged files primarily depends -only on a few factors (like fs recursion + metadata reading performance and the +only on a few factors (like FS recursion + metadata reading performance and the files cache working as expected) and much less on other factors. E.g., for this setup: @@ -904,14 +921,37 @@ few FAQ entries below. .. _slow_backup: -Why is backup slow for me? +Why is my backup so slow? -------------------------- -So, if you feel your Borg backup is too slow somehow, you should find out why. +If you feel your Borg backup is too slow somehow, here is what you can do: -The usual way to approach this is to add ``--list --filter=AME --stats`` to your -``borg create`` call to produce more log output, including a file list (with file status -characters) and also some statistics at the end of the backup. +- Make sure Borg has enough RAM (depends on how big your repo is / how many + files you have) +- Use one of the blake2 modes for --encryption except if you positively know + your CPU (and openssl) accelerates sha256 (then stay with hmac-sha256). +- Don't use any expensive compression. The default is lz4 and super fast. + Uncompressed is often slower than lz4. +- Just wait. You can also interrupt it and start it again as often as you like, + it will converge against a valid "completed" state (see ``--checkpoint-interval``, + maybe use the default, but in any case don't make it too short). It is starting + from the beginning each time, but it is still faster then as it does not store + data into the repo which it already has there from last checkpoint. +- If you don’t need additional file attributes, you can disable them with ``--noflags``, + ``--noacls``, ``--noxattrs``. This can lead to noticable performance improvements + when your backup consists of many small files. + +If you feel that Borg "freezes" on a file, it could be in the middle of processing a +large file (like ISOs or VM images). Borg < 1.2 announces file names *after* finishing +with the file. This can lead to displaying the name of a small file, while processing the +next (larger) file. For very big files this can lead to the progress display show some +previous short file for a long time while it processes the big one. With Borg 1.2 this +was changed to announcing the filename before starting to process it. + +To see what files have changed and take more time processing, you can also add +``--list --filter=AME --stats`` to your ``borg create`` call to produce more log output, +including a file list (with file status characters) and also some statistics at +the end of the backup. Then you do the backup and look at the log output: