mirror of
https://github.com/borgbackup/borg.git
synced 2025-01-20 14:29:25 +00:00
be clear about what buzhash is used for, fixes #2390
and what it is not used for (deduplication). also say already in the readme that we use a cryptographic hash for dedupe, so people don't worry.
parent 6f47b797f9
commit bf69b049e9
2 changed files with 12 additions and 0 deletions
@@ -27,6 +27,10 @@ Main features
 of bytes stored: each file is split into a number of variable length chunks
 and only chunks that have never been seen before are added to the repository.
 
+A chunk is considered duplicate if its id_hash value is identical.
+A cryptographically strong hash or MAC function is used as id_hash, e.g.
+(hmac-)sha256.
+
 To deduplicate, all the chunks in the same repository are considered, no
 matter whether they come from different machines, from previous backups,
 from the same backup or even from the same single file.
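The deduplication rule added in this hunk can be illustrated with a minimal sketch. This is not Borg's code: the names `id_hash` and `store_chunks` and the dict standing in for a repository are hypothetical, and the sketch assumes plain SHA-256 when no key is given and HMAC-SHA256 otherwise.

```python
import hashlib
import hmac
from typing import Dict, Iterable, Optional


def id_hash(chunk: bytes, key: Optional[bytes] = None) -> bytes:
    """Return a chunk id: HMAC-SHA256 if a key is given, else plain SHA-256.

    Hypothetical helper for illustration only.
    """
    if key is None:
        return hashlib.sha256(chunk).digest()
    return hmac.new(key, chunk, hashlib.sha256).digest()


def store_chunks(chunks: Iterable[bytes], repo: Dict[bytes, bytes],
                 key: Optional[bytes] = None) -> int:
    """Add only chunks whose id is not yet in `repo`; return how many were written.

    `repo` is a plain dict standing in for the repository.
    """
    written = 0
    for chunk in chunks:
        cid = id_hash(chunk, key)
        if cid not in repo:  # same id_hash value means duplicate, so skip it
            repo[cid] = chunk
            written += 1
    return written
```

Calling `store_chunks([b"a", b"b", b"a"], {})` writes two chunks, because the third has the same id as the first. The lookup is by id only, which is why the id function has to be collision resistant.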
@@ -69,6 +69,9 @@ Normally the keys are computed like this::
 
 The id_hash function depends on the :ref:`encryption mode <borg_init>`.
 
+As the id / key is used for deduplication, id_hash must be a cryptographically
+strong hash or MAC.
+
 Segments
 ~~~~~~~~
 
@@ -243,6 +246,11 @@ The |project_name| chunker uses a rolling hash computed by the Buzhash_ algorith
 It triggers (chunks) when the last HASH_MASK_BITS bits of the hash are zero,
 producing chunks of 2^HASH_MASK_BITS Bytes on average.
 
+Buzhash is **only** used for cutting the chunks at places defined by the
+content, the buzhash value is **not** used as the deduplication criteria (we
+use a cryptographically strong hash/MAC over the chunk contents for this, the
+id_hash).
+
 ``borg create --chunker-params CHUNK_MIN_EXP,CHUNK_MAX_EXP,HASH_MASK_BITS,HASH_WINDOW_SIZE``
 can be used to tune the chunker parameters, the default is:
 