mirror of https://github.com/borgbackup/borg.git
be clear about what buzhash is used for, fixes #2390
and what it is not used for (deduplication).
also say already in the readme that we use a cryptohash
for dedupe, so people don't worry.
(cherry picked from commit bf69b049e9)
parent a87f35b0cc
commit 0de22594a6
@@ -27,6 +27,10 @@ Main features
of bytes stored: each file is split into a number of variable length chunks
and only chunks that have never been seen before are added to the repository.

A chunk is considered duplicate if its id_hash value is identical.
A cryptographically strong hash or MAC function is used as id_hash, e.g.
(hmac-)sha256.

To deduplicate, all the chunks in the same repository are considered, no
matter whether they come from different machines, from previous backups,
from the same backup or even from the same single file.
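
The deduplication rule in the hunk above can be illustrated with a minimal sketch. It assumes plain SHA-256 as the id_hash and uses a hypothetical in-memory `ChunkStore`; neither the class nor its `add` method is part of borg, and borg's real repository format is not modelled here.

```python
import hashlib


def id_hash(chunk):
    # Assumption: plain SHA-256 over the chunk contents stands in for id_hash.
    return hashlib.sha256(chunk).digest()


class ChunkStore:
    """Hypothetical in-memory repository keyed by id_hash (not borg's storage)."""

    def __init__(self):
        self.chunks = {}

    def add(self, chunk):
        """Store the chunk if it is unseen; return True when it was newly added."""
        key = id_hash(chunk)
        if key in self.chunks:
            # Same id_hash value: treated as a duplicate, nothing is stored.
            return False
        self.chunks[key] = chunk
        return True


store = ChunkStore()
# The third chunk repeats the first one, no matter where it came from.
results = [store.add(c) for c in (b"alpha", b"beta", b"alpha")]
```

Because the lookup key depends only on the chunk contents, it makes no difference whether a repeated chunk comes from another machine, an earlier backup, or the same file.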
@@ -96,6 +96,8 @@ The id_hash function is:
* sha256 (no encryption keys available)
* hmac-sha256 (encryption keys available)

As the id / key is used for deduplication, id_hash must be a cryptographically
strong hash or MAC.

Segments and archives
---------------------
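
A rough sketch of the two id_hash variants listed in the hunk above, using only the Python standard library. The function signature and the `id_key` parameter name are assumptions made for illustration; how borg derives and stores its key material is not shown.

```python
import hashlib
import hmac
from typing import Optional


def id_hash(chunk: bytes, id_key: Optional[bytes] = None) -> bytes:
    """Return a 32-byte chunk id.

    Assumption: `id_key` is None when no encryption keys are available.
    """
    if id_key is None:
        # No keys available: plain SHA-256 over the chunk contents.
        return hashlib.sha256(chunk).digest()
    # Keys available: HMAC-SHA256 keyed with id_key.
    return hmac.new(id_key, chunk, hashlib.sha256).digest()


plain_id = id_hash(b"some chunk")
keyed_id = id_hash(b"some chunk", id_key=b"hypothetical key material")
```

With a key, the same chunk contents map to a different id than without one, and two different keys also give different ids.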
@@ -233,6 +235,11 @@ The |project_name| chunker uses a rolling hash computed by the Buzhash_ algorith
It triggers (chunks) when the last HASH_MASK_BITS bits of the hash are zero,
producing chunks of 2^HASH_MASK_BITS Bytes on average.

Buzhash is **only** used for cutting the chunks at places defined by the
content, the buzhash value is **not** used as the deduplication criteria (we
use a cryptographically strong hash/MAC over the chunk contents for this, the
id_hash).

``borg create --chunker-params CHUNK_MIN_EXP,CHUNK_MAX_EXP,HASH_MASK_BITS,HASH_WINDOW_SIZE``
can be used to tune the chunker parameters, the default is:
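
The cutting rule described in the hunk above can be sketched as follows. This is a buzhash-style cyclic polynomial hash with a made-up lookup table, not borg's chunker: the function name `chunkify`, the seeded table, the small default parameters, and the choice to keep rolling the hash across chunk boundaries are all assumptions for illustration. The parameters only mirror the roles of CHUNK_MIN_EXP, CHUNK_MAX_EXP, HASH_MASK_BITS and HASH_WINDOW_SIZE.

```python
import random

MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    # Rotate a 32-bit value left by r bits.
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK32


def chunkify(data, min_exp=6, max_exp=12, mask_bits=8, window=16, seed=0):
    """Split data at content-defined positions (illustrative values only)."""
    rng = random.Random(seed)
    table = [rng.getrandbits(32) for _ in range(256)]  # hypothetical table
    mask = (1 << mask_bits) - 1
    min_size, max_size = 1 << min_exp, 1 << max_exp
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Roll the new byte into the hash ...
        h = _rotl32(h, 1) ^ table[byte]
        if i >= window:
            # ... and roll the byte that left the window back out.
            h ^= _rotl32(table[data[i - window]], window)
        size = i + 1 - start
        # Trigger when the last mask_bits bits of the hash are zero,
        # subject to the minimum and maximum chunk size.
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


data_rng = random.Random(1)
data = bytes(data_rng.getrandbits(8) for _ in range(20000))
chunks = chunkify(data)
```

The rolling hash only decides where to cut. Each resulting chunk would then be identified by its id_hash, and that value, not the rolling hash, is what deduplication compares.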