mirror of https://github.com/borgbackup/borg.git
be clear about what buzhash is used for, fixes #2390
and what it is not used for (deduplication).
also say already in the readme that we use a cryptohash
for dedupe, so people don't worry.
(cherry picked from commit bf69b049e9)
parent a87f35b0cc
commit 0de22594a6
@@ -27,6 +27,10 @@ Main features
 of bytes stored: each file is split into a number of variable length chunks
 and only chunks that have never been seen before are added to the repository.
 
+A chunk is considered duplicate if its id_hash value is identical.
+A cryptographically strong hash or MAC function is used as id_hash, e.g.
+(hmac-)sha256.
+
 To deduplicate, all the chunks in the same repository are considered, no
 matter whether they come from different machines, from previous backups,
 from the same backup or even from the same single file.
@@ -96,6 +96,8 @@ The id_hash function is:
 * sha256 (no encryption keys available)
 * hmac-sha256 (encryption keys available)
 
+As the id / key is used for deduplication, id_hash must be a cryptographically
+strong hash or MAC.
 
 Segments and archives
 ---------------------
@@ -233,6 +235,11 @@ The |project_name| chunker uses a rolling hash computed by the Buzhash_ algorith
 It triggers (chunks) when the last HASH_MASK_BITS bits of the hash are zero,
 producing chunks of 2^HASH_MASK_BITS Bytes on average.
 
+Buzhash is **only** used for cutting the chunks at places defined by the
+content, the buzhash value is **not** used as the deduplication criteria (we
+use a cryptographically strong hash/MAC over the chunk contents for this, the
+id_hash).
+
 ``borg create --chunker-params CHUNK_MIN_EXP,CHUNK_MAX_EXP,HASH_MASK_BITS,HASH_WINDOW_SIZE``
 can be used to tune the chunker parameters, the default is:
 
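
The split of roles described in this change (a cheap hash decides where to cut, a cryptographic hash or MAC decides what counts as a duplicate) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not borg's implementation: `toy_window_hash` is a hypothetical stand-in for Buzhash that recomputes a simple polynomial hash per window instead of rolling it, the constants are toy values chosen for the demo, minimum and maximum chunk sizes are omitted, and `cut_chunks`, `id_hash` and `store` are names invented for this example.

```python
import hashlib
import hmac

# Toy values for the demo only; these are NOT borg's defaults.
HASH_MASK_BITS = 4
HASH_WINDOW_SIZE = 8


def toy_window_hash(window: bytes) -> int:
    """Hypothetical stand-in for Buzhash: a plain polynomial hash of the window."""
    h = 0
    for b in window:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h


def cut_chunks(data: bytes, mask_bits: int = HASH_MASK_BITS,
               window: int = HASH_WINDOW_SIZE) -> list:
    """Cut where the last mask_bits bits of the window hash are zero."""
    mask = (1 << mask_bits) - 1
    chunks = []
    start = 0
    for end in range(1, len(data) + 1):
        if end - start < window:
            continue  # not enough bytes in the current chunk to fill the window
        if toy_window_hash(data[end - window:end]) & mask == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder becomes the last chunk
    return chunks


def id_hash(chunk: bytes, key: bytes = None) -> bytes:
    """sha256 without a key, hmac-sha256 with a key (as listed in the diff above)."""
    if key is None:
        return hashlib.sha256(chunk).digest()
    return hmac.new(key, chunk, hashlib.sha256).digest()


def store(data: bytes, repo: dict, key: bytes = None) -> int:
    """Add only chunks whose id_hash is not in repo yet; return how many were new."""
    added = 0
    for chunk in cut_chunks(data):
        cid = id_hash(chunk, key)
        if cid not in repo:  # the dedup decision uses id_hash, never the cut hash
            repo[cid] = chunk
            added += 1
    return added
```

Calling `store` twice with the same data and the same `repo` dict adds nothing the second time, because every chunk's `id_hash` is already present. The value of the cutting hash never reaches `repo`, so a collision in it can only shift a chunk boundary, not cause two different chunks to be treated as identical.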