update docs

This commit is contained in:
Thomas Waldmann 2022-09-08 21:52:32 +02:00
parent 5afe94883a
commit c8830cde44
3 changed files with 46 additions and 17 deletions

View File

@ -77,7 +77,7 @@ don't have a particular meaning (except for the Manifest_).
Normally the keys are computed like this::
key = id = id_hash(unencrypted_data)
key = id = id_hash(plaintext_data) # plain = not encrypted, not compressed, not obfuscated
The id_hash function depends on the :ref:`encryption mode <borg_rcreate>`.
@ -98,15 +98,15 @@ followed by a number of log entries. Each log entry consists of (in this order):
* crc32 checksum (uint32):
- for PUT2: CRC32(size + tag + key + digest)
- for PUT: CRC32(size + tag + key + data)
- for PUT: CRC32(size + tag + key + payload)
- for DELETE: CRC32(size + tag + key)
- for COMMIT: CRC32(size + tag)
* size (uint32) of the entry (including the whole header)
* tag (uint8): PUT(0), DELETE(1), COMMIT(2) or PUT2(3)
* key (256 bit) - only for PUT/PUT2/DELETE
* data (size - 41 bytes) - only for PUT
* xxh64 digest (64 bit) = XXH64(size + tag + key + data) - only for PUT2
* data (size - 41 - 8 bytes) - only for PUT2
* payload (size - 41 bytes) - only for PUT
* xxh64 digest (64 bit) = XXH64(size + tag + key + payload) - only for PUT2
* payload (size - 41 - 8 bytes) - only for PUT2
PUT2 is new since repository version 2. For new log entries PUT2 is used.
PUT is still supported to read version 1 repositories, but not generated any more.
@ -116,7 +116,7 @@ version 2+.
Those files are strictly append-only and modified only once.
When an object is written to the repository a ``PUT`` entry is written
to the file containing the object id and data. If an object is deleted
to the file containing the object id and payload. If an object is deleted
a ``DELETE`` entry is appended with the object id.
A ``COMMIT`` tag is written when a repository transaction is
@ -130,13 +130,42 @@ partial/uncommitted transaction.
The size of individual segments is limited to 4 GiB, since the offset of entries
within segments is stored in a 32-bit unsigned integer in the repository index.
Objects
~~~~~~~
Objects / Payload structure
~~~~~~~~~~~~~~~~~~~~~~~~~~~
All data (the manifest, archives, archive item stream chunks and file data
chunks) is compressed, optionally obfuscated and encrypted. This produces some
additional metadata (size and compression information), which is separately
serialized and also encrypted.
See :ref:`data-encryption` for a graphic outlining the anatomy of the encryption in Borg.
What you see at the bottom there is done twice: once for the data and once for the metadata.
An object (the payload part of a segment file log entry) must be like:
- length of encrypted metadata (16bit unsigned int)
- encrypted metadata (incl. encryption header), when decrypted:
- msgpacked dict with:
- ctype (compression type 0..255)
- clevel (compression level 0..255)
- csize (overall compressed (and maybe obfuscated) data size)
- psize (only when obfuscated: payload size without the obfuscation trailer)
- size (uncompressed size of the data)
- encrypted data (incl. encryption header), when decrypted:
- compressed data (with an optional all-zero-bytes obfuscation trailer)
This new, more complex repo v2 object format was implemented to be able to efficiently
query the metadata without having to read, transfer and decrypt the (usually much bigger)
data part.
The metadata is encrypted to not disclose potentially sensitive information that could be
used for e.g. fingerprinting attacks.
The compression `ctype` and `clevel` is explained in :ref:`data-compression`.
All objects (the manifest, archives, archive item streams chunks and file data
chunks) are encrypted and/or compressed. See :ref:`data-encryption` for a
graphic outlining the anatomy of an object in Borg. The `type` for compression
is explained in :ref:`data-compression`.
Index, hints and integrity
~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -855,7 +884,7 @@ For each borg invocation, a new sessionkey is derived from the borg key material
and the 48bit IV starts from 0 again (both ciphers internally add a 32bit counter
to our IV, so we'll just count up by 1 per chunk).
The chunk layout is best seen at the bottom of this diagram:
The encryption layout is best seen at the bottom of this diagram:
.. figure:: encryption-aead.png
:figwidth: 100%
@ -954,14 +983,14 @@ representation of the repository id.
Compression
-----------
Borg supports the following compression methods, each identified by a type
byte:
Borg supports the following compression methods, each identified by a ctype value
in the range between 0 and 255 (and augmented by a clevel 0..255 value for the
compression level):
- none (no compression, pass through data 1:1), identified by 0x00
- lz4 (low compression, but super fast), identified by 0x01
- zstd (level 1-22 offering a wide range: level 1 is lower compression and high
speed, level 22 is higher compression and lower speed) - since borg 1.1.4,
identified by 0x03
speed, level 22 is higher compression and lower speed) - identified by 0x03
- zlib (level 0-9, level 0 is no compression [but still adding zlib overhead],
level 1 is low, level 9 is high compression), identified by 0x05
- lzma (level 0-9, level 0 is low, level 9 is high compression), identified

Binary file not shown.

Binary file not shown.

Before

Width:  |  Height:  |  Size: 136 KiB

After

Width:  |  Height:  |  Size: 146 KiB