docs: provide more details on object layout

While writing my own out-of-band decoder, I had a hard time figuring out
how to unpack the manifest. From the description, I could only tell that
the manifest is msgpack'd; I could not work out that it also goes through
the same encryption+compression logic as all other objects.

This should make it a little clearer and provide the necessary
information to understand how the compression works.
Jonas Schäfer 2022-07-29 22:36:57 +02:00
parent 3d94c595b0
commit bd89fb717e
1 changed files with 21 additions and 6 deletions


@@ -121,6 +121,14 @@ partial/uncommitted transaction.
 The size of individual segments is limited to 4 GiB, since the offset of entries
 within segments is stored in a 32-bit unsigned integer in the repository index.
 
+Objects
+~~~~~~~
+
+All objects (the manifest, archives, archive item streams chunks and file data
+chunks) are encrypted and/or compressed. See :ref:`data-encryption` for a
+graphic outlining the anatomy of an object in Borg. The `type` for compression
+is explained in :ref:`data-compression`.
+
 Index, hints and integrity
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -860,6 +868,8 @@ HashIndex is implemented in C and wrapped with Cython in a class-based interface
 The Cython wrapper checks every passed value against these reserved values and
 raises an AssertionError if they are used.
 
+.. _data-encryption:
+
 Encryption
 ----------
 
@@ -961,18 +971,23 @@ key file, wrapped using the standard ``textwrap`` module with a header.
 The header is a single line with a MAGIC string, a space and a hexadecimal
 representation of the repository id.
 
+.. _data-compression:
+
 Compression
 -----------
 
-Borg supports the following compression methods:
+Borg supports the following compression methods, each identified by two bytes:
 
-- none (no compression, pass through data 1:1)
-- lz4 (low compression, but super fast)
+- none (no compression, pass through data 1:1), identified by ``\x00\x00``
+- lz4 (low compression, but super fast), identified by ``\x01\x00``
 - zstd (level 1-22 offering a wide range: level 1 is lower compression and high
-  speed, level 22 is higher compression and lower speed) - since borg 1.1.4
+  speed, level 22 is higher compression and lower speed) - since borg 1.1.4,
+  identified by ``\x03\x00``
 - zlib (level 0-9, level 0 is no compression [but still adding zlib overhead],
-  level 1 is low, level 9 is high compression)
-- lzma (level 0-9, level 0 is low, level 9 is high compression).
+  level 1 is low, level 9 is high compression), identified by a zlib header
+  (``\x.8\x..``)
+- lzma (level 0-9, level 0 is low, level 9 is high compression), identified
+  by ``\x02\x00``.
 
 Speed: none > lz4 > zlib > lzma, lz4 > zstd
 Compression: lzma > zlib > lz4 > none, zstd > lz4
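
For anyone writing an out-of-band decoder like the one mentioned in the commit message, the two-byte identifiers documented in this change could be checked roughly as follows. This is only a sketch, not Borg's actual implementation: the function name ``identify_compression`` and the ``FIXED_IDS`` table are made up for illustration, and it assumes ``payload`` is the already-decrypted object data starting at the compression header. The zlib branch reads ``\x.8\x..`` as "low nibble of the first byte is 8" and additionally applies the standard zlib header checksum (the first two bytes, read as a big-endian 16-bit number, must be divisible by 31).

```python
import zlib

# Fixed two-byte identifiers, as listed in the documentation change above.
FIXED_IDS = {
    b"\x00\x00": "none",
    b"\x01\x00": "lz4",
    b"\x02\x00": "lzma",
    b"\x03\x00": "zstd",
}


def identify_compression(payload: bytes) -> str:
    """Guess the compression method from the first two bytes (hypothetical helper)."""
    if len(payload) < 2:
        raise ValueError("need at least two bytes to identify the compression")
    prefix = payload[:2]
    if prefix in FIXED_IDS:
        return FIXED_IDS[prefix]
    # zlib has no dedicated identifier; its own stream header is used instead.
    cmf, flg = prefix
    is_deflate = (cmf & 0x0F) == 8                 # compression method 8 = deflate
    checksum_ok = (cmf * 256 + flg) % 31 == 0      # zlib header check bits
    if is_deflate and checksum_ok:
        return "zlib"
    raise ValueError(f"unknown compression identifier: {prefix!r}")


if __name__ == "__main__":
    print(identify_compression(b"\x01\x00rest-of-lz4-data"))
    print(identify_compression(zlib.compress(b"hello")))
```

The fixed identifiers are tested first, so the zlib header heuristic only runs when none of them match.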