document archive limitation, #1452

Thomas Waldmann 2016-08-12 17:54:15 +02:00
parent a951d23d27
commit c834b2969c
2 changed files with 41 additions and 2 deletions

@ -62,6 +62,17 @@ Which file types, attributes, etc. are *not* preserved?
holes in a sparse file. holes in a sparse file.
* filesystem specific attributes, like ext4 immutable bit, see :issue:`618`. * filesystem specific attributes, like ext4 immutable bit, see :issue:`618`.

Are there other known limitations?
----------------------------------

- A single archive can only reference a limited volume of file/dir metadata,
  usually corresponding to tens or hundreds of millions of files/dirs.
  When trying to go beyond that limit, you will get a fatal IntegrityError
  exception stating that the (archive) object is too big.
  An easy workaround is to create multiple archives with fewer items each.
  See also the :ref:`archive_limitation` and :issue:`1452`.

Why is my backup bigger than with attic? Why doesn't |project_name| do compression by default? Why is my backup bigger than with attic? Why doesn't |project_name| do compression by default?
---------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------


@ -160,12 +160,40 @@ object that contains:
* version * version
* name * name
* list of chunks containing item metadata * list of chunks containing item metadata (size: count * ~40B)
* cmdline * cmdline
* hostname * hostname
* username * username
* time * time

.. _archive_limitation:

Note about archive limitations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The archive is currently stored as a single object in the repository
and thus limited in size to MAX_OBJECT_SIZE (20MiB).

As one chunk list entry is ~40B, that means we can reference ~500,000 item
metadata stream chunks per archive.

Each item metadata stream chunk is ~128kiB (see hardcoded ITEMS_CHUNKER_PARAMS).

So the whole item metadata stream is limited to ~64GiB
(~500,000 chunks of ~128kiB each).
If compression is used, the amount of storable metadata is bigger - by the
compression factor.

If the average size of an item entry is 100B (small files, no ACLs/xattrs),
that means a limit of ~640 million files/directories per archive.

If the average size of an item entry is 2kB (files of ~100MB or larger, or
more ACLs/xattrs), the limit will be ~32 million files/directories per archive.

If one tries to create an archive object bigger than MAX_OBJECT_SIZE, a fatal
IntegrityError will be raised.

A workaround is to create multiple archives with fewer items each, see
also :issue:`1452`.

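
The estimate above can be reproduced with a short calculation. The following is only a sketch: the constants are the approximate figures quoted in this section (20MiB, ~40B, ~128kiB), not values read from the source code, and the function name ``max_items`` and its parameters are made up for illustration. Because it uses binary units throughout, it gives slightly larger numbers than the rounded ~640 million and ~32 million quoted above, but the same order of magnitude.

```python
# Back-of-the-envelope estimate of how many items fit into one archive.
# All constants are the approximate figures from the text, not values
# imported from the code base.

MAX_OBJECT_SIZE = 20 * 1024 ** 2   # 20MiB limit on a single repository object
CHUNK_LIST_ENTRY_SIZE = 40         # ~40B per chunk list entry (approximation)
ITEMS_CHUNK_SIZE = 128 * 1024      # ~128kiB per item metadata stream chunk

# Number of item metadata stream chunks the archive object can reference.
MAX_CHUNKS = MAX_OBJECT_SIZE // CHUNK_LIST_ENTRY_SIZE


def max_items(avg_item_size, compression_factor=1.0):
    """Estimate the item limit for a given average item entry size in bytes.

    compression_factor is a hypothetical knob: 2.0 means the metadata
    stream compresses to half its size, doubling what can be stored.
    """
    max_stream_bytes = MAX_CHUNKS * ITEMS_CHUNK_SIZE * compression_factor
    return int(max_stream_bytes // avg_item_size)


if __name__ == "__main__":
    print("referenceable chunks:", MAX_CHUNKS)
    print("items at 100B each:  ", max_items(100))
    print("items at 2kB each:   ", max_items(2000))
```

Passing a ``compression_factor`` above 1.0 shows how compression raises the limit proportionally, as noted in the text.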
The Item The Item
-------- --------
@ -174,7 +202,7 @@ Each item represents a file, directory or other fs item and is stored as an
``item`` dictionary that contains: ``item`` dictionary that contains:
* path * path
* list of data chunks * list of data chunks (size: count * ~40B)
* user * user
* group * group
* uid * uid