the compression was quite cpu intensive and didn't work that great anyway.
now the disk space usage is a bit higher, but it is much faster and less hard on the cpu.
disk space needs grow linearly with the number and size of the archives. this
is a problem esp. if one has many and/or big archives (but this problem also
existed before, because compression was not as effective as I believed).
the tar archive always needed a complete rebuild (and thus: decompression
and recompression) because deleting outdated archive indexes was not
possible in the tar file.
now we just have a directory chunks.archive.d and keep archive index files
there for all archives we already know.
if an archive does not exist any more in the repo, we just delete its index file.
if an archive is still unknown, we fetch its info and build a new index file.
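a minimal sketch of that sync step, not the actual implementation: sync_index_dir, repo_archive_ids and build_index are hypothetical names, and one plain file per archive stands in for the real index files in chunks.archive.d.

```python
import os


def sync_index_dir(index_dir, repo_archive_ids, build_index):
    """Bring index_dir in line with the archives currently in the repo.

    build_index is a hypothetical callable: archive id -> index file bytes.
    """
    os.makedirs(index_dir, exist_ok=True)
    cached = set(os.listdir(index_dir))
    wanted = set(repo_archive_ids)
    # archives that vanished from the repo: just delete their index files
    for name in cached - wanted:
        os.unlink(os.path.join(index_dir, name))
    # archives we do not know yet: fetch the info and build a new index file
    for name in wanted - cached:
        with open(os.path.join(index_dir, name), "wb") as f:
            f.write(build_index(name))
    return sorted(os.listdir(index_dir))
```

index files of archives that are already known are not touched, which is the point of keeping them as separate files instead of one tar archive.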
when merging, we avoid growing the hash table from zero and instead start
with the first archive's index as the basis for merging.
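the merge idea as a rough sketch: a plain dict of chunk id -> refcount stands in for the real hash table (which stores more than a refcount), and merge_indexes is a hypothetical name.

```python
def merge_indexes(archive_indexes):
    """Merge per-archive {chunk_id: refcount} dicts into one index."""
    it = iter(archive_indexes)
    try:
        # start from a copy of the first archive's index instead of an
        # empty table that would have to grow step by step
        merged = dict(next(it))
    except StopIteration:
        return {}
    for index in it:
        for chunk_id, refcount in index.items():
            merged[chunk_id] = merged.get(chunk_id, 0) + refcount
    return merged
```

with a real hash table, starting from the first index means the table already has a realistic size before the remaining archives are merged in.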
also remove the comment about how good xz compresses - while that was true for smaller index files,
it seems to be less effective with bigger ones. maybe just an issue with compression dict size.
outdated - it just showed different levels of zlib compression,
but now we additionally have "lzma", "lz4" and "none" compression.
the "usage" and "internals" docs give some hints about them, too.
This fixes an infrequent problem when (refcount * chunksize) overflowed an int32_t.
chunksize is always <= 8MiB and usually rather ~64KiB (with default chunker params).
Thus, this happened only for high refcounts and/or unusually big chunks.
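The arithmetic can be checked with a small sketch. The helper names are hypothetical, and wrap_int32 only models the usual two's complement wraparound (signed overflow in C is formally undefined behaviour, so this is what typically happens, not a guarantee).

```python
INT32_MAX = 2**31 - 1
KiB = 1024
MiB = 1024 * KiB


def wrap_int32(value):
    """Reduce value to the range of a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT32_MAX else value


def min_overflowing_refcount(chunksize):
    """Smallest refcount for which refcount * chunksize exceeds INT32_MAX."""
    return INT32_MAX // chunksize + 1


# an 8MiB chunk overflows at refcount 256, a 64KiB chunk at refcount 32768
print(min_overflowing_refcount(8 * MiB), min_overflowing_refcount(64 * KiB))
# the wrapped product at that point is negative
print(wrap_int32(256 * 8 * MiB))
```

So with typical ~64KiB chunks a chunk has to be referenced tens of thousands of times before the product no longer fits.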
e.g.:
- setting any security.* key is expected to fail with EACCES if one is not root.
- issue #162 on our issue tracker: user was root, but due to some specific scenario
involving docker and selinux, setting the security.selinux key fails even when running as root.
not sure if it is the best solution to silently ignore this, but some lines below this change
failure to do a chown is also silently ignored (happens e.g. when restoring a file not owned
by the current user as a non-root user).
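a sketch of the "ignore EACCES" behaviour, under assumptions: set_xattrs_best_effort is a hypothetical helper, the setter is passed in as a callable so the example does not depend on a specific xattr module, and returning the skipped keys is only for illustration (the change described above ignores the failure silently).

```python
import errno


def set_xattrs_best_effort(path, xattrs, setxattr):
    """Call setxattr(path, key, value) per key; skip keys refused with EACCES."""
    skipped = []
    for key, value in xattrs.items():
        try:
            setxattr(path, key, value)
        except OSError as e:
            if e.errno != errno.EACCES:
                # anything else (e.g. no space left) is still a real error
                raise
            skipped.append(key)
    return skipped
```

on Linux, os.setxattr accepts (path, attribute, value) and could be passed as the setter.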
if we use {} as default for item.get(), we do not need the "if" as iteration over an empty dict won't do anything.
also fixes too deep indentation the original code had.
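for illustration, with a hypothetical "xattrs" key and a plain dict as item (the commit does not name the key):

```python
def collect_xattrs(item):
    """Return the (key, value) pairs stored under item["xattrs"], if any."""
    pairs = []
    # a missing "xattrs" entry yields {}, so the loop body simply never runs
    # and no surrounding "if" is needed
    for key, value in item.get("xattrs", {}).items():
        pairs.append((key, value))
    return pairs
```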
the parser for the --chunker-params argument had a wrong parameter order.
fixed the order so it conforms to the help text and the docs.
also added some tests for it and a message for the ValueError exception.
currently, we only use sha256 hashes as key, so key length is always 32.
but instead of hardcoding 32 everywhere, using key_length is more
readable and also more flexible for the future.
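a small sketch of the idea: split_keys is a hypothetical helper, and deriving key_length from hashlib is an assumption made for the example, not necessarily how the code defines it.

```python
import hashlib

# sha256 digests are 32 bytes long, so this is 32 today
key_length = hashlib.sha256().digest_size


def split_keys(buf, key_length=key_length):
    """Split a buffer of concatenated keys into a list of keys."""
    if len(buf) % key_length:
        raise ValueError("buffer length is not a multiple of key_length")
    return [buf[i:i + key_length] for i in range(0, len(buf), key_length)]
```

a different hash would only need a different key_length, the slicing code stays the same.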
borg list --short just spills out the list of files / dirs - better for some tests
and also useful on the commandline for interactive use.
the tests previously needed fakeroot because in the test setup it always
made calls to mknod and chown, which require (fake)root.
now, the test setup adapts to whether it detects (fake)root or not - to run
the tests completely, you still need fakeroot, but the archiver tests won't
all fail just due to failing test setup.
also, a test not working correctly due to fakeroot was found:
it should detect whether a read-only repo is usable, but it failed to do that
because with (fake)root, there is no "read only" (at least not via taking away
the w permission bits).
environment context manager: if an env var was not present before, it should not be present afterwards
teardown: cd out of the tmpdir before deleting it
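both points as a minimal sketch. environment_variable and remove_tmpdir are names chosen for this example and may not match the helpers in the test suite.

```python
import os
import shutil
from contextlib import contextmanager


@contextmanager
def environment_variable(**values):
    """Temporarily set env vars and restore the previous state on exit."""
    old = {name: os.environ.get(name) for name in values}
    try:
        for name, value in values.items():
            os.environ[name] = value
        yield
    finally:
        for name, value in old.items():
            if value is None:
                # was not present before, so it must not be present afterwards
                os.environ.pop(name, None)
            else:
                os.environ[name] = value


def remove_tmpdir(tmpdir, old_cwd):
    """Leave tmpdir first, then delete it."""
    os.chdir(old_cwd)
    shutil.rmtree(tmpdir)
```

deleting the current working directory fails on some platforms and leaves the process in a dangling cwd on others, hence the chdir first.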
I re-wrote lrucache (and it seems like no-one had looked at it much
before :). I was told my test function would have been simpler in
native py.test, so let's have a go converting it all.
We can avoid any reference to unittest, because lrucache doesn't write
files so it doesn't need any of our custom assertion helpers.
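To show what the native py.test style looks like, here is a sketch with a stand-in cache. TinyLRU is a hypothetical minimal class written for this example, not the project's lrucache, and its API is an assumption.

```python
from collections import OrderedDict


class TinyLRU:
    """Stand-in LRU cache for the example (not the real lrucache)."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._items = OrderedDict()

    def __setitem__(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        # evict least recently used entries until we fit again
        while len(self._items) > self._capacity:
            self._items.popitem(last=False)

    def __getitem__(self, key):
        value = self._items[key]
        self._items.move_to_end(key)  # a read counts as a use
        return value

    def __contains__(self, key):
        return key in self._items

    def __len__(self):
        return len(self._items)


# py.test style: a plain function and bare asserts,
# no TestCase class and no self.assert_* helpers
def test_evicts_least_recently_used():
    cache = TinyLRU(2)
    cache["a"] = 1
    cache["b"] = 2
    assert cache["a"] == 1  # touching "a" makes "b" the oldest entry
    cache["c"] = 3
    assert "b" not in cache
    assert "a" in cache and "c" in cache
    assert len(cache) == 2
```

py.test collects functions named test_* and rewrites the bare asserts so that failures still show the compared values.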