they somehow pull in some floating point error code that led to a undefined
symbol FPE_... when using the borgbackup wheel on some non-ubuntu/debian linux
platform.
Using a decorator moves the duplicate code in the init methods into a
single decorator method, while still retaining the same runtime overhead
(zero for for the non-OSX path, one extra function call plus the call to
unicodedata.normalize for OSX). The pattern classes are much visually
cleaner, and duplicate code limited to two lines normalizing the pattern
on OSX.
Because the decoration happens at class init time (vs instance init time
for the previous approach), the OSX and non-OSX test cases can no longer
be called in the same run, so I also removed the OSX test case monkey
patching and uncommented the platform skipif decorator.
The OS X file system HFS+ stores path names as Unicode, and converts
them to a variant of Unicode NFD for storage. Because path names will
always be in this canonical form, it's not friendly to require users to
match this form exactly. Convert paths from the repository and patterns
from the command line to NFD before comparing them.
Unix (and Windows, I think) file systems don't convert path names into a
canonical form, so users will continue to have to exactly match the path
name they want, because there could be two paths with the same character
visually that are actually composed of different byte sequences.
the daemonize code changes the cwd, thus a relative repo path can't work.
borg mount repo mnt # did not work
borg mount --foreground repo mnt # did work
borg mount /abs/path/repo mnt # did work
sets the default repository to use, e.g. like:
export BORG_REPO=/mnt/backup/repo
borg init
borg create ::archive
borg list
borg mount :: /mnt
fusermount -u /mnt
borg delete ::archive
added a check that compares the size of the new chunk with the stored size of the
already existing chunk in storage that has the same id_hash value.
raise an exception if there is a size mismatch.
this could happen if:
- the stored size is somehow incorrect (corruption or software bug)
- we found a hash collision for the id_hash (for sha256, this is very unlikely)
the compression was quite cpu intensive and didn't work that great anyway.
now the disk space usage is a bit higher, but it is much faster and less hard on the cpu.
disk space needs grow linearly with the amount and size of the archives, this
is a problem esp. if one has many and/or big archives (but this problem existed
before also because compression was not as effective as I believed).
the tar archive always needed a complete rebuild (and thus: decompression
and recompression) because deleting outdated archive indexes was not
possible in the tar file.
now we just have a directory chunks.archive.d and keep archive index files
there for all archives we already know.
if an archive does not exist any more in the repo, we just delete its index file.
if an archive is unknown still, we fetch the infos and build a new index file.
when merging, we avoid growing the hash table from zero, but just start
with the first archive's index as basis for merging.
also remove the comment about how good xz compresses - while that was true for smaller index files,
it seems to be less effective with bigger ones. maybe just an issue with compression dict size.
outdated - it just showed different levels of zlib compression,
but not we additionally have "lzma", "lz4" and "none" compression.
the "usage" and "internals" docs give some hints about them, too.
This fixes a infrequent problem when (refcount * chunksize) overflowed a int32_t.
chunksize is always <= 8MiB and usually rather ~64KiB (with default chunker params).
Thus, this happened only for high refcounts and/or unusually big chunks.