it's pretty useless to have .borg as a directory extension, especially
since there is a README in there stating that this is a borg repo.
consistency:
"backup" is always used as relative backup repository path
"/mnt/backup" is always used as absolute repository path
use borg instead of attic, except in the places where attic was used:
- as toplevel package name, directory name, file name
- to refer to original attic
remove sphinx upload make command, will be replaced by github.io site later
remove references to binary downloads and linux packages for now
remove some software name references, fix grammar
use borgbackup rather than borg-backup (or borg) in URLs,
fewer name collision issues, better search results, no validity issues with "-"
Implemented sparse file support to remove this blocker for people backing up lots of
huge sparse files (like VM images). Attic could not support this use case so far, as it would
have restored all files at their fully expanded size, possibly running out of disk space if
the total expanded size exceeded the available space.
Please note that this is a very simple implementation of sparse file support - at backup time,
it does not do anything special (it just reads all these zero bytes, chunks, compresses and
encrypts them as usual). At restore time, it detects chunks that are completely filled with zeros
and does a seek on the output file rather than a normal data write, so it creates a hole in
a sparse file. The chunk size for these all-zero chunks is currently 10MiB, so it'll create holes
of multiples of that size (this also depends somewhat on fs block size, alignment, and previously written data).
Special cases like sparse files starting and/or ending with a hole are supported.
Please note that it will currently always create sparse files at restore time if it detects all-zero
chunks.
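The restore-time behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function names write_sparse and restore_demo are hypothetical, and the small chunk sizes are chosen only to keep the example short. Whether the filesystem really allocates holes depends on the platform.

```python
import os
import tempfile


def write_sparse(fd, chunks):
    """Write chunks to fd, seeking over all-zero chunks instead of writing them."""
    for data in chunks:
        if data == bytes(len(data)):
            # all-zero chunk: move the file position forward, leaving a hole
            fd.seek(len(data), os.SEEK_CUR)
        else:
            fd.write(data)
    # a file ending with a hole needs its size set explicitly,
    # because a trailing seek alone does not extend the file
    fd.truncate()


def restore_demo():
    """Restore hole + data + hole and check the content reads back unchanged."""
    chunks = [bytes(4096), b"payload", bytes(4096)]
    with tempfile.TemporaryFile() as fd:
        write_sparse(fd, chunks)
        fd.seek(0)
        return fd.read() == b"".join(chunks)
```

The final truncate() is what covers the "file ends with a hole" special case; a leading hole needs nothing extra, since the first real write simply lands at the seeked-to offset.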
Also improved:
I needed a constant for the maximum chunk size, so I introduced CHUNK_MAX (see also the
existing CHUNK_MIN). Its value is the same as the chunk buffer size.
Attic still always uses a 10MiB chunk buffer, but this can now be changed more easily.
datetime.isoformat() has different output depending on whether
microseconds are zero or not. Add test cases to ensure we handle both
cases correctly in an archive.
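To illustrate the two output shapes, here is a small sketch of a parser that accepts both. The helper name parse_iso is hypothetical and not taken from the code base; it only shows one way to handle timestamps with and without a microseconds part.

```python
from datetime import datetime

# isoformat() omits the fractional part when microsecond == 0
ISO_FORMATS = ("%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S")


def parse_iso(ts):
    """Parse a naive datetime.isoformat() string, with or without microseconds."""
    for fmt in ISO_FORMATS:
        try:
            return datetime.strptime(ts, fmt)
        except ValueError:
            continue
    raise ValueError("unsupported timestamp format: %r" % ts)


with_us = datetime(2015, 3, 1, 12, 30, 45, 123)
without_us = datetime(2015, 3, 1, 12, 30, 45)
```

Round-tripping both with_us and without_us through isoformat() and parse_iso() should give back the original values.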
There were some small issues:
a) it never called EVP_EncryptFinal_ex.
For CTR mode, this had no visible consequences as EVP_EncryptUpdate already yielded all ciphertext.
For cleanliness and to have correctness even in other modes, the missing call was added.
b) decrypt = encrypt hack
This is a neat shortcut, but it only works for modes without padding and without authentication.
For cleanliness and to have correctness even in other modes, decryption now uses the decrypt API.
c) outl == inl assumption
Again, true for CTR mode, but not for padding or authenticating modes.
Fixed so it computes the ciphertext / plaintext length based on API return values.
Other changes:
As encrypt and decrypt API calls are different even for initialization/reset, added an is_encrypt flag.
Defensive output buffer allocation. Added the length of one extra AES block (16 bytes) so it would
work even with padding modes. 16 bytes are needed because a full block of padding might get
added when the plaintext length is a multiple of the AES block size.
These changes are based on some experimental code I wrote for aes-cbc and aes-gcm.
While we likely won't ever want aes-cbc in attic (maybe gcm though?), I think it is cleaner
to not make too many mode specific assumptions and hacks, but just use the API as it
was meant to be used.
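The defensive buffer sizing mentioned above can be checked with a bit of arithmetic. The sketch below is plain Python, not the Cython wrapper itself, and the function names are hypothetical. It assumes PKCS#7-style padding, which always appends between 1 and 16 bytes for a 16-byte block cipher.

```python
AES_BLOCK_SIZE = 16  # bytes


def padded_len(plaintext_len):
    """Ciphertext length for a padding mode: always rounds up to the next full block."""
    return (plaintext_len // AES_BLOCK_SIZE + 1) * AES_BLOCK_SIZE


def output_buffer_len(plaintext_len):
    """Defensive allocation: input length plus one extra AES block."""
    return plaintext_len + AES_BLOCK_SIZE
```

For a 32-byte plaintext, a padding mode produces 48 bytes (a full extra block of padding), which is exactly what the defensive allocation provides. For CTR mode the extra block simply stays unused.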
Without this check, the client is able to call any method of
RepositoryServer and Repository, potentially circumventing
restrict_to_paths or even running arbitrary code.
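A check of this kind can be sketched as an explicit whitelist of callable method names. Everything in the example is hypothetical (DemoServer, rpc_methods, InvalidRPCMethod, dispatch) and only illustrates the idea of rejecting any method name that is not explicitly exposed.

```python
class InvalidRPCMethod(Exception):
    """Raised when a client requests a method that is not whitelisted."""


class DemoServer:
    # only these names may be called remotely
    rpc_methods = ("get", "put")

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def put(self, key, value):
        self._store[key] = value
        return True

    def delete_everything(self):
        # internal helper, deliberately not listed in rpc_methods
        self._store.clear()

    def dispatch(self, method, args):
        """Call a whitelisted method by name; refuse everything else."""
        if method not in self.rpc_methods:
            raise InvalidRPCMethod(method)
        return getattr(self, method)(*args)
```

Checking the name before getattr() matters: without it, a client could reach internal helpers or special methods such as __init__ just by naming them.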
This normalizes the file names in the dot directory when specified explicitly,
along with exclude/include patterns.
This fixes several mismatches when including relative paths that involve the
current directory.
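The kind of mismatch this addresses can be shown with os.path.normpath. The sketch below is an assumption about the general approach, not the actual matching code; matches_prefix is a hypothetical helper, and it does not treat a bare "." pattern specially.

```python
import os.path


def matches_prefix(path, pattern):
    """Compare a path against a path-prefix pattern after normalizing both."""
    path = os.path.normpath(path)
    pattern = os.path.normpath(pattern)
    # match the directory itself or anything below it, but not "srcfoo" for "src"
    return path == pattern or path.startswith(pattern + os.sep)
```

Without normalization, "./src/main.py" and "src/main.py" would compare as different strings even though they name the same file.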