Note: of course it can only check for orphaned objects, if it has processed all archives in the repo.
Thus this check is skipped as soon as you give --last N option.
The numbers shown in progress indicator are (N,T).
N is the number of the currently checked archive (starts at T as it first checks latest archive).
T is the total number of archives.
Added a indicator character to the left for (A)dded, (M)odified, (U)nchanged status
of regular files. Lowercase indicators are for special files.
You may or may not want to use grep to filter out U and d.
When given, attic does not use the "files" cache. Saves about 240B RAM per file
(that sounds only a little, but consider that backups nowadays are often millions of files).
So try this if attic eats more memory than you have as RAM (usually means paging or
MemoryErrors). Of course, saving memory is not for free. In my one experiment, run time
increased from 3.5 to 23 minutes (my system has enough RAM).
https://www.openssl.org/docs/crypto/EVP_aes_256_cbc.html
EVP_DecryptInit_ex(), EVP_DecryptUpdate() and EVP_DecryptFinal_ex() are the corresponding decryption operations. EVP_DecryptFinal() will return an error code if padding is enabled and the final block is not correctly formatted. The parameters and restrictions are identical to the encryption operations except that if padding is enabled the decrypted data buffer out passed to EVP_DecryptUpdate() should have sufficient room for (inl + cipher_block_size) bytes unless the cipher block size is 1 in which case inl bytes is sufficient.
I doubt this is correct, but let's rather be defensive here.
There were some small issues:
a) it never called EVP_EncryptFinal_ex.
For CTR mode, this had no visible consequences as EVP_EncryptUpdate already yielded all ciphertext.
For cleanliness and to have correctness even in other modes, the missing call was added.
b) decrypt = encrypt hack
This is a nice hack to abbreviate, but it only works for modes without padding and without authentication.
For cleanliness and to have correctness even in other modes, the missing usage of the decrypt api was added.
c) outl == inl assumption
Again, True for CTR mode, but not for padding or authenticating modes.
Fixed so it computes the ciphertext / plaintext length based on api return values.
Other changes:
As encrypt and decrypt API calls are different even for initialization/reset, added a is_encrypt flag.
Defensive output buffer allocation. Added the length of one extra AES block (16bytes) so it would
work even with padding modes. 16bytes are needed because a full block of padding might get
added when the plaintext was a multiple of aes block size.
These changes are based on some experimental code I did for aes-cbc and aes-gcm.
While we likely won't ever want aes-cbc in attic (maybe gcm though?), I think it is cleaner
to not make too many mode specific assumptions and hacks, but just use the API as it
was meant to be used.
Instead of giving all files a fixed block count of 1, this assigns each
deduplicated chunk to a certain file. In effect, the cumulative file
size that is shown in the mountpoint accurately reflects the amount of
actual disk space needed for the repository (barring metadata overhead).
Although the block assignment is done arbitrarily, depending on the
user's access pattern, the sizes will be consistent within the entire
mount point. This facilitates the use of tools like du and ncdu for
inspecting the actual disk usage in a repository as opposed to just
looking at the original, uncompressed, non-deduplicated file sizes.
Without this check, the client is able to call any method of
RepositoryServer and Repository, potentially circumventing
restrict_to_paths or even run arbitrary code.