note:
- we call this frequently AFTER re-filling the chunker buffer,
so even big input files have little cache impact.
- there is still some cache impact due to output files caching,
if the repository is on a locally mounted filesystem.
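The note above does not name the call it refers to. As a rough sketch, assuming it is `posix_fadvise` with `POSIX_FADV_DONTNEED` (an assumption, not stated in the text), the pattern of advising the kernel after each buffer refill could look like this. The function name and buffer size are hypothetical.

```python
import os


def read_chunks_dropping_cache(path, bufsize=1024 * 1024):
    """Yield the file's data buffer by buffer.

    After each refill, advise the kernel that the pages read so far
    are no longer needed (assumption: this is the call the note means).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            data = os.read(fd, bufsize)
            if not data:
                break
            offset += len(data)
            # os.posix_fadvise only exists on some Unix platforms
            if hasattr(os, "posix_fadvise"):
                # range [0, offset): everything read so far
                os.posix_fadvise(fd, 0, offset, os.POSIX_FADV_DONTNEED)
            yield data
    finally:
        os.close(fd)
```

Because the advice is given after every refill, only a small window of the input file is likely to stay in the page cache at any time.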
this saves some back-and-forth between C and Python code and also some memory
management overhead, as we can always reuse the same read_buf instead of letting
Python allocate and free a buffer of up to 10MB for each buffer-filling read.
we can't use OS-level file descriptors all the time though, as chunkify also gets invoked
on objects like BytesIO that are not backed by an OS-level file.
Note: this changeset is also a preparation for O_DIRECT support, which can be
implemented much more easily at the C level.
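The real change lives in C, so the following is only a Python-level sketch of the same idea: read into one reusable buffer through the file descriptor when there is one, and fall back to the object's own read method otherwise. `fill_buffer` is a hypothetical helper name, and the use of `os.readv` is an illustrative choice.

```python
import io
import os


def fill_buffer(fobj, read_buf):
    """Fill the reusable read_buf from fobj, return the number of bytes read."""
    try:
        fd = fobj.fileno()
    except (AttributeError, OSError):
        # BytesIO raises io.UnsupportedOperation (a subclass of OSError)
        fd = None
    if fd is not None and hasattr(os, "readv"):
        # read straight into the preallocated buffer, no new bytes object
        return os.readv(fd, [read_buf])
    # no OS-level file behind this object: use the Python-level API
    return fobj.readinto(read_buf)
```

Usage: allocate `read_buf = bytearray(size)` once and pass it to every call. The fd path assumes the file object has no pending Python-level buffering (for example, it was opened with `buffering=0`).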
shows original, compressed and deduped size plus path name.
output is 79 chars wide, so an 80x24 terminal does not wrap/scroll.
long path names are shortened (in a rather simplistic way).
output happens when a new item is started, but not more often than 5/s
(thus, not every pathname is shown)
at the end, the output line is cleared but not scrolled, so it basically vanishes.
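A minimal sketch of such a status line, under stated assumptions: the field layout (`O`/`C`/`D` labels), the middle-ellipsis shortening, and the names `shorten_path`, `format_status` and `StatusPrinter` are all hypothetical. Only the 79-column width, the 5/s limit and the clear-without-scroll behaviour come from the description above.

```python
import time


def shorten_path(path, width):
    """Simplistic shortening: keep head and tail, put '...' in the middle."""
    if len(path) <= width:
        return path
    if width <= 3:
        return path[:width]
    half = (width - 3) // 2
    tail = width - 3 - half
    return path[:half] + "..." + path[len(path) - tail:]


def format_status(original, compressed, deduped, path, columns=79):
    """Build a status line that is exactly `columns` characters wide."""
    stats = "%d O %d C %d D " % (original, compressed, deduped)
    room = max(columns - len(stats), 0)
    line = stats + shorten_path(path, room)
    return line.ljust(columns)[:columns]


class StatusPrinter:
    """Print status lines, but not more often than once per min_interval."""

    def __init__(self, min_interval=0.2, clock=time.monotonic):
        self.min_interval = min_interval  # 0.2 s corresponds to 5/s
        self.clock = clock
        self.last = None

    def show(self, line, stream):
        now = self.clock()
        if self.last is not None and now - self.last < self.min_interval:
            return False  # too soon, this item is not shown
        self.last = now
        stream.write(line + "\r")  # carriage return: overwrite, don't scroll
        stream.flush()
        return True

    def clear(self, stream, columns=79):
        stream.write(" " * columns + "\r")
        stream.flush()
```

Ending each line with `\r` instead of `\n` is what lets the next status overwrite the previous one, and lets the final `clear` make the line vanish.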
process_item was used only for dirs and fifo, replaced it by process_dir and process_fifo,
so the status can be generated there (as it is done for the other item types).
before this changeset, most information about exceptions/tracebacks
on the remote side were lost. now they are transmitted and displayed,
together with the remote attic version.
don't catch "Exception" when OSError was meant (otherwise e.errno is not there anyway)
don't use bare "except:" if one can avoid it (copied code fragment from similar handler)
added "nonlocal euid" - without this, euid just gets redefined in inner scope instead of assigned to outer scope
added check for euid 0 - if we run as root, we always have permissions (not just if we are file owner)
note: due to caching and OS behaviour on linux, the bug was a bit tricky to reproduce
and the fix was also a bit tricky to test.
one needs the strictatime mount option to enforce traditional atime updating.
for repeated tests, always change the file contents (e.g. from /dev/urandom), or attic's caching
will prevent the file from being read ("accessed") again.
check atimes with ls -lu
i could reproduce that the code was broken and confirm that it is fixed with this changeset. also, root now doesn't touch any atimes.
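The fixes listed above can be illustrated with a small sketch. It assumes the affected code opens files with `O_NOATIME` and falls back to a plain open (the flag is not named in the text, so treat that as an assumption). `make_opener`, `may_use_noatime` and `open_file` are hypothetical names, and the code targets Unix only.

```python
import errno
import os


def make_opener(geteuid=os.geteuid):
    euid = None  # looked up lazily, then cached in the outer scope

    flags_normal = os.O_RDONLY
    # O_NOATIME is Linux-specific; fall back to 0 where it is missing
    flags_noatime = flags_normal | getattr(os, "O_NOATIME", 0)

    def may_use_noatime(st):
        # without "nonlocal", the assignment below would create a new
        # local euid and the outer one would stay None forever
        nonlocal euid
        if euid is None:
            euid = geteuid()
        # root always has permission, not just the file owner
        return euid == 0 or st.st_uid == euid

    def open_file(path):
        st = os.stat(path)
        flags = flags_noatime if may_use_noatime(st) else flags_normal
        try:
            return os.open(path, flags)
        except OSError as e:  # OSError, not Exception: we need e.errno
            if e.errno != errno.EPERM:
                raise
            return os.open(path, flags_normal)

    return open_file, may_use_noatime
```

Catching `OSError` rather than `Exception` keeps unrelated errors visible and guarantees that `e.errno` exists in the handler.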
Listing repositories with lots of archives on low-memory systems would cause attic to run out of memory due to items_buffer and chunker being created for each visited archive.
See https://github.com/jborg/attic/issues/163
this way, serve() is more consistent with the other code, which always uses os.read/write (not sys.std*.buffer.read/write).
also: reduce code duplication a bit.
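For illustration only, here is what fd-based I/O with `os.read`/`os.write` can look like. `write_all` and `echo_once` are hypothetical helpers, not the actual serve() code; the point is that `os.write` may write fewer bytes than requested, so a loop is needed.

```python
import os


def write_all(fd, data):
    """Write all of data to fd, retrying after partial writes."""
    view = memoryview(data)
    while view:
        n = os.write(fd, view)
        view = view[n:]


def echo_once(in_fd, out_fd, bufsize=4096):
    """Read one block from in_fd and copy it to out_fd; return its size."""
    data = os.read(in_fd, bufsize)
    if data:
        write_all(out_fd, data)
    return len(data)
```

Working on raw file descriptors avoids mixing buffered `sys.std*.buffer` objects with unbuffered reads and writes on the same stream.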
Note: of course it can only check for orphaned objects if it has processed all archives in the repo.
Thus this check is skipped whenever you give the --last N option.
The numbers shown in the progress indicator are (N,T).
N is the number of the currently checked archive (starts at T, as it first checks the latest archive).
T is the total number of archives.
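The counting scheme can be sketched as follows, assuming the archives are held in a list sorted from oldest to newest. `check_order` and `can_check_orphans` are hypothetical names used only for this illustration.

```python
def check_order(archives, last=None):
    """Yield (n, t, archive) tuples, newest archive first.

    archives: list sorted from oldest to newest (assumption).
    last: value of the --last N option, or None if not given.
    """
    total = len(archives)
    count = total if last is None else min(last, total)
    for i in range(count):
        n = total - i  # starts at T and counts down
        yield n, total, archives[n - 1]


def can_check_orphans(last=None):
    """Orphan detection needs every archive, so --last disables it."""
    return last is None
```

For three archives `a`, `b`, `c`, the progress indicator would show (3,3), then (2,3), then (1,3); with `--last 2` it stops after (2,3) and the orphan check is skipped.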