Archive timestamps are stored as the output of datetime.isoformat().
This function omits microseconds in the string output if the
microseconds are zero (as documented and explained at
https://bugs.python.org/issue7342).
Parsing of timestamps assumes there are always microseconds present
after a decimal point. This is not always true. Handle this case where
it is not true by explicitly using '0' microseconds when not present.
This commit fixes #282
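A minimal sketch of the parsing approach described above, assuming timestamps are naive datetimes without a UTC offset. The helper name parse_timestamp is hypothetical and is not taken from the attic source:

```python
from datetime import datetime


def parse_timestamp(timestamp):
    """Parse datetime.isoformat() output, with or without microseconds.

    Hypothetical helper for illustration only.
    """
    if '.' in timestamp:
        # Non-zero microseconds: isoformat() emitted a fractional part.
        return datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%S.%f')
    # Zero microseconds: isoformat() omitted the fractional part entirely,
    # so strptime leaves the microsecond field at its default of 0.
    return datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%S')
```

Both datetime(2015, 1, 2, 3, 4, 5).isoformat() and datetime(2015, 1, 2, 3, 4, 5, 6).isoformat() round-trip through this helper.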
datetime.isoformat() has different output depending on whether
microseconds are zero or not. Add test cases to ensure we handle both
cases correctly in an archive.
fewer calls to posix_fadvise (which seem to force a write-cache sync-to-disk and
wait for that to complete) - if we call it after we synced anyway, we don't lose time.
also: fixed a bug in the os.fsync call, it needs the fileno.
note:
- we call this frequently AFTER re-filling the chunker buffer,
so even big input files have little cache impact.
- there is still some cache impact due to output files caching,
if the repository is on a locally mounted filesystem.
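the ordering described above can be sketched roughly like this. it is an illustration only, not the attic code: the function name sync_and_drop_cache is hypothetical, and the sketch assumes a regular file on a platform where os.posix_fadvise is available (it is skipped otherwise).

```python
import os


def sync_and_drop_cache(fileobj):
    """Flush and fsync a file, then hint that its pages can leave the cache.

    Hypothetical helper; assumes fileobj is backed by a regular file.
    """
    fileobj.flush()                 # push Python-level buffers to the OS
    fileno = fileobj.fileno()
    os.fsync(fileno)                # os.fsync needs the fileno, not the file object
    if hasattr(os, 'posix_fadvise'):
        # Called only after the sync, so any implied write-back is already done.
        os.posix_fadvise(fileno, 0, 0, os.POSIX_FADV_DONTNEED)
```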
this saves some back-and-forth between C and Python code and also some memory
management overhead, as we can always reuse the same read_buf instead of letting
Python allocate and free a buffer of up to 10MB for each buffer-filling read.
we can't use os-level file descriptors all the time though, as chunkify also gets invoked
on objects like BytesIO that are not backed by an os-level file.
Note: this changeset is also a preparation for O_DIRECT support, which can be
implemented a lot more easily at the C level.
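the actual change lives in the C chunker. as a Python-level analogy only, the two ideas (detect whether an os-level file descriptor exists, and reuse one read buffer) could look like the sketch below. os_level_fd and read_blocks are hypothetical names, and the 64KiB default is an arbitrary choice for the example.

```python
import io


def os_level_fd(fileobj):
    """Return the OS-level file descriptor, or None if there is none."""
    try:
        return fileobj.fileno()
    except (AttributeError, OSError):
        # io.BytesIO raises io.UnsupportedOperation, a subclass of OSError.
        return None


def read_blocks(fileobj, buf_size=64 * 1024):
    """Yield the file's contents block by block, reusing one read buffer."""
    buf = bytearray(buf_size)       # allocated once, refilled on every read
    view = memoryview(buf)
    while True:
        n = fileobj.readinto(view)  # fills the existing buffer in place
        if not n:
            break
        yield bytes(view[:n])       # copy out only the bytes actually read
```

readinto works both for real binary files and for io.BytesIO, so the same loop covers objects without an os-level file.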
sure it is "prettier" without, but a lot of useful information for debugging is lost if the traceback is not shown.
even for KeyboardInterrupt:
there may be a bad reason why one has to use Ctrl-C - if attic was stuck somewhere, we want to know where it was.
shows original, compressed and deduped size plus path name.
output is 79 chars wide, so 80x24 terminal does not wrap/scroll.
long path names are shortened (in a rather simplistic way).
output happens when a new item is started, but not more often than 5/s
(thus, not every pathname is shown)
at the end, the output line is cleared but not scrolled, so it basically vanishes.
process_item was used only for dirs and fifo, replaced it by process_dir and process_fifo,
so the status can be generated there (as it is done for the other item types).
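a rough sketch of the status line behaviour described above (fixed 79-char width, simplistic path shortening, at most 5 updates per second). all names and the exact line layout here are hypothetical and not copied from attic.

```python
import time


def shorten_path(path, width):
    """Shorten a path to at most `width` chars (assumes width >= 4)."""
    if len(path) <= width:
        return path
    keep = width - 3                # room left after the '...' marker
    head = keep // 2
    tail = keep - head
    return path[:head] + '...' + path[-tail:]


def format_status(original, compressed, deduped, path, columns=79):
    """Build a fixed-width status line: three sizes, then the path."""
    stats = '{} O {} C {} D '.format(original, compressed, deduped)
    line = stats + shorten_path(path, columns - len(stats))
    return line.ljust(columns)      # pad so a shorter line overwrites a longer one


class StatusThrottle:
    """Allow an update only if min_interval seconds passed since the last one."""

    def __init__(self, min_interval=0.2, clock=time.monotonic):
        self.min_interval = min_interval    # 0.2s corresponds to 5 updates/s
        self.clock = clock
        self.last = None

    def ready(self):
        now = self.clock()
        if self.last is not None and now - self.last < self.min_interval:
            return False
        self.last = now
        return True
```

a caller would print format_status(...) followed by '\r' whenever ready() returns True, and print a blank 79-char line plus '\r' at the end so the status line vanishes without scrolling.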
before this changeset, most information about exceptions/tracebacks
on the remote side was lost. now they are transmitted and displayed,
together with the remote attic version.
don't catch "Exception" when OSError was meant (otherwise e.errno is not there anyway)
don't use bare "except:" if one can avoid it (copied code fragment from similar handler)
added "nonlocal euid" - without this, euid just gets redefined in inner scope instead of assigned to outer scope
added check for euid 0 - if we run as root, we always have permissions (not just if we are file owner)
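the nonlocal and euid 0 points can be illustrated with a small closure. this is a sketch, not the attic code: make_permission_check and the injected get_euid callable are hypothetical, the latter standing in for os.geteuid so the example can be exercised without being root.

```python
def make_permission_check(get_euid):
    """Return a function telling whether we may act on a file owned by st_uid."""
    euid = None     # cached effective uid, filled in lazily

    def has_permission(st_uid):
        # Without this declaration, the assignment below would make euid a
        # local of has_permission and the outer cache would never be set.
        nonlocal euid
        if euid is None:
            euid = get_euid()
        # root (euid 0) always has permission, not only the file owner
        return euid == 0 or euid == st_uid

    return has_permission
```

in real use one would pass os.geteuid as get_euid and compare against os.stat(path).st_uid.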
note: due to caching and OS behaviour on linux, the bug was a bit tricky to reproduce
and also the fix was a bit tricky to test.
one needs the strictatime mount option to enforce traditional atime updating.
for repeated tests, always change file contents (e.g. from /dev/urandom) or attic's caching
will prevent the file from being read ("accessed") again.
check atimes with ls -lu
i could reproduce that the code was broken and is fixed with this changeset. also, root now doesn't touch any atimes.
Listing repositories with lots of archives on low-memory systems would cause attic to run out of memory due to items_buffer and chunker being created for each visited archive.
See https://github.com/jborg/attic/issues/163