--chunker-params=fail,4096,rrrEErrrr means:
- cut chunks of 4096b fixed size (last chunk in a file can be less)
- read chunks 0, 1 and 2 successfully
- error at chunk 3 and 4 (simulated OSError(errno.EIO))
- read successfully again for the next 4 chunks
Chunks are counted inside the chunker instance, starting
from 0, always increasing while the same instance is used.
Read chunks as well as failed chunks count up by 1.
- file status A/M/E counters
- chunking time
- hashing time
- rx_bytes / tx_bytes
Note: the sleep() in the test is needed due to timestamp granularity on linux being much more coarse than expected (uses the system timer, 100Hz or 250Hz).
a file map can be:
- created internally inside chunkify by calling sparsemap, which uses
SEEK_DATA / SEEK_HOLE to determine data and hole ranges inside a
seekable sparse file.
Usage: borg create --sparse --chunker-params=fixed,BLOCKSIZE ...
BLOCKSIZE is the chunker blocksize here, not the filesystem blocksize!
- made by some other means and given to the chunkify function.
this is not used yet, but in future this could be used to only read
the changed parts and seek over the (known) unchanged parts of a file.
sparsemap: the generate range sizes are multiples of the fs block size.
the tests assume 4kiB fs block size.
With the argument specified as unsigned char *, Cython emits
code in the Python wrapper to convert string-like objects to
unsigned char* (essentially PyBytes_AS_STRING).
Because the len(data) call is performed on a cdef'd string-ish type,
Cython emits a strlen() call, on the result of PyBytes_AS_STRING.
This is not correct, since embedded null bytes are entirely possible.
Incidentally, the code generated by Cython was also not correct,
since the Clang Static Analyzer found a path of execution where
passing arguments in a weird way from Python resulted in strlen(NULL).
Formulated like this, Cython emits essentially:
c_buzhash(
PyBytes_AS_STRING(data),
PyObject_Length(data),
...
)
which is correct.