support reading new, improved hashindex header format, fixes #6960
Bit of a pain to work with that code:
- C code
- needs to still be able to read the old hashindex file format,
- while also supporting the new file format.
- the hash computed while reading the file causes additional problems because
it expects every part of the file to be read exactly once and in sequential order.
I solved this by separately opening the file in the python part of the code and
checking for the magic.
BORG_IDX means the legacy file format and legacy layout of the hashtable,
BORG2IDX means the new file format and the new layout of the hashtable.
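A minimal sketch of the magic check described above, in Python. The helper name `detect_hashindex_format` and the returned version numbers are illustrative assumptions, not the actual borg code; only the two magic strings come from this message.

```python
import io

MAGIC_LEN = 8
LEGACY_MAGIC = b"BORG_IDX"  # legacy file format, legacy hashtable layout
NEW_MAGIC = b"BORG2IDX"     # new file format, new hashtable layout


def detect_hashindex_format(fileobj):
    """Return 1 for the legacy format, 2 for the new one (hypothetical helper)."""
    magic = fileobj.read(MAGIC_LEN)
    if magic == LEGACY_MAGIC:
        return 1
    if magic == NEW_MAGIC:
        return 2
    raise ValueError(f"unknown hashindex magic: {magic!r}")


# The check uses its own file object, so the handle that is hashed while
# reading sequentially is never touched. BytesIO stands in for a real file.
fmt = detect_hashindex_format(io.BytesIO(NEW_MAGIC + b"\0" * 24))
```

Reading the magic through a separate handle keeps the sequential read, and therefore the hash over the file contents, unaffected.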
Done:
- added a version int32 directly after the magic and set it to 2 (like borg 2).
the old header had no version info, but can be treated as version 1 in case
we ever need it (currently the format is chosen based on the magic).
- added num_empty as indicated by a TODO in count_empty, so it does not need a
full hashtable scan to determine the number of empty buckets.
- to keep it simpler, I just padded the HashHeader struct with a
`char reserved[1024 - 32];`
1024 being the desired overall header size and 32 being the currently used size.
this alignment might be useful in case we mmap() the hashindex file one day.
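To illustrate the size arithmetic, here is a rough sketch of the new header using Python's struct module. Only the magic, the version int32 directly after it, num_empty and the `1024 - 32` reserved bytes are taken from the notes above. The other field names, their order and the little-endian byte order are assumptions made for the example.

```python
import struct

# Assumed field order; see the caveats in the paragraph above.
HEADER_FIELDS = (
    "magic", "version", "num_entries", "num_buckets",
    "num_empty", "key_size", "value_size", "reserved",
)
# 8-byte magic + six int32 fields = 32 bytes used, 992 reserved bytes.
HEADER = struct.Struct("<8s6i992s")


def pack_header(num_entries, num_buckets, num_empty, key_size, value_size):
    """Build a 1024-byte header with version 2 (hypothetical helper)."""
    return HEADER.pack(
        b"BORG2IDX", 2, num_entries, num_buckets,
        num_empty, key_size, value_size, b"",  # b"" is zero-padded to 992 bytes
    )


def unpack_header(raw):
    """Parse a header into a dict keyed by the assumed field names."""
    return dict(zip(HEADER_FIELDS, HEADER.unpack(raw)))


raw = pack_header(num_entries=3, num_buckets=16, num_empty=13,
                  key_size=32, value_size=12)
header = unpack_header(raw)
```

With num_empty stored in the header, a reader can get the empty bucket count from `header["num_empty"]` instead of scanning the whole table.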
warning: src/borg/item.pyx:199:10: cpdef variables will not be supported in Cython 3; currently they are no different from cdef variables
warning: src/borg/item.pyx:200:10: cpdef variables will not be supported in Cython 3; currently they are no different from cdef variables
warning: src/borg/item.pyx:202:10: cpdef variables will not be supported in Cython 3; currently they are no different from cdef variables
this turns all python level classes into extension type classes.
additionally it turns the indirect properties into direct descriptors.
test_propdict_attributes runs about 30% faster.
base memory usage as reported by sys.getsizeof(Item()):
before: 48 bytes, after this PR: 40 bytes
Author: @RonnyPfannschmidt in PR #5763
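The idea of a direct descriptor can be sketched in plain Python. This is not the Cython code from the PR: `DictBackedAttr` and `SketchItem` are hypothetical names, and a pure Python class with `__slots__` only approximates what an extension type does.

```python
class DictBackedAttr:
    """Hypothetical descriptor that maps an attribute onto a dict entry."""

    def __init__(self, value_type):
        self.value_type = value_type

    def __set_name__(self, owner, name):
        # Called once at class creation; the attribute name is the dict key.
        self.key = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        try:
            return instance._dict[self.key]
        except KeyError:
            raise AttributeError(f"attribute {self.key} not found") from None

    def __set__(self, instance, value):
        if not isinstance(value, self.value_type):
            raise TypeError(f"{self.key} must be {self.value_type.__name__}")
        instance._dict[self.key] = value

    def __delete__(self, instance):
        try:
            del instance._dict[self.key]
        except KeyError:
            raise AttributeError(f"attribute {self.key} not found") from None


class SketchItem:
    __slots__ = ("_dict",)  # no per-instance __dict__, only the backing dict

    path = DictBackedAttr(str)
    mode = DictBackedAttr(int)

    def __init__(self, **kw):
        self._dict = {}
        for key, value in kw.items():
            setattr(self, key, value)


item = SketchItem(path="etc/passwd", mode=0o644)
```

Each attribute access goes straight through one descriptor object, without an intermediate property wrapper built from closures.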
reads all chunks in on-disk order and recompresses them if they are not already using
the desired compression type and level (and obfuscation level).
supports SIGINT/ctrl-c and --checkpoint-interval (default: 1800s).
this is a borg command that compacts when committing (without this, it would have
a huge space usage). it commits/compacts every checkpoint interval or when
pressing ctrl-c / receiving SIGINT.
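The control flow described above could look roughly like the following sketch. Everything here is a stand-in: `recompress`, the dict-based `repo`, the `commit` callback and the use of zlib levels are assumptions for illustration, not borg's repository API.

```python
import time
import zlib


def recompress(repo, level, commit, checkpoint_interval=1800,
               clock=time.monotonic, interrupted=lambda: False):
    """Recompress chunks that do not already use the desired level.

    repo maps chunk id -> (level, compressed bytes); dict order stands in
    for on-disk order. commit is called at every checkpoint and at the end.
    """
    last_commit = clock()
    recompressed = 0
    for chunk_id in list(repo):
        old_level, data = repo[chunk_id]
        if old_level != level:  # skip chunks that already match
            plain = zlib.decompress(data)
            repo[chunk_id] = (level, zlib.compress(plain, level))
            recompressed += 1
        if interrupted():  # a real command would set this from a SIGINT handler
            break
        if clock() - last_commit >= checkpoint_interval:
            commit()  # committing is where compaction would happen
            last_commit = clock()
    commit()  # final commit, also reached after an interrupt
    return recompressed


demo_repo = {
    "a": (1, zlib.compress(b"x" * 100, 1)),
    "b": (9, zlib.compress(b"y" * 100, 9)),
}
demo_commits = []
demo_count = recompress(demo_repo, 9, lambda: demo_commits.append("commit"))
```

Committing inside the loop keeps the amount of superseded data bounded by one checkpoint interval, which is the space concern mentioned above.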