- doesn't need a separate file for the hash
- we can later write multiple partial chunkindexes to the cache
also:
add upgrade code that renames the cache from previous borg versions.
Consider soft-deleted archives/ directory entries, but only create a new
archives/ directory entry if:
- there is no entry for that archive ID
- there is no soft-deleted entry for that archive ID either
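The rule above could be sketched like this; the dict-backed store and the `.deleted` key suffix are hypothetical stand-ins for borg's actual archives/ directory handling:

```python
def maybe_create_archive_entry(store: dict, archive_id: str, meta: bytes) -> bool:
    """Create an archives/ entry for archive_id, but only if neither an
    active nor a soft-deleted entry already exists for that ID.

    Returns True if a new entry was created, False otherwise.
    """
    if archive_id in store:  # active entry already present
        return False
    if archive_id + ".deleted" in store:  # soft-deleted entry present (hypothetical naming)
        return False
    store[archive_id] = meta
    return True
```

With this, re-running the repair code is idempotent: an archive that already has an entry (active or soft-deleted) is left alone.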
Support running with or without --repair.
Without --repair, it can be used to detect such inconsistencies and return with rc != 0.
--repository-only contradicts --find-lost-archives.
We are only interested in archive metadata objects here, thus for most repo objects
it is enough to read the repoobj's metadata and determine the object's type.
Only if it is the right type of object do we need to read the full object
(metadata and data).
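A rough sketch of that two-step read; the two dicts stand in for a hypothetical repo interface that offers cheap metadata-only reads separately from full-object reads (an assumption, not borg's real API):

```python
ARCHIVE = "archive"  # hypothetical type tag for archive metadata objects

def find_archives(repo_meta: dict, repo_data: dict):
    """Scan all repo objects, reading only the metadata first; fetch the
    full object (metadata and data) only for archive metadata objects."""
    found = []
    for obj_id, meta in repo_meta.items():
        if meta.get("type") != ARCHIVE:
            continue  # cheap skip: no need to read the (potentially large) data part
        found.append((obj_id, meta, repo_data[obj_id]))
    return found
```

The point of the pre-filter is that only the small metadata part is read for the vast majority of objects (file content chunks), not their data.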
This reverts commit d3f3082bf4.
Comment by jdchristensen:
I agree that "wipe clean" is correct grammar, but it doesn't match the situation in "unmount cleanly".
The change in this patch is definitely wrong.
Putting it another way, one would never say that we "clean unmount a filesystem".
We say that we "cleanly unmount a filesystem", or in other words, that it "unmounts cleanly".
But the original text is slightly awkward, so I would propose: "When running in the foreground,
^C/SIGINT cleanly unmounts the filesystem, but other signals or crashes do not."
(Not that this guarantees anything, but I'm a native speaker.)
We gave up refcounting quite a while ago and are only interested
in whether a chunk is used (referenced) or not (orphan).
So, let's keep that uint32_t value, but use it for bit flags, so
we can also use it to efficiently remember other chunk-related state.
If we have an entry for a chunk id in the ChunkIndex,
it means that this chunk exists in the repository.
The code was a bit over-complicated and used entry.refcount
only to detect whether .get(id, default) actually got something
from the ChunkIndex or used the provided default value.
The code does the same now, but in a simpler way.
Additionally, it checks for size consistency if a size is
provided by the caller and a size is already present in
the entry.
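As a minimal sketch (names and layout hypothetical, not borg's real ChunkIndex), the flags value plus the size-consistency check described above might look like:

```python
F_USED = 1 << 0  # bit flag: chunk is referenced by some archive (not an orphan)

class ChunkIndex:
    """Dict-backed sketch: the presence of an id means the chunk exists
    in the repository; the uint32 value carries bit flags, not a refcount."""

    def __init__(self):
        self._map = {}  # id -> [flags, size or None]

    def __contains__(self, id):
        return id in self._map

    def add(self, id, size=None):
        entry = self._map.get(id)
        if entry is None:
            self._map[id] = [F_USED, size]
            return
        entry[0] |= F_USED
        if size is not None:
            if entry[1] is not None and entry[1] != size:
                # caller-provided size and size already in the entry disagree
                raise ValueError(f"chunk size mismatch: {entry[1]} != {size}")
            entry[1] = size
```

Membership testing (`id in index`) replaces the old refcount-based "did .get() return a real entry or the default" detection.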
- refactor packing/unpacking of fc entries into separate functions
- instead of a chunks list entry being a tuple of a 256-bit id [bytes] and a 32-bit size [int],
  only store a stable 32-bit index into the kv array of the ChunkIndex (where we also have the id
  and size [and refcount]).
- this is only done in memory; the on-disk format still has (id, size) tuples.
memory consumption (N = entry.chunks list element count, X = overhead for rest of entry):
- previously:
- packed = packb(dict(..., chunks=[(id1, size1), (id2, size2), ...]))
- packed size ~= X + N * (1 + (34 + 5)) bytes
- now:
- packed = packb(dict(..., chunks=[ix1, ix2, ...]))
- packed size ~= X + N * 5 bytes
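The per-element savings can be checked with a little arithmetic; N and X below are made-up example values, the per-chunk byte counts come from the estimate above (1 byte tuple header + 34 bytes for the msgpack-encoded 256-bit id + up to 5 bytes for the uint32 size, vs. up to 5 bytes for the uint32 index):

```python
OLD_PER_CHUNK = 1 + (34 + 5)   # (id, size) tuple: 40 bytes per chunks list element
NEW_PER_CHUNK = 5              # bare uint32 index into the ChunkIndex kv array

def packed_size(n_chunks: int, overhead: int, per_chunk: int) -> int:
    """Estimated packed size of one fc entry: overhead X plus N chunk refs."""
    return overhead + n_chunks * per_chunk

N, X = 1000, 100  # example: 1000 chunks, 100 bytes of other entry overhead
old = packed_size(N, X, OLD_PER_CHUNK)  # 40100
new = packed_size(N, X, NEW_PER_CHUNK)  # 5100
```

For chunk-heavy entries the chunks list dominates, so this is roughly an 8x reduction in packed entry size.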