hashindex_index returns the perfect hashtable index, but does not
check what's in the bucket there, so we had these loops afterwards
to search for an empty or deleted bucket.
problem: if the HT were completely filled with no empty and no deleted
buckets, that loop would never end. due to our HT resizing, it can
never happen, but still not pretty.
when using hashindex_lookup (as also used some lines above), the code
is easier to understand, because (after we resized the HT), we freshly
create the same situation as after the first call of that function:
- return value < 0, because we (still) can not find the key
- start_idx will point to an empty bucket
Thus, we do not need the problematic loops we had there.
Modified the checks to make sure we really have an empty or deleted
bucket before overwriting it with data.
Added some additional asserts to make sure the code behaves.
if we run into some issue reading an input file, e.g. an I/O error,
the BackupOSError exception raised due to that will skip the current
file and no archive item will be created for this file.
But we maybe have already added some of its content chunks to the repo,
we have either written them as new chunks or incref'd some identical chunk
in the repo.
Added an exception handler that decrefs (and deletes if refcount reaches 0)
these chunks again before re-raising the exception, so the repo is in a
consistent state again and we do not have orphaned content chunks in the repo.
we now just treat that one .borg_part file we might have inside
checkpoint archives as a normal file.
people can recognize via the file name it is a partial file.
nobody cares for statistics of checkpoint files and the final
archive now does not contain any partial files any more, thus
no needs to maintain statistics about count and size of part
files.
checkpoint archives might have a single, incomplete part file as last item.
part files are always a prefix of the full file, growing in size from
checkpoint to checkpoint.
we now manage the archive items metadata stream in a special way:
- checkpoint archive A(n) might end with a partial item PI(n)
- checkpoint archive A(n+1) does not contain PI(n)
- checkpoint archive A(n+1) contains a new partial item PI(n+1)
- the final archive does not contain any partial items
not having this had created orphaned item_ptrs chunks for checkpoint archives.
also:
- borg check: show id of orphaned chunks
- borg check: archive list with explicit consider_checkpoints=True (this is the default, but better make sure).
check --archives: add --newer/--older/--newest/--oldest, fixes#7062
Options accept a timespan, like Nd for N days or Nm for N months.
Use these to do date-based matching on archives and only check some of them,
like: borg check --archives --newer=1m --newest=7d
Author: Michael Deyaso <mdeyaso@fusioniq.io>
Same change for .recreate_cmdline -> .recreate_command_line .
JSON output key "command_line":
borg 1.x: sys.argv [list of str]
borg 2: shlex.join(sys.argv) [str]