we use "debug xxx" subcommands now. docs updated.
also makes "borg help" shorter as not all debug-xxx commands
show up, but just 1 main "debug" command.
all archives, all items are read to build a unified view.
files are represented by a same-name directory with the versions of the file.
A filename suffix computed by adler32(chunkids) is used to disambiguate the versions.
also: refactor code a little, create methods for leaves, inner nodes.
There are persistent questions why output from options like --list
and --stats doesn't show up. Also, borg currently isn't able to
show *just* the output for a given option (--list, --stats,
--show-rc, --show-version, or --progress), without other INFO level
messages.
The solution is to use more granular loggers, so that messages
specific to a given option goes to a logger designated for that
option. That option-specific logger can then be configured
separately from the regular loggers.
Those option-specific loggers can also be used as a hook in a
BORG_LOGGING_CONF config file to log the --list output to a separate
file, or send --stats output to a network socket where some daemon
could analyze it.
Steps:
- create an option-specific logger for each of the implied output options
- modify the messages specific to each option to go to the correct logger
- if an implied output option is passed, change the option-specific
logger (only) to log at INFO level
- test that root logger messages don't come through option-specific loggers
They shouldn't, per https://docs.python.org/3/howto/logging.html#logging-flow
but test just the same. Particularly test a message that can come from
remote repositories.
Fixes#526, #573, #665, #824
- Instead of very small (5 MB-ish) segment files, use larger ones
- Request asynchronous write-out or write-through (TODO) where it is supported,
to achieve a continuously high throughput for writes
- Instead of depending on ordered writes (write data, commit tag, sync)
for consistency, do a double-sync commit as more serious RDBMS also do
i.e. write data, sync, write commit tag, sync
Since commits are very expensive in Borg at the moment this makes no
difference performance-wise.
New platform APIs: SyncFile, sync_dir
[x] Naive implementation (equivalent to what Borg did before)
[x] Linux implementation
[ ] Windows implementation
[-] OSX implementation (F_FULLSYNC)
/mnt/backup was confusing as people like to mount their backup disk on /mnt/backup,
but borg init /mnt/backup does not work if that directory already exists because it is
the mountpoint. it would work, if /mnt was the mountpoint, but that is not obvious
and also unusual.
Use with caution: permanent data loss by specifying incorrect patterns
is easily possible. Make a dry run to make sure you got everything right.
borg recreate has many uses:
- Can selectively remove files/dirs from old archives, e.g. to free
space or purging picturarum biggus dickus from history
- Recompress data
- Rechunkify data, to have upgraded Attic / Borg 0.xx archives deduplicate
with Borg 1.x archives. (Or to experiment with chunker-params for
specific use cases
It is interrupt- and resumable.
Chunks are not freed on-the-fly.
Rationale:
Makes only sense when rechunkifying, but logic on which new chunks to
free what input chunks is complicated and *very* delicate.
Future TODOs:
- Refactor tests using py.test fixtures
-- would require porting ArchiverTestCase to py.test: many changes,
this changeset is already borderline too large.
- Possibly add a --target option to not replace the source archive
-- with the target possibly in another Repo
(better than "cp" due to full integrity checking, and deduplication
at the target)
- Detect and skip (unless --always-recompress) already recompressed chunks
Fixes#787#686#630#70 (and probably some I overlooked)
Also see #757 and #770