- simplify progress output (no \r, no terminal size related tweaks)
- emit progress output via the logging system (so it does not use stderr
of borg serve)
- progress code always logs a json string; the json has everything needed
to produce either json log output or plain text log output.
- use formatters to generate plain or json output from that.
- clean up setup_logging
- use a StderrHandler that always uses the **current** sys.stderr
- tweak TestPassphrase to not accidentally trigger just because of seeing 12 in output
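A minimal sketch of the two ideas above, using only the stdlib logging module: a handler that looks up sys.stderr at emit time, and formatters that turn the always-JSON progress record into plain or json output. The formatter class names, the helper make_progress_logger and the payload fields "type" and "message" are assumptions made for illustration, not necessarily what borg uses.

```python
import json
import logging
import sys


class StderrHandler(logging.StreamHandler):
    """Writes to whatever sys.stderr is at emit time, not at creation time."""

    def __init__(self):
        # Deliberately skip StreamHandler.__init__, which would store a fixed stream.
        logging.Handler.__init__(self)

    @property
    def stream(self):
        return sys.stderr


class PlainProgressFormatter(logging.Formatter):
    """Extracts the human-readable text from the JSON payload (assumed field name)."""

    def format(self, record):
        return json.loads(record.getMessage())["message"]


class JsonProgressFormatter(logging.Formatter):
    """Passes the JSON payload through unchanged."""

    def format(self, record):
        return record.getMessage()


def make_progress_logger(name, json_output):
    # Hypothetical helper: one logger, one handler, formatter chosen by output mode.
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = StderrHandler()
    handler.setFormatter(JsonProgressFormatter() if json_output else PlainProgressFormatter())
    logger.handlers[:] = [handler]
    return logger


# The progress code always logs the same JSON string; only the formatter differs.
payload = json.dumps({"type": "progress_message", "message": "checking segments"})
make_progress_logger("demo.plain", json_output=False).info(payload)
make_progress_logger("demo.json", json_output=True).info(payload)
```

Because the stream is resolved on every emit, a later replacement of sys.stderr (as test harnesses often do) is picked up without reconfiguring logging.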
fix config dir compatibility issue, fixes #7445
- add tests
- make sure the result of get_cache_dir matches pre and post #7300 where desired
- harmonize implementation of config_dir_compat and cache_dir_compat tests
Co-authored-by: nain <126972030+F49FF806@users.noreply.github.com>
diff: include changes in ctime and mtime, fixes #7248
also:
- sort JSON output alphabetically
- add --content-only to ignore metadata changes
Co-authored-by: Michael Deyaso <mdeyaso@fusioniq.io>
--chunker-params=fail,4096,rrrEErrrr means:
- cut chunks of 4096b fixed size (last chunk in a file can be less)
- read chunks 0, 1 and 2 successfully
- error at chunk 3 and 4 (simulated OSError(errno.EIO))
- read successfully again for the next 4 chunks
Chunks are counted inside the chunker instance, starting
from 0, always increasing while the same instance is used.
Read chunks as well as failed chunks count up by 1.
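The counting rules above can be sketched as a small stand-in class. This is not borg's chunker: the class name, the case-insensitive handling of the map letters, the choice that a simulated error consumes no input, and the reuse of the last map entry once the map is exhausted are all assumptions.

```python
import errno
import io


class FailingChunker:
    """Fixed-size chunker that fails on purpose, driven by a map like 'rrrEErrrr'."""

    def __init__(self, block_size, fail_map):
        self.block_size = block_size
        self.fail_map = fail_map
        self.count = 0  # lives in the instance, so it survives across chunkify() calls

    def chunkify(self, fd):
        while True:
            # Assumption: after the map is exhausted, its last entry keeps applying.
            action = self.fail_map[min(self.count, len(self.fail_map) - 1)]
            self.count += 1  # read chunks and failed chunks both count up by 1
            if action.lower() == "e":
                raise OSError(errno.EIO, "simulated I/O error")
            data = fd.read(self.block_size)
            if not data:
                return
            yield data  # the last chunk of a file can be shorter than block_size


fd = io.BytesIO(b"x" * 4096 * 10)
chunker = FailingChunker(4096, "rrrEErrrr")
events = []
while True:
    try:
        for chunk in chunker.chunkify(fd):
            events.append(len(chunk))
        break
    except OSError:
        events.append("EIO")  # retry with the same instance, the counter goes on
print(events)
```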
check --archives: add --newer/--older/--newest/--oldest, fixes #7062
Options accept a timespan, like Nd for N days or Nm for N months.
Use these to do date-based matching on archives and only check some of them,
like: borg check --archives --newer=1m --newest=7d
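A minimal sketch of how such a timespan could be turned into a cutoff time. The function names are hypothetical, and treating "m" as calendar months (with the day clamped to the end of the target month) is an assumption; the message above does not say how months are computed.

```python
import calendar
import re
from datetime import datetime, timedelta, timezone


def parse_timespan(spec):
    """Split e.g. '7d' or '1m' into (count, unit)."""
    match = re.fullmatch(r"(\d+)([dm])", spec)
    if match is None:
        raise ValueError(f"invalid timespan: {spec!r}")
    return int(match.group(1)), match.group(2)


def cutoff(reference, spec):
    """Return the point in time that lies `spec` before `reference`."""
    count, unit = parse_timespan(spec)
    if unit == "d":
        return reference - timedelta(days=count)
    # Assumption: months are calendar months; clamp the day if the month is shorter.
    total_months = reference.year * 12 + (reference.month - 1) - count
    year, month0 = divmod(total_months, 12)
    month = month0 + 1
    day = min(reference.day, calendar.monthrange(year, month)[1])
    return reference.replace(year=year, month=month, day=day)


now = datetime(2023, 3, 31, 12, 0, tzinfo=timezone.utc)
print(cutoff(now, "7d"))
print(cutoff(now, "1m"))
```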
Author: Michael Deyaso <mdeyaso@fusioniq.io>
Same change for .recreate_cmdline -> .recreate_command_line.
JSON output key "command_line":
borg 1.x: sys.argv [list of str]
borg 2: shlex.join(sys.argv) [str]
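To illustrate the difference between the two representations, here is a short example with a made-up argument list (the arguments are placeholders, not taken from a real run). shlex.join quotes where needed, so shlex.split recovers the original list.

```python
import shlex

# Hypothetical argv, one argument contains a space.
argv = ["borg", "create", "--comment", "nightly run", "repo::archive"]

command_line = shlex.join(argv)  # borg 2 style: a single str
print(command_line)

# The borg 1.x style list can be recovered from the string.
assert shlex.split(command_line) == argv
```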
if they are present, process them through json_text().
this replaces surrogate escapes (s-e) by "?" in the value stored under the key
and puts the binary representation into key_b64, if needed.
likely this is rarely needed.
item: path, source, user, group
for non-unicode stuff borg 1.2 had "bpath".
now we have:
path - unicode approximation (invalid stuff replaced by ?)
path_b64 - base64(path_bytes) # only if needed
source has the same issue as path and is now covered also.
user and group are usually unicode or even pure ASCII,
but we rather are cautious and cover them also.
binary bytes:
- json_key = <key>_b64
- json_value = base64(value)
text (potentially with surrogate escapes):
- json_key1 = <key>
- json_value1 = value_text (s-e replaced by ?)
- json_key2 = <key>_b64
- json_value2 = base64(value_binary)
json_key2/_value2 is only present if value_text required
replacement of surrogate escapes (and thus does not represent
the original value, but just an approximation).
value_binary then gives the original bytes value (e.g. a
non-utf8 bytes sequence).
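The scheme above can be sketched with the stdlib only. The two function names are used here for illustration and the code is an approximation of the described behaviour, not a copy of borg's implementation.

```python
import base64


def binary_to_json(key, value):
    """bytes value: only <key>_b64 is emitted."""
    return {key + "_b64": base64.b64encode(value).decode("ascii")}


def text_to_json(key, value):
    """str value, potentially containing surrogate escapes."""
    # Encoding with errors="replace" turns each surrogate escape into "?".
    approximation = value.encode("utf-8", errors="replace").decode("utf-8")
    result = {key: approximation}
    if approximation != value:
        # Only needed if the text is just an approximation of the original bytes.
        original = value.encode("utf-8", errors="surrogateescape")
        result[key + "_b64"] = base64.b64encode(original).decode("ascii")
    return result


# A latin-1 encoded file name, decoded the way os.fsdecode would on a UTF-8 system.
path = b"caf\xe9".decode("utf-8", errors="surrogateescape")
print(text_to_json("path", path))
print(text_to_json("user", "root"))
```

A consumer that needs the exact original value checks for the _b64 key first and falls back to the text key otherwise.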
\n is automatically converted on write to the platform-dependent os.linesep.
Using os.linesep instead of \n means that on Windows, the line ending becomes "\r\r\n".
Also switches mentions of {LF} to {NL} in code and docs.
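The doubled "\r" can be reproduced on any platform by forcing the newline translation that text mode applies on Windows. In this sketch, newline="\r\n" stands in for the Windows default; written_bytes is a hypothetical helper.

```python
import io


def written_bytes(text, newline):
    """Return the bytes a text-mode stream emits for `text`."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # keep the BytesIO from being closed with the wrapper
    return data


# Writing "\n": translated once, as intended.
print(written_bytes("line\n", newline="\r\n"))
# Writing an explicit "\r\n" (what os.linesep is on Windows): "\n" is translated again.
print(written_bytes("line\r\n", newline="\r\n"))
```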
we want to be able to use an archive name as a directory name,
e.g. for the FUSE fs built by borg mount.
thus we can not allow "/" in an archive name on linux.
on windows, the rules are more restrictive, disallowing
quite a few more characters (':<>"|*?' plus some more).
we do not have FUSE fs / borg mount on windows yet, but
we better avoid any issues.
we can not avoid ":" though, as our {now} placeholder
generates ISO-8601 timestamps, including ":" chars.
also, we do not want to have leading/trailing blanks in
archive names, nor surrogate-escapes.
control chars are disallowed also, including chr(0).
we have python str here, thus chr(0) is not expected in there
(is not used to terminate a string, like it is in C).
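The rules above could be checked roughly like this. The function name and the exact set of rejected characters are assumptions: the "plus some more" characters mentioned above are not covered, and ":" stays allowed because of the {now} placeholder.

```python
import unicodedata

# "/" plus the windows-problematic characters listed above, without ":".
FORBIDDEN_CHARS = set('/<>"|*?')


def validate_archive_name(name):
    """Raise ValueError if `name` breaks one of the rules sketched here."""
    if not name:
        raise ValueError("archive name must not be empty")
    if name != name.strip():
        raise ValueError("leading/trailing blanks are not allowed")
    for char in name:
        category = unicodedata.category(char)
        if char in FORBIDDEN_CHARS:
            raise ValueError(f"character {char!r} is not allowed")
        if category == "Cc":  # control characters, including chr(0)
            raise ValueError(f"control character {char!r} is not allowed")
        if category == "Cs":  # surrogates, e.g. from surrogateescape decoding
            raise ValueError("surrogate escapes are not allowed")
    return name


print(validate_archive_name("host-2023-01-01T12:00:00"))
```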
the UNIX time used for timestamp is seconds since 1.1.1970,
in UTC. thus, the natural way to represent it is with a
tz-aware utc datetime object.
but previously (in borg 1.x), the code used naive datetime
objects and localtime.
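A short stdlib example of the difference (the helper name is made up for illustration):

```python
from datetime import datetime, timezone


def utc_from_timestamp(ts):
    """UNIX timestamp -> tz-aware datetime in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


aware = utc_from_timestamp(0)
print(aware.isoformat())

# Without tz, the result is naive and expressed in the machine's local time,
# so its meaning depends on where the code runs.
naive = datetime.fromtimestamp(0)
print(naive.tzinfo)
```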
borg < 2:
obj = encrypted(compressed(data))
borg 2:
obj = enc_meta_len32 + encrypted(msgpacked(meta)) + encrypted(compressed(data))
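The borg 2 layout can be sketched as follows. Everything besides the layout itself is a stand-in: encrypt/decrypt are identity placeholders, json replaces msgpack, zlib is used as the compressor, and the little-endian byte order of the 32-bit length field is an assumption.

```python
import json
import struct
import zlib

LEN32 = struct.Struct("<I")  # assumption: unsigned 32-bit, little-endian


def encrypt(plaintext):
    return plaintext  # placeholder, no real crypto here


def decrypt(ciphertext):
    return ciphertext  # placeholder, no real crypto here


def format_obj(meta, data):
    meta_encrypted = encrypt(json.dumps(meta).encode("utf-8"))  # json stands in for msgpack
    data_encrypted = encrypt(zlib.compress(data))
    return LEN32.pack(len(meta_encrypted)) + meta_encrypted + data_encrypted


def parse_meta(obj):
    """Read only the metadata, without touching the data part."""
    (meta_len,) = LEN32.unpack_from(obj)
    return json.loads(decrypt(obj[LEN32.size : LEN32.size + meta_len]))


def parse(obj):
    (meta_len,) = LEN32.unpack_from(obj)
    meta = parse_meta(obj)
    data = zlib.decompress(decrypt(obj[LEN32.size + meta_len :]))
    return meta, data


obj = format_obj({"ctype": "zlib", "size": 11}, b"hello world")
print(parse_meta(obj))
print(parse(obj))
```

The length prefix is what makes it possible to look at the metadata alone, since the parser knows where the metadata ends and the data begins.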
handle compr / decompr in repoobj
move the assert_id call from decrypt to RepoObj.parse
also:
- for AEADKeyBase, add a dummy assert_id (not needed here)
- only test assert_id for other if not AEADKeyBase instance
- remove test_getting_wrong_chunk. assert_id is called elsewhere
and is not needed any more anyway with the new AEAD crypto.
- only give manifest (includes key, repo, repo_objs)
- only return manifest from Manifest.load (includes key, repo, repo_objs)
- timezone aware timestamps
- str representation with +HHMM or +HH:MM
- get rid of to_localtime
- fix with_timestamp
- have archive start/end time always in local time with tz or as given
- idea: do not lose tz information
then we know when a backup was made and even from
which timezone it was made. if we want to compute
utc, we can do that using these infos.
this makes a quite nice archives list, with timestamps
as expected (in local time with timezone info).
at some places we just enforce utc, like for the
repo manifest timestamp or for the transaction log,
these are usually not looked at by the user.
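To illustrate the "do not lose tz information" idea with the stdlib: a tz-aware timestamp shows the local time, keeps the offset, and can still be converted to utc. The fixed +02:00 offset and the date are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical archive start time, recorded in a +02:00 timezone.
local_tz = timezone(timedelta(hours=2))
start = datetime(2022, 8, 1, 14, 30, 0, tzinfo=local_tz)

print(start.strftime("%Y-%m-%d %H:%M:%S %z"))  # offset as +HHMM
print(start.isoformat())                        # offset as +HH:MM

# utc can be computed whenever needed, because the offset is still known.
print(start.astimezone(timezone.utc).isoformat())

# For the current time, astimezone() without arguments attaches the local timezone.
now_local = datetime.now().astimezone()
print(now_local.tzinfo is not None)
```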
since python 3.7, .isoformat() is usable IF timespec != "auto"
is given ("auto" [default] would be as evil as before, sometimes
formatting with, sometimes without microseconds).
also since python 3.7, there is now .fromisoformat().
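The "auto" problem and the fix can be seen in a few lines (example values are arbitrary):

```python
from datetime import datetime, timezone

whole = datetime(2022, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
fractional = whole.replace(microsecond=123456)

# timespec="auto" (default): the format depends on whether microsecond == 0.
print(whole.isoformat())
print(fractional.isoformat())

# An explicit timespec gives the same shape for every value.
stable = whole.isoformat(timespec="microseconds")
print(stable)

# fromisoformat() parses what isoformat() produced.
print(datetime.fromisoformat(stable) == whole)
```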
There are some other places with subprocesses:
- borg create --content-from-command
- borg create --paths-from-command
- (de)compression filter process of import-tar / export-tar
hopefully this is the final fix.
after the first fix for #6400 (using os.umask after mkstemp), a new
problem appeared: chmod was not supported on some fs.
even after fixing that, there were other issues, see the ACLs issue
documented in #6933.
the root cause of all this is tempfile.mkstemp internally using a
very secure, but hardcoded and for our use case problematic mode
of 0o600.
mkstemp_mode (mostly copy&paste from python stdlib tempfile module +
"black" formatting applied) supports giving the mode via the api,
that is the only change needed.
slightly dirty due to the _xxx imports from tempfile, but hopefully
this will be supported in some future python version.
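The core idea, passing the desired mode at file creation time instead of changing it afterwards, can be sketched without the private tempfile imports. This is a simplified stand-in, not the stdlib copy described above; the function name and the naming scheme for the temporary file are assumptions.

```python
import os
import secrets
import stat
import tempfile


def mkstemp_with_mode(directory, mode=0o600, prefix="tmp", suffix=""):
    """Create and open a new, uniquely named file with the given mode."""
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL  # O_EXCL: fail if the name exists
    for _ in range(100):
        name = os.path.join(directory, prefix + secrets.token_hex(8) + suffix)
        try:
            # The mode is given to open() directly (the umask still applies),
            # so no chmod is needed after creation.
            fd = os.open(name, flags, mode)
        except FileExistsError:
            continue  # name collision, try another random name
        return fd, name
    raise FileExistsError("no usable temporary file name found")


with tempfile.TemporaryDirectory() as tmpdir:
    fd, name = mkstemp_with_mode(tmpdir, mode=0o666)
    os.close(fd)
    print(oct(stat.S_IMODE(os.stat(name).st_mode)))

    # For comparison: tempfile.mkstemp always uses its hardcoded 0o600.
    fd, name = tempfile.mkstemp(dir=tmpdir)
    os.close(fd)
    print(oct(stat.S_IMODE(os.stat(name).st_mode)))
```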
if a hardlink copy of a repo was made and a new repo config
shall be saved, do NOT fill in random garbage before deleting
the previous repo config, because that would damage the hardlink
copy.
see ticket and borg.helpers.msgpack docstring.
this changeset implements the full migration to
msgpack 2.0 spec (use_bin_type=True, raw=False).
compatibility with old data, where still needed, is handled via the want_bytes decoder in borg.item.
This not only brings code style in line with the other helpers that do the
same thing this way, but also does away with an unnecessary absolute import
using the borg module name explicitly.
borg now has the chunks list in every item with content.
due to the symmetric way borg now deals with hardlinks using
item.hlid, processing gets much simpler.
but some places where borg deals with other "sources" of hardlinks
still need to do some hardlink management:
borg uses the HardLinkManager there now (which is not much more
than a dict, but keeps documentation at one place and avoids some
code duplication we had before).
item.hlid is computed via hardlink_id function.
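A rough sketch of what "not much more than a dict" could look like. The method names, the way hardlink_id derives an id (hashing device and inode number) and the example paths are assumptions for illustration, not borg's actual code.

```python
import hashlib


def hardlink_id(dev, ino):
    """Hypothetical: derive a stable id for all names of the same inode."""
    return hashlib.sha256(f"{dev}/{ino}".encode("ascii")).digest()


class HardLinkManager:
    """Thin wrapper around a dict: id -> whatever the caller wants to remember."""

    def __init__(self):
        self._map = {}

    def remember(self, id, info):
        self._map[id] = info

    def retrieve(self, id, default=None):
        return self._map.get(id, default)


# Made-up (path, st_dev, st_ino) triples; the first two are names of the same inode.
files = [("a/file", 1, 100), ("b/link", 1, 100), ("c/other", 1, 200)]

hlm = HardLinkManager()
actions = []
for path, dev, ino in files:
    hlid = hardlink_id(dev, ino)
    first_path = hlm.retrieve(hlid)
    if first_path is None:
        hlm.remember(hlid, path)
        actions.append((path, "first of its group"))
    else:
        actions.append((path, f"same group as {first_path}"))
print(actions)
```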
support hardlinked symlinks, fixes #2379
as we use item.hlid now to group hardlinks together,
there is no conflict with the item.source usage for
symlink targets any more.
2nd+ hardlinks now add to the files count as did the 1st one.
for borg, now all hardlinks are created equal.
so any hardlink item with chunks now adds to the "file" count.
ItemFormatter: support {hlid} instead of {source} for hardlinks