Commit Graph

42 Commits

Author SHA1 Message Date
Thomas Waldmann 54a85bf56d
format_timedelta: use 3 decimal digits (ms)
maybe this fixes the frequently failing test.
also, giving ms makes more sense than 10ms granularity.
2024-04-04 12:45:28 +02:00
Thomas Waldmann 1b6f928917
ro_type: typed repo objects, see #7670
writing: put type into repoobj metadata
reading: check wanted type against type we got

repoobj metadata is encrypted and authenticated.
repoobj data is encrypted and authenticated, also (separately).
encryption and decryption of both metadata and data get the
same "chunk ID" as AAD, so both are "bound" to that (same) ID.

a repo-side attacker can neither see cleartext metadata/data,
nor successfully tamper with it (AEAD decryption would fail).

also, a repo-side attacker could not replace a repoobj A with a
differently typed repoobj B without borg noticing:
- the metadata/data is cryptographically bound to its ID.
  authentication/decryption would fail on mismatch.
- the type check would fail.

thus, the problem (see CVEs in changelog) solved in borg 1 by the
manifest and archive TAMs is now already solved by the type check.
2023-09-24 20:10:50 +02:00
bigtedde d2f32986f3 removed TestCaseBase from testsuite/archive.py 2023-07-26 12:18:25 -07:00
Ted Lawson 7df34fc4a6
Archive pytest conversion (#7661)
parameterized stats_progress and timestamp_parsing tests
2023-07-11 01:14:51 +02:00
Peter Gerber 438cf2e7ef
Sanitize paths during archive creation/extraction/...
Paths are not always sanitized when creating an archive and,
more importantly, never when extracting one. The following example
shows how this can be used to attempt to write a file outside the
extraction directory:

$ echo abcdef | borg create -r ~/borg/a --stdin-name x/../../../../../etc/shadow archive-1 -
$ borg list -r ~/borg/a archive-1
-rw-rw---- root   root          7 Sun, 2022-10-23 19:14:27  x/../../../../../etc/shadow
$ mkdir borg/target
$ cd borg/target
$ borg extract -r ~/borg/a archive-1
x/../../../../../etc/shadow: makedirs: [Errno 13] Permission denied: '/home/user/borg/target/x/../../../../../etc'

Note that Borg tries to extract the file to /etc/shadow and the
permission error is a result of the user not having access.

This patch ensures file names are sanitized before archiving.
As for files extracted from the archive, paths are sanitized
by making all paths relative, removing '.' elements, and removing
superfluous slashes (as in '//'). '..' elements, however, are
rejected outright. The reasoning here is that it is easy to start
a path with './' or insert a '//' by accident (e.g. via --stdin-name
or import-tar). '..', however, seem unlikely to be the result
of an accident and could indicate a tampered repository.

With paths being sanitized as they are being read, this "errors"
will be corrected during the `borg transfer` required when upgrading
to Borg 2. Hence, the sanitation, when reading the archive,
can be removed once support for reading v1 repositories is dropped.
V2 repository will not contain non-sanitized paths. Of course,
a check for absolute paths and '..' elements needs to kept in
place to detect tempered archives.

I recommend treating this as a security issue. I see the following
cases where extracting a file outside the extraction path could
constitute a security risk:

a) When extraction is done as a different user than archive
creation. The user that created the archive may be able to
get a file overwritten as a different user.
b) When the archive is created on one host and extracted on
another. The user that created the archive may be able to
get a file overwritten on another host.
c) When an archive is created and extracted after a OS reinstall.
When a host is suspected compromised, it is common to reinstall
(or set up a new machine), extract the backups and then evaluate
their integrity. A user that manipulates the archive before such
a reinstall may be able to get a file overwritten outside the
extraction path and may evade integrity checks.

Notably absent is the creation and extraction on the same host as
the same user. In such case, an adversary must be assumed to be able
to replace any file directly.

This also (partially) fixes #7099.
2023-06-07 23:23:53 +02:00
Thomas Waldmann b0b32e35f5
tests: avoid long ids in pytest output
sometimes the automatically computed IDs are just too long,
so rather give IDs directly or avoid them otherwise.
2023-05-18 05:46:33 +02:00
Soumik Dutta cad138aa23
Add files changed while reading to Statistics class #7354 (#7378)
add files changed while reading to Statistics class, fixes #7354

Signed-off-by: Soumik Dutta <shalearkane@gmail.com>
2023-02-25 01:47:39 +01:00
Thomas Waldmann 7bd8e924eb
win32: omit some tests with non-existing user/group names
see comment in the code, they currently can't succeed.
2023-01-19 20:07:37 +01:00
Thomas Waldmann 1672aee031
Item: symlinks: rename .source to .target, fixes #7245
Also, in JSON:
- rename "linktarget" to "target" for symlinks
- remove "source" for symlinks
2023-01-16 20:28:25 +01:00
Thomas Waldmann ff545033e3
tests: do not look up uid 0 / gid 0, but current process uid/gid
some systems do not have uid/gid 0 (windows).
2023-01-16 18:17:15 +01:00
Thomas Waldmann 4f9cda1aab
get_item_uid_gid: do not require item.uid/gid, see #7249
if uid is not present, fall back to uid_default.
if gid is not present, fall back to gid_default.
2023-01-16 18:12:34 +01:00
Franco Ayala 2ed7f317d3
Adding performance statistics to borg create (#6991)
- file status A/M/E counters
- chunking time
- hashing time
- rx_bytes / tx_bytes

Note: the sleep() in the test is needed due to timestamp granularity on linux being much more coarse than expected (uses the system timer, 100Hz or 250Hz).
2022-10-19 21:40:02 +02:00
Thomas Waldmann 1e156ca02b fix upgrader 2022-09-07 19:23:11 +02:00
Thomas Waldmann fa986a9f19 repoobj: add a layer to format/parse repo objects
borg < 2:

obj = encrypted(compressed(data))

borg 2:

obj = enc_meta_len32 + encrypted(msgpacked(meta)) + encrypted(compressed(data))

handle compr / decompr in repoobj

move the assert_id call from decrypt to RepoObj.parse

also:
- for AEADKeyBase, add a dummy assert_id (not needed here)
- only test assert_id for other if not AEADKeyBase instance
- remove test_getting_wrong_chunk. assert_id is called elsewhere
  and is not needed any more anyway with the new AEAD crypto.
- only give manifest (includes key, repo, repo_objs)
- only return manifest from Manifest.load (includes key, repo, repo_objs)
2022-09-04 00:49:38 +02:00
Thomas Waldmann 9beaced33c move manifest module from helpers to borg.manifest 2022-08-13 21:55:12 +02:00
Thomas Waldmann 7957af562d blacken all the code
https://black.readthedocs.io/
2022-07-06 16:34:38 +02:00
Thomas Waldmann 31a081f695 simplify stats output
also:
- move stats related stuff to Statistics class
- repo ops give repo / overall stats
- archive ops give archive stats
- adapt tests
2022-06-23 16:00:12 +02:00
Thomas Waldmann 19dfbe5c5c compute the deduplicated size before compression
so we do not need csize for it.
2022-06-12 17:15:13 +02:00
Thomas Waldmann ace5957524 remove csize from item.chunks elements 2022-06-12 15:48:33 +02:00
Thomas Waldmann b9f9623a6d prepare to remove csize (set it to 0 for now) 2022-06-12 15:48:33 +02:00
Thomas Waldmann 8e87f1111b cleanup msgpack related str/bytes mess, fixes #968
see ticket and borg.helpers.msgpack docstring.

this changeset implements the full migration to
msgpack 2.0 spec (use_bin_type=True, raw=False).

still needed compat to the past is done via want_bytes decoder in borg.item.
2022-06-09 17:57:28 +02:00
Thomas Waldmann f8dbe5b542 cleanup msgpack related str/bytes mess, see #968
see ticket and borg.helpers.msgpack docstring.
2022-06-09 17:57:28 +02:00
Thomas Waldmann 1c0937958d show_progress: add finished=true/false to archive_progress json, fixes #6570
also:
- remove empty values from final json
- add test
2022-05-08 18:32:07 +02:00
Thomas Waldmann cbeef56454 pyupgrade --py38-plus ./**/*.py 2022-02-27 20:11:56 +01:00
Andrey Bienkowski 37506ca8af Refactor: remove assert_true (master)
Work toward https://github.com/borgbackup/borg/issues/28
2022-01-22 23:49:34 +03:00
Thomas Waldmann bbccdbd81c mount: implement --numeric-owner (default: False!), fixes #2377
this is different default behaviour than in borg < 1.2:

default (numeric_owner=False) is to use the user/group name from the archive,
look up the local uid / gid and then use that for the FUSE fs.

when --numeric-owner is given (numeric_owner=True), then the uid/gid
from the archive is directly used (as it was the default behaviour in
borg < 1.2).

this was implemented like this (changing the default behaviour) to make
borg mount and borg extract behave more similar considering usage of
user/group numeric archived ids or archived names mapped to corresponding
numeric local system ids.

also, both now use the same function to get the uid/gid from the item.

fuse:
- add user and group name entries to default_dir
- also: set internal_dict(!) of new Item with data from Item.as_dict()
2021-03-07 18:16:23 +01:00
Thomas Waldmann 3c173cc03b wrap msgpack, fixes #3632, fixes #2738
wrap msgpack to avoid future upstream api changes making troubles
or that we would have to globally spoil our code with extra params.

make sure the packing is always with use_bin_type=False,
thus generating "old" msgpack format (as borg always did) from
bytes objects.

make sure the unpacking is always with raw=True,
thus generating bytes objects.

note:

safe unicode encoding/decoding for some kinds of data types is done in Item
class (see item.pyx), so it is enough if we care for bytes objects on the
msgpack level.

also wrap exception handling, so borg code can catch msgpack specific
exceptions even if the upstream msgpack code raises way too generic
exceptions typed Exception, TypeError or ValueError.
We use own Exception classes for this, upstream classes are deprecated
2018-08-06 17:32:55 +02:00
Marian Beermann a976e11a63 create crypto package with key, keymanager, low_level 2017-05-02 20:49:27 +02:00
Marian Beermann 2ff75d58f2 remove Chunk() 2017-04-04 00:16:15 +02:00
Thomas Waldmann 945880af47 implement async_response, add wait=True for add_chunk/chunk_decref
Before this changeset, async responses were:
- if not an error: ignored
- if an error: raised as response to the arbitrary/unrelated next command

Now, after sending async commands,  the async_response command must be used
to process outstanding responses / exceptions.

We are avoiding to pile up lots of stuff in cases of high latency, because we do NOT
first wait until ALL responses have arrived, but we just can begin to process responses.
Calls with wait=False will just return what we already have received.
Repeated calls with wait=True until None is returned will fetch all responses.

Async commands now actually could have non-exception non-None results, but
this is not used yet. None responses are still dropped.

The motivation for this is to have a clear separation between a request
blowing up because it (itself) failed and failures unrelated to that request /
to that line in the sourcecode.

also: fix processing for async repo obj deletes

exception_ignored is a special object used that is "not None" (as None is used to signal
"finished with processing async results") but also not a potential async response result value.

Also:

added wait=True to chunk_decref() and add_chunk()

this makes async processing explicit - the default is synchronous and you only
need to be careful and do extra steps for async processing if you explicitly
request async by calling with wait=False (usually for speed reasons).

to process async results, use async_response, see above.
2017-03-26 17:33:19 +02:00
Marian Beermann 7923088ff9 check: pick better insufficent archives matched warning from TW's merge 2017-01-12 17:04:51 +01:00
Marian Beermann ecad0ed53a Merge branch '1.0-maint' into merge/1.0-maint
# Conflicts: ... everywhere ...
#	.travis.yml
#	Vagrantfile
#	borg/testsuite/key.py
#	docs/changes.rst
#	docs/quickstart.rst
#	docs/usage.rst
#	docs/usage/upgrade.rst.inc
#	src/borg/archive.py
#	src/borg/archiver.py
#	src/borg/crypto.pyx
#	src/borg/helpers.py
#	src/borg/key.py
#	src/borg/remote.py
#	src/borg/repository.py
#	src/borg/testsuite/archive.py
#	src/borg/testsuite/archiver.py
#	src/borg/testsuite/crypto.py
#	src/borg/testsuite/helpers.py
#	src/borg/testsuite/repository.py
#	src/borg/upgrader.py
#	tox.ini
2017-01-12 15:01:41 +01:00
Marian Beermann b3707f7175 Replace backup_io with a singleton
This is some 15 times faster than @contextmanager, because no instance
creation is involved and no generator has to be maintained. Overall
difference is low, but still nice for a very simple change.
2016-12-03 11:52:48 +01:00
Thomas Waldmann c8922c8b3d use ArchiveItem 2016-08-15 01:11:33 +02:00
Thomas Waldmann 42b6a838da fix cyclic import issue, fix tests
needed to increase ChunkBuffer size due to increased items stream chunk size
to get the test working.
2016-08-14 15:26:56 +02:00
Thomas Waldmann 770a892b2d implement borg info REPO
currently it is just the same global stats also shown in "borg info ARCHIVE",
just without the archive-specific stats.

also: add separate test for "borg info".
2016-08-02 20:06:24 +02:00
Thomas Waldmann f363ddd7ca Merge branch '1.0-maint' 2016-07-04 20:11:21 +02:00
Thomas Waldmann 87d6755108 Merge branch '1.0-maint' 2016-06-29 18:28:33 +02:00
Thomas Waldmann 9a64835b4d Merge branch '1.0-maint'
Also: add missing keys to ARCHIVE_KEYS set.
2016-06-13 00:14:08 +02:00
Thomas Waldmann 60da32123a refactor to use Item class, fixes #1071 2016-06-04 17:24:55 +02:00
Thomas Waldmann 3ce35f6843 Merge branch 'master' into move-to-src 2016-05-21 19:06:01 +02:00
Thomas Waldmann d1ea925a5b move borg package to src/ 2016-05-05 20:19:50 +02:00
Renamed from borg/testsuite/archive.py (Browse further)