Commit Graph

57 Commits

Author SHA1 Message Date
Thomas Waldmann 9de07ebd46
update "modern" error RCs (docs and code) 2024-02-13 22:58:02 +01:00
Thomas Waldmann 1b6f928917
ro_type: typed repo objects, see #7670
writing: put type into repoobj metadata
reading: check wanted type against type we got

repoobj metadata is encrypted and authenticated.
repoobj data is encrypted and authenticated, also (separately).
encryption and decryption of both metadata and data get the
same "chunk ID" as AAD, so both are "bound" to that (same) ID.

a repo-side attacker can neither see cleartext metadata/data,
nor successfully tamper with it (AEAD decryption would fail).

also, a repo-side attacker could not replace a repoobj A with a
differently typed repoobj B without borg noticing:
- the metadata/data is cryptographically bound to its ID.
  authentication/decryption would fail on mismatch.
- the type check would fail.

thus, the problem (see CVEs in changelog) solved in borg 1 by the
manifest and archive TAMs is now already solved by the type check.
2023-09-24 20:10:50 +02:00
nain ffe237ce0c unify scanning and listing of segment dirs / segment files and apply good practices
+ os.scandir instead of os.listdir
  Improved speed and added flexibility with attributes (name, path, is_dir(), is_file())
+ use is_dir / is_file to make sure  we're reading only dirs / files respectively
+ Filtering to particular start, end index range built in
+ Move value bounds of segment (index) into constants module and use them instead

Resolves #7597

(forward patch from commits c9f35a16e9bf9e7073c486553177cef79ff1cb06^..edb5e749f512b7737b6933e13b7e61fefcd17bcb)
2023-05-27 07:54:32 -04:00
Thomas Waldmann 7f973a5b34
implement "fail" chunker for testing purposes
--chunker-params=fail,4096,rrrEErrrr means:
- cut chunks of 4096b fixed size (last chunk in a file can be less)
- read chunks 0, 1 and 2 successfully
- error at chunk 3 and 4 (simulated OSError(errno.EIO))
- read successfully again for the next 4 chunks

Chunks are counted inside the chunker instance, starting
from 0, always increasing while the same instance is used.

Read chunks as well as failed chunks count up by 1.
2023-02-13 17:15:45 +01:00
Thomas Waldmann b92f4aa487
remove --consider-part-files, related stats code, update docs
we now just treat that one .borg_part file we might have inside
checkpoint archives as a normal file.

people can recognize via the file name it is a partial file.

nobody cares for statistics of checkpoint files and the final
archive now does not contain any partial files any more, thus
no needs to maintain statistics about count and size of part
files.
2023-02-01 13:04:18 +01:00
Thomas Waldmann bf667170a7
ArchiveItem.cmdline list-of-str -> .command_line str, fixes #7246
Same change for .recreate_cmdline -> .recreate_command_line .

JSON output key "command_line":
borg 1.x: sys.argv [list of str]
borg 2: shlex.join(sys.argv) [str]
2023-01-20 00:19:00 +01:00
Thomas Waldmann 1672aee031
Item: symlinks: rename .source to .target, fixes #7245
Also, in JSON:
- rename "linktarget" to "target" for symlinks
- remove "source" for symlinks
2023-01-16 20:28:25 +01:00
Thomas Waldmann bab68a8d25 use py37+ datetime.isoformat / .fromisoformat
since python 3.7, .isoformat() is usable IF timespec != "auto"
is given ("auto" [default] would be as evil as before, sometimes
formatting with, sometimes without microseconds).

also since python 3.7, there is now .fromisoformat().
2022-08-11 21:18:56 +02:00
Thomas Waldmann a7bc1f88c1 constants: reorder stuff, add comments 2022-08-11 17:36:57 +02:00
Thomas Waldmann fb74fdb710 massively increase per archive metadata stream size limit, fixes #1473
implemented by introducing one level of indirection, the limit is now
very high, so it is not practically relevant any more.

we always use the indirection (storing the metadata stream chunk ids list not
directly into the archive item, but into some repo objects referenced by the new
ArchiveItem.item_ptrs list).

thus, the code behaves the same for all archive sizes.
2022-08-06 19:01:41 +02:00
Thomas Waldmann 7bc7f01342 remove remainders of attic legacy
we expect that everybody has upgraded to borg
using borg 1.2.x or older, thus we do not need
to care about attic repos any more in borg2.
2022-07-13 16:55:29 +02:00
Thomas Waldmann 7957af562d blacken all the code
https://black.readthedocs.io/
2022-07-06 16:34:38 +02:00
Andrey Bienkowski adb1c36215 black integration 2022-07-06 14:55:40 +02:00
Thomas Waldmann e0c64629d1 Merge branch 'master' into borg2
strange conflicts, automated patches seemed to not have applied correctly.
also had to fix some stuff manually, tests were failing.
2022-06-23 11:25:01 +02:00
Thomas Waldmann b9f9623a6d prepare to remove csize (set it to 0 for now) 2022-06-12 15:48:33 +02:00
Thomas Waldmann d4ee968b07 use borg 2.0 to refer to this, not 1.3
also, some type conversions are now done in update_internal once,
not in the decode methods of the classes in item.pyx.
2022-06-09 18:13:40 +02:00
Elmar Hoffmann c2317c4cce
make constants for files cache mode more clear (#6724)
* make constants for files cache mode more clear

Traditionally, DEFAULT_FILES_CACHE_MODE_UI and DEFAULT_FILES_CACHE_MODE
were - as the naming scheme implies - the same setting, one being the UI
representation as given to the --files-cache command line option and the
other being the same default value in the internal representation.

It happended that the actual value used in borg create always comes from
DEFAULT_FILES_CACHE_MODE_UI (because that does have the --files-cache
option) whereas for all other commands (that do not use the files cache) it
comes from DEFAULT_FILES_CACHE_MODE.

PR #5777 then abused this fact to implement the optimisation to skip loading
of the files cache in those other commands by changing the value of
DEFAULT_FILES_CACHE_MODE to disabled.
This however also changes the meaning of that variable and thus redesignates
it to something not matching the original naming anymore.

Anyone not aware of this change and the intention behind it looking at the
code would have a hard time figuring this out and be easily mislead.

This does away with the confusion making the code more maintainable by
renaming DEFAULT_FILES_CACHE_MODE to FILES_CACHE_MODE_DISABLED, making the
new intention of that internal default clear.

* make constant for files cache mode UI default match naming scheme
2022-05-30 14:01:19 +02:00
Thomas Waldmann e4a97ea8cc transfer: all hardlinks have chunks, maybe chunks_healty, hlid
Item.hlid: same id, same hardlink (xxh64 digest)
Item.hardlink_master: not used for new archives any more
Item.source: not used for hardlink slaves any more
2022-05-18 14:20:01 +02:00
Thomas Waldmann ed59159649 argon2 key: use chacha20-poly1305 instead of aes256-ctr + hmac-sha256, fixes #6601
so we can completely get rid of aes-ctr some day.
2022-04-16 11:52:33 +02:00
Thomas Waldmann 52f75d7722 repository: implement PUT2: header crc32, overall xxh64, fixes #1704
note: this required a slight increase of MAX_OBJECT_SIZE so that MAX_DATA_SIZE
      could stay the same as before.

For PUT2, compute the hash over the whole entry (header and content, excluding
hash and crc32 fields, because the crc32 computation includes the hash).

Also: refactor crc32 checks into function, use f-strings, structure _read in
a more logical sequential order.

write_put: avoid creating a large temporary bytes object

why use xxh64?
- fast even without hw acceleration
- borg depends on it already anyway
- stronger than crc32 and strong enough for this purpose
2022-04-09 18:58:47 +02:00
Andrey Andreyevich Bienkowski 56c27a99d0
Argon2 the second part: implement key encryption / decryption (#6469)
Argon2 the second part: implement encryption/decryption of argon2 keys

borg init --key-algorithm=argon2 (new default, older pbkdf2 also still available)

borg key change-passphrase: keep key algorithm the same
borg key change-location: keep key algorithm the same

use env var BORG_TESTONLY_WEAKEN_KDF=1 to resource limit (cpu, memory, ...) the kdf when running the automated tests.
2022-04-07 16:22:34 +02:00
Thomas Waldmann 0f6f278b0f crypto: AEAD key classes
also:

cleanup class structure: less inheritance, more mixins.

define type bytes using the 4:4 split

upper 4 bits are ciphersuite:
0 == legacy AES-CTR based stuff
1+ == new AEAD stuff

lower 4 bits are keytype:
legacy: a bit mixed up, as it was...
new stuff: 0=keyfile 1=repokey, ...
2022-03-26 17:05:35 +01:00
Thomas Waldmann a63614e35b move key type/storage constants to borg.constants 2022-03-11 23:05:32 +01:00
Thomas Waldmann 02a9db50d2 do not load files cache for commands not using it, fixes #5673 2021-04-19 22:40:21 +02:00
Thomas Waldmann be257728ca move zeros to constants module 2021-01-14 20:02:18 +01:00
Thomas Waldmann 8c299696aa Chunker: yield Chunk namedtuple instead of bytes/memoryview 2021-01-08 01:10:44 +01:00
Phil Kulin c0504c0669 create: implement --stdin-mode, --stdin-user and --stdin-group, #5333 2020-11-01 20:45:56 +03:00
Thomas Waldmann e0545e921d exit with 128 + signal number, fixes #5161
as documented:

https://borgbackup.readthedocs.io/en/stable/usage/general.html#return-codes

compatibility warning: in case you have scripts expecting rc == 2 for a
signal exit, you need to update them to check for >= 128.
2020-06-30 00:17:10 +02:00
Thomas Waldmann e569595974 include size/csize/nfiles[_parts] stats into archive, fixes #3241 2019-02-23 15:05:07 +01:00
Thomas Waldmann 58f177aa82 add comment about unused recreate_* members in ArchiveItem 2019-02-23 10:49:24 +01:00
Thomas Waldmann fc30a0765b remove ARCHIVE_KEYS duplication
also: get key set in sync, obviously we have "recreate_partial_chunks"
in ArchiveItem still.
2019-02-23 10:09:40 +01:00
Thomas Waldmann ac0803fe0b chunker algorithms: use constants to avoid typos 2019-02-13 04:36:09 +01:00
Thomas Waldmann c4ffbd2a17 prepare to support multiple chunkers 2019-02-13 04:24:14 +01:00
Thomas Waldmann 5b5546d7e9 avoid stale filehandle issues, fixes #3265 2018-06-24 01:29:15 +02:00
Thomas Waldmann 0e0e6da585 make sure all segment file offsets fit into uint32, fixes #3592
C code and the repo index use uint32 type for segment file offsets,
so when opening a repo and the config max_segment_size is too big,
fail early.

Also disallow setting a too big value via "borg config".
2018-03-05 17:50:53 +01:00
Sam H b0141c1dc9 include item birthtime in archive (where available) (#3313)
include item birthtime in archive, fixes #3272

* use `safe_ns` when reading birthtime into attributes
* proper order for `birthtime` in `ITEM_KEYS` list
* use `bigint` wrapper for consistency
* Add tests to verify that birthtime is normally preserved, but not preserved when `--nobirthtime` is passed to `borg create`.
2017-11-13 14:55:10 +01:00
Thomas Waldmann 5e2de8ba67 implement files cache mode control, fixes #911
You can now control the files cache mode using this option:

--files-cache={ctime,mtime,size,inode,rechunk,disabled}*

(only some combinations are supported)

Previously, only these modes were supported:
- mtime,size,inode (default of borg < 1.1.0rc4)
- mtime,size (by using --ignore-inode)
- disabled (by using --no-files-cache)

Now, you additionally get:
- ctime alternatively to mtime (more safe), e.g.:
  ctime,size,inode (this is the new default of borg >= 1.1.0rc4)
- rechunk (consider all files as changed, rechunk them)

Deprecated:
- --ignore-inodes (use modes without "inode")
- --no-files-cache (use "disabled" mode)

The tests needed some changes:
- previously, we use os.utime() to set a files mtime (atime) to specific
  values, but that does not work for ctime.
- now use time.sleep() to create the "latest file" that usually does
  not end up in the files cache (see FAQ)
2017-10-01 00:52:32 +02:00
Thomas Waldmann 1cb158a4b5 move ISO_FORMAT to constants module
also: add ISO_FORMAT_NO_USECS
2017-09-05 04:44:38 +02:00
Thomas Waldmann 5cc2b900ee constants: avoid comparing constants
lgtm:
Comparison of constants is always constant, but is harder to read than
a simple constant.
2017-07-23 02:00:55 +02:00
Thomas Waldmann 89f3cab6cd move get_limited_unpacker to helpers
also: move some constants to borg.constants
2017-06-25 23:36:28 +02:00
Thomas Waldmann 6c2c51939d Manifest: use limited unpacker 2017-06-25 23:36:28 +02:00
Andrea Gelmini e4247cc0d2 Fix typos 2017-06-09 16:49:30 +02:00
Thomas Waldmann ffcf6b76b6 DEFAULT_SEGMENTS_PER_DIR = 1000
prettier increments for the directory names.
2017-06-03 21:54:41 +02:00
Marian Beermann 8e24ddae06 increase DEFAULT_SEGMENTS_PER_DIR to 2000 2017-05-22 12:31:52 +02:00
Thomas Waldmann a52b54dc3c archived file items: add size metadata
if an item has a chunk list, pre-compute the total size and store it into "size" metadata entry.

this speeds up access to item size (e.g. for regular files) and could also be used to verify the validity of the chunks list.

note about hardlinks: size is only stored for hardlink masters (only they have an own chunk list)
2017-02-23 21:24:37 +01:00
Marian Beermann b0e4f13fba set MAX_DATA_SIZE = 20971479 bytes in solid stone 2017-02-23 00:34:40 +01:00
Marian Beermann 69f7810658 info: show utilization of maximum archive size
See #1452

This is 100 % accurate.

Also increases maximum data size by ~41 bytes. Not 100 % side-effect free;
if you manage to exactly land in that area then older Borg would not read
it. OTOH it gives us a nice round number there.
2017-02-22 23:47:21 +01:00
Marian Beermann bd4a0fe23b
Clarify cache/repository README file 2016-11-11 21:24:17 +01:00
Thomas Waldmann c8922c8b3d use ArchiveItem 2016-08-15 01:11:33 +02:00
Thomas Waldmann 8be6761c26 Merge commit 'feb7e2517ef7ec07cc638953a86c726aada7d37e' 2016-08-14 15:05:11 +02:00