
18 Commits

Author SHA1 Message Date
Thomas Waldmann 0fcd3e9479
add_chunk: remove overwrite parameter 2023-09-23 00:10:35 +02:00
Thomas Waldmann b0b32e35f5
tests: avoid long ids in pytest output
sometimes the automatically computed IDs are just too long,
so rather give IDs directly or avoid them otherwise.
2023-05-18 05:46:33 +02:00
TW c29d4a096b
Hashindex header work, fixes #6960 (#7064)
support reading new, improved hashindex header format, fixes #6960

Bit of a pain to work with that code:
- C code
- needs to still be able to read the old hashindex file format,
- while also supporting the new file format.
- the hash computed while reading the file causes additional problems because
  it expects every place in the file to be read exactly once and in sequential order.
  I solved this by separately opening the file in the python part of the code and
  checking for the magic.
  BORG_IDX means the legacy file format and legacy layout of the hashtable,
  BORG2IDX means the new file format and the new layout of the hashtable.

Done:
- added a version int32 directly after the magic and set it to 2 (like borg 2).
  the old header had no version info, but could be denoted as version 1 in case
  we ever need it (currently it decides based on the magic).
- added num_empty as indicated by a TODO in count_empty, so it does not need a
  full hashtable scan to determine the amount of empty buckets.
- to keep it simpler, I just filled the HashHeader struct with a
  `char reserved[1024 - 32];`
  1024 being the desired overall header size and 32 being the currently used size.
  this alignment might be useful in case we mmap() the hashindex file one day.
2022-10-02 14:35:21 +02:00
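
The magic-based format detection described in commit c29d4a096b might look roughly like the following Python sketch. The two magic values, the int32 version field directly after the magic, and the 1024-byte header size come from the commit message. The function name `detect_hashindex_version`, the constant names, and the little-endian byte order are assumptions made for illustration, not the actual borg code.

```python
import io
import struct

MAGIC_LEGACY = b"BORG_IDX"  # legacy file format and legacy hashtable layout
MAGIC_V2 = b"BORG2IDX"      # new file format and new hashtable layout
HEADER_SIZE_V2 = 1024       # desired overall size of the new header


def detect_hashindex_version(f):
    """Return the header version of an open binary file (hypothetical helper).

    The legacy header carries no version field, so it is reported as 1.
    """
    magic = f.read(len(MAGIC_V2))
    if magic == MAGIC_LEGACY:
        return 1
    if magic == MAGIC_V2:
        raw = f.read(4)
        if len(raw) != 4:
            raise ValueError("truncated hashindex header")
        # Assumption: the int32 version is stored little-endian.
        (version,) = struct.unpack("<i", raw)
        return version
    raise ValueError(f"unknown hashindex magic: {magic!r}")


# Usage: build fake headers in memory instead of touching real index files.
new_header = MAGIC_V2 + struct.pack("<i", 2) + bytes(HEADER_SIZE_V2 - 12)
old_header = MAGIC_LEGACY + bytes(10)
new_version = detect_hashindex_version(io.BytesIO(new_header))
old_version = detect_hashindex_version(io.BytesIO(old_header))
```

Checking the magic on a separately opened file handle, as the commit describes, keeps the sequential hash-while-reading logic of the main reader untouched.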
Thomas Waldmann 1e156ca02b fix upgrader 2022-09-07 19:23:11 +02:00
Thomas Waldmann fa986a9f19 repoobj: add a layer to format/parse repo objects
borg < 2:

obj = encrypted(compressed(data))

borg 2:

obj = enc_meta_len32 + encrypted(msgpacked(meta)) + encrypted(compressed(data))

handle compr / decompr in repoobj

move the assert_id call from decrypt to RepoObj.parse

also:
- for AEADKeyBase, add a dummy assert_id (not needed here)
- only test assert_id for other if not AEADKeyBase instance
- remove test_getting_wrong_chunk. assert_id is called elsewhere
  and is not needed any more anyway with the new AEAD crypto.
- only give manifest (includes key, repo, repo_objs)
- only return manifest from Manifest.load (includes key, repo, repo_objs)
2022-09-04 00:49:38 +02:00
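
The borg 2 object layout from commit fa986a9f19 (`enc_meta_len32` followed by the encrypted metadata and the encrypted data) can be illustrated with a small framing sketch. Encryption, compression, and msgpack are left out here: both parts are treated as opaque bytes. The names `format_obj` and `split_obj` and the unsigned little-endian encoding of the 32-bit length are assumptions for illustration, not the `RepoObj` API.

```python
import struct

# Assumption: the 32-bit metadata length is unsigned and little-endian.
META_LEN = struct.Struct("<I")


def format_obj(meta_encrypted: bytes, data_encrypted: bytes) -> bytes:
    """Frame an object as: meta length (32 bit) + encrypted meta + encrypted data."""
    return META_LEN.pack(len(meta_encrypted)) + meta_encrypted + data_encrypted


def split_obj(obj: bytes):
    """Split a framed object back into (encrypted meta, encrypted data)."""
    if len(obj) < META_LEN.size:
        raise ValueError("object too short for the length prefix")
    (meta_len,) = META_LEN.unpack_from(obj)
    meta_end = META_LEN.size + meta_len
    if meta_end > len(obj):
        raise ValueError("metadata length exceeds object size")
    return obj[META_LEN.size:meta_end], obj[meta_end:]


# Usage: round-trip two opaque byte strings standing in for ciphertexts.
framed = format_obj(b"meta", b"data!")
meta_part, data_part = split_obj(framed)
```

Because the length prefix only covers the metadata, a reader can decrypt and inspect the metadata without touching the (usually much larger) data part.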
Thomas Waldmann 9beaced33c move manifest module from helpers to borg.manifest 2022-08-13 21:55:12 +02:00
Thomas Waldmann 7957af562d blacken all the code
https://black.readthedocs.io/
2022-07-06 16:34:38 +02:00
Thomas Waldmann ef24dafb15 tests: use less RepoKey/KeyfileKey 2022-06-30 20:52:48 +02:00
Thomas Waldmann 2c1f7951c4 remove csize from ChunkIndexEntry 2022-06-12 17:15:13 +02:00
Thomas Waldmann ace5957524 remove csize from item.chunks elements 2022-06-12 15:48:33 +02:00
Thomas Waldmann 2211b840a3 verbose files cache logging via --debug-topic=files_cache, fixes #5659 2021-02-28 22:39:44 +01:00
Thomas Waldmann dc2a57af47 use pytest.fixture instead of yield_fixture, fixes #5575
/vagrant/borg/borg/.tox/py36-none/lib/python3.6/site-packages/borg/testsuite/remote.py:73:
    PytestDeprecationWarning: @pytest.yield_fixture is deprecated.
Use @pytest.fixture instead; they are the same.
Docs: https://docs.pytest.org/en/stable/warnings.html
2020-12-20 00:11:04 +01:00
Thomas Waldmann da6d1ac538 support msgpack 1.0.0, fixes #5065
our data structures need strict_map_key=False, which is not the
default of msgpack 1.0.0. i made it default in our wrapper API.

call our wrapper for performance profile creation/conversion also
to avoid msgpack compat issues.

remove encoding from wrapper api, we do not use it any more.

remove raw is True check, we need false for profiles

strict_map_key is only supported for msgpack >= 1.0.0.
2020-04-04 22:04:45 +02:00
Thomas Waldmann b1e7e7f90a
cleanup: get rid of Cache.do_files, replace with cache_mode
not do_files == (cache_mode == 'd')  # d)isabled
2018-03-24 17:04:20 -07:00
Thomas Waldmann 4e0f369d0a fix borg create never showing M status
the problem was that the upper layer code did not have enough information
about the file, whether it is known or not - and thus, could not decide
correctly whether status should be M)odified or A)dded.

now, file_known_and_unchanged method returns an additional "known"
boolean to fix this.

also: add comment about files cache loading in cache_mode='r'
2018-02-26 11:07:20 +01:00
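
The fix in commit 4e0f369d0a hinges on the upper layer receiving both a "known" flag and an "unchanged" flag. A minimal sketch of the resulting decision is shown below. The function `file_status` is hypothetical, and the `"U"` status for unchanged files is an assumption not stated in the commit message, which only mentions M)odified and A)dded.

```python
def file_status(known: bool, unchanged: bool) -> str:
    """Pick a status letter from the two flags (hypothetical helper)."""
    if not known:
        # The files cache has never seen this file.
        return "A"  # A)dded
    # Assumption: "U" marks a known file that did not change.
    return "U" if unchanged else "M"  # M)odified


# Usage: without the "known" flag, the first two cases are indistinguishable.
statuses = [
    file_status(known=False, unchanged=False),
    file_status(known=True, unchanged=False),
    file_status(known=True, unchanged=True),
]
```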
Marian Beermann 5eeca3493b TestAdHocCache 2017-06-18 02:01:27 +02:00
Marian Beermann 5af66dbb12 cache sync: add more refcount tests 2017-06-03 15:02:27 +02:00
Marian Beermann c786a5941e CacheSynchronizer: redo as quasi FSM on top of unpack.h
This is a (relatively) simple state machine running in the
data callbacks invoked by the msgpack unpacking stack machine
(the same machine is used in msgpack-c and msgpack-python,
changes are minor and cosmetic, e.g. removal of msgpack_unpack_object,
removal of the C++ template thus porting to C and so on).

Compared to the previous solution this has multiple advantages
- msgpack-c dependency is removed
- this approach is faster and requires fewer and smaller
  memory allocations

Testability of the two solutions does not differ in my
professional opinion(tm).

Two other changes were rolled up; _hashindex.c can be compiled
without Python.h again (handy for fuzzing and testing);
a "small" bug in the cache sync was fixed which allocated too
large archive indices, leading to excessive archive.chunks.d
disk usage (that actually gave me an idea).
2017-06-02 17:43:15 +02:00