mirror of https://github.com/borgbackup/borg.git
Merge pull request #6992 from ThomasWaldmann/separate-encrypted-metadata
Separate encrypted metadata
commit 36e852457a
@@ -77,7 +77,7 @@ don't have a particular meaning (except for the Manifest_).

 Normally the keys are computed like this::

-    key = id = id_hash(unencrypted_data)
+    key = id = id_hash(plaintext_data)  # plain = not encrypted, not compressed, not obfuscated

 The id_hash function depends on the :ref:`encryption mode <borg_rcreate>`.
@@ -98,15 +98,15 @@ followed by a number of log entries. Each log entry consists of (in this order):

 * crc32 checksum (uint32):

   - for PUT2: CRC32(size + tag + key + digest)
-  - for PUT: CRC32(size + tag + key + data)
+  - for PUT: CRC32(size + tag + key + payload)
   - for DELETE: CRC32(size + tag + key)
   - for COMMIT: CRC32(size + tag)

 * size (uint32) of the entry (including the whole header)
 * tag (uint8): PUT(0), DELETE(1), COMMIT(2) or PUT2(3)
 * key (256 bit) - only for PUT/PUT2/DELETE
-* data (size - 41 bytes) - only for PUT
-* xxh64 digest (64 bit) = XXH64(size + tag + key + data) - only for PUT2
-* data (size - 41 - 8 bytes) - only for PUT2
+* payload (size - 41 bytes) - only for PUT
+* xxh64 digest (64 bit) = XXH64(size + tag + key + payload) - only for PUT2
+* payload (size - 41 - 8 bytes) - only for PUT2

 PUT2 is new since repository version 2. For new log entries PUT2 is used.
 PUT is still supported to read version 1 repositories, but not generated any more.
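To make the PUT2 layout above concrete, here is a minimal sketch (not borg's
actual implementation) that reads one PUT2 entry from a segment file. The
struct layout, endianness and digest byte order are illustrative assumptions;
it uses the third-party ``xxhash`` package::

    import struct
    import zlib

    import xxhash  # third-party package, assumed available

    TAG_PUT2 = 3

    def read_put2(fd):
        """Read one PUT2 log entry; return (key, payload) or raise on corruption."""
        crc, size, tag = struct.unpack("<IIB", fd.read(9))  # crc32, size, tag
        assert tag == TAG_PUT2, "sketch only handles PUT2 entries"
        key = fd.read(32)                 # 256 bit object id
        digest = fd.read(8)               # xxh64 over size + tag + key + payload
        payload = fd.read(size - 41 - 8)  # rest of the entry
        hashed = struct.pack("<I", size) + bytes([tag]) + key + payload
        if xxhash.xxh64(hashed).digest() != digest:
            raise ValueError("xxh64 digest mismatch")
        if zlib.crc32(struct.pack("<I", size) + bytes([tag]) + key + digest) != crc:
            raise ValueError("crc32 mismatch")
        return key, payload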
@@ -116,7 +116,7 @@ version 2+.

 Those files are strictly append-only and modified only once.

 When an object is written to the repository a ``PUT`` entry is written
-to the file containing the object id and data. If an object is deleted
+to the file containing the object id and payload. If an object is deleted
 a ``DELETE`` entry is appended with the object id.

 A ``COMMIT`` tag is written when a repository transaction is
@@ -130,13 +130,42 @@ partial/uncommitted transaction.

 The size of individual segments is limited to 4 GiB, since the offset of entries
 within segments is stored in a 32-bit unsigned integer in the repository index.

-Objects
-~~~~~~~
+Objects / Payload structure
+~~~~~~~~~~~~~~~~~~~~~~~~~~~

-All objects (the manifest, archives, archive item streams chunks and file data
-chunks) are encrypted and/or compressed. See :ref:`data-encryption` for a
-graphic outlining the anatomy of an object in Borg. The `type` for compression
-is explained in :ref:`data-compression`.
+All data (the manifest, archives, archive item stream chunks and file data
+chunks) is compressed, optionally obfuscated and encrypted. This produces some
+additional metadata (size and compression information), which is separately
+serialized and also encrypted.
+
+See :ref:`data-encryption` for a graphic outlining the anatomy of the encryption in Borg.
+What you see at the bottom there is done twice: once for the data and once for the metadata.
+
+An object (the payload part of a segment file log entry) must be like:
+
+- length of encrypted metadata (16bit unsigned int)
+- encrypted metadata (incl. encryption header), when decrypted:
+
+  - msgpacked dict with:
+
+    - ctype (compression type 0..255)
+    - clevel (compression level 0..255)
+    - csize (overall compressed (and maybe obfuscated) data size)
+    - psize (only when obfuscated: payload size without the obfuscation trailer)
+    - size (uncompressed size of the data)
+
+- encrypted data (incl. encryption header), when decrypted:
+
+  - compressed data (with an optional all-zero-bytes obfuscation trailer)
+
+This new, more complex repo v2 object format was implemented to be able to efficiently
+query the metadata without having to read, transfer and decrypt the (usually much bigger)
+data part.
+
+The metadata is encrypted to not disclose potentially sensitive information that could be
+used for e.g. fingerprinting attacks.
+
+The compression `ctype` and `clevel` is explained in :ref:`data-compression`.

 Index, hints and integrity
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
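As a rough illustration of this envelope, splitting and decoding an object
could look like the sketch below (borg's real implementation lives elsewhere
in this commit; the ``decrypt(id, blob)`` callable and the 16bit length being
little-endian are assumptions)::

    import struct

    import msgpack  # assumed importable, as borg itself uses msgpack

    def parse_object(id, cdata, decrypt):
        """Split a repo v2 object into (meta dict, decrypted data) - sketch only."""
        meta_len = struct.unpack("<H", cdata[:2])[0]   # 16bit unsigned int
        meta_enc = cdata[2:2 + meta_len]               # encrypted metadata
        data_enc = cdata[2 + meta_len:]                # encrypted data
        meta = msgpack.unpackb(decrypt(id, meta_enc))  # ctype/clevel/csize/size
        data = decrypt(id, data_enc)                   # still compressed here
        return meta, data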
@@ -855,7 +884,7 @@ For each borg invocation, a new sessionkey is derived from the borg key material
 and the 48bit IV starts from 0 again (both ciphers internally add a 32bit counter
 to our IV, so we'll just count up by 1 per chunk).

-The chunk layout is best seen at the bottom of this diagram:
+The encryption layout is best seen at the bottom of this diagram:

 .. figure:: encryption-aead.png
    :figwidth: 100%
@@ -954,14 +983,14 @@ representation of the repository id.
 Compression
 -----------

-Borg supports the following compression methods, each identified by a type
-byte:
+Borg supports the following compression methods, each identified by a ctype value
+in the range between 0 and 255 (and augmented by a clevel 0..255 value for the
+compression level):

 - none (no compression, pass through data 1:1), identified by 0x00
 - lz4 (low compression, but super fast), identified by 0x01
 - zstd (level 1-22 offering a wide range: level 1 is lower compression and high
-  speed, level 22 is higher compression and lower speed) - since borg 1.1.4,
-  identified by 0x03
+  speed, level 22 is higher compression and lower speed) - identified by 0x03
 - zlib (level 0-9, level 0 is no compression [but still adding zlib overhead],
   level 1 is low, level 9 is high compression), identified by 0x05
 - lzma (level 0-9, level 0 is low, level 9 is high compression), identified
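A hedged sketch of how such a ctype value could be dispatched to a
decompressor (only the ids named above are used; the mapping style is
illustrative, not borg's actual classes)::

    import zlib

    def _needs(pkg):
        def _stub(data):
            raise NotImplementedError(f"install {pkg} to handle this ctype")
        return _stub

    DECOMPRESSORS = {
        0x00: bytes,                # none: pass data through 1:1
        0x01: _needs("lz4"),        # lz4 needs a third-party package
        0x03: _needs("zstandard"),  # zstd needs a third-party package
        0x05: zlib.decompress,      # zlib
    }

    def decompress(meta, data):
        return DECOMPRESSORS[meta["ctype"]](data)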
|
Binary file not shown.
Binary file not shown.
Before Width: | Height: | Size: 136 KiB After Width: | Height: | Size: 146 KiB |
|
@@ -48,6 +48,7 @@ from .item import Item, ArchiveItem, ItemDiff
 from .platform import acl_get, acl_set, set_flags, get_flags, swidth, hostname
 from .remote import cache_if_remote
 from .repository import Repository, LIST_SCAN_LIMIT
+from .repoobj import RepoObj

 has_link = hasattr(os, "link")

@@ -262,9 +263,9 @@ def OsOpen(*, flags, path=None, parent_fd=None, name=None, noatime=False, op="op


 class DownloadPipeline:
-    def __init__(self, repository, key):
+    def __init__(self, repository, repo_objs):
         self.repository = repository
-        self.key = key
+        self.repo_objs = repo_objs

     def unpack_many(self, ids, *, filter=None, preload=False):
         """

@@ -308,8 +309,9 @@ class DownloadPipeline:
             yield item

     def fetch_many(self, ids, is_preloaded=False):
-        for id_, data in zip(ids, self.repository.get_many(ids, is_preloaded=is_preloaded)):
-            yield self.key.decrypt(id_, data)
+        for id_, cdata in zip(ids, self.repository.get_many(ids, is_preloaded=is_preloaded)):
+            _, data = self.repo_objs.parse(id_, cdata)
+            yield data


 class ChunkBuffer:
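The recurring pattern in this diff - key.decrypt(id, data) becoming
repo_objs.parse(id, cdata), and key.encrypt(id, data) becoming
repo_objs.format(id, meta, data) - suggests a facade roughly like the sketch
below. Signatures are inferred from the call sites in this commit, not copied
from borg::

    import struct

    import msgpack  # assumed importable, as borg itself uses msgpack

    class RepoObjSketch:
        """Facade inferred from call sites in this diff - not borg's real class."""

        def __init__(self, key):
            self.key = key  # provides encrypt() / decrypt() / id_hash()

        def id_hash(self, data):
            return self.key.id_hash(data)

        def format(self, id, meta, data, compress=True):
            # compression elided; the real code also fills ctype/clevel/csize
            meta_enc = self.key.encrypt(id, msgpack.packb(meta))
            data_enc = self.key.encrypt(id, data)
            return struct.pack("<H", len(meta_enc)) + meta_enc + data_enc

        def parse(self, id, cdata, decompress=True):
            meta_len = struct.unpack("<H", cdata[:2])[0]
            meta = msgpack.unpackb(self.key.decrypt(id, cdata[2:2 + meta_len]))
            data = self.key.decrypt(id, cdata[2 + meta_len:])
            return meta, data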
@@ -368,7 +370,7 @@ class CacheChunkBuffer(ChunkBuffer):
         self.stats = stats

     def write_chunk(self, chunk):
-        id_, _ = self.cache.add_chunk(self.key.id_hash(chunk), chunk, self.stats, wait=False)
+        id_, _ = self.cache.add_chunk(self.key.id_hash(chunk), {}, chunk, stats=self.stats, wait=False)
         self.cache.repository.async_response(wait=False)
         return id_

@@ -391,12 +393,12 @@ def get_item_uid_gid(item, *, numeric, uid_forced=None, gid_forced=None, uid_def
     return uid, gid


-def archive_get_items(metadata, key, repository):
+def archive_get_items(metadata, *, repo_objs, repository):
     if "item_ptrs" in metadata: # looks like a v2+ archive
         assert "items" not in metadata
         items = []
-        for id, data in zip(metadata.item_ptrs, repository.get_many(metadata.item_ptrs)):
-            data = key.decrypt(id, data)
+        for id, cdata in zip(metadata.item_ptrs, repository.get_many(metadata.item_ptrs)):
+            _, data = repo_objs.parse(id, cdata)
             ids = msgpack.unpackb(data)
             items.extend(ids)
         return items

@@ -406,16 +408,16 @@ def archive_get_items(metadata, key, repository):
     return metadata.items


-def archive_put_items(chunk_ids, *, key, cache=None, stats=None, add_reference=None):
+def archive_put_items(chunk_ids, *, repo_objs, cache=None, stats=None, add_reference=None):
     """gets a (potentially large) list of archive metadata stream chunk ids and writes them to repo objects"""
     item_ptrs = []
     for i in range(0, len(chunk_ids), IDS_PER_CHUNK):
         data = msgpack.packb(chunk_ids[i : i + IDS_PER_CHUNK])
-        id = key.id_hash(data)
+        id = repo_objs.id_hash(data)
         if cache is not None and stats is not None:
-            cache.add_chunk(id, data, stats)
+            cache.add_chunk(id, {}, data, stats=stats)
         elif add_reference is not None:
-            cdata = key.encrypt(id, data)
+            cdata = repo_objs.format(id, {}, data)
             add_reference(id, len(data), cdata)
         else:
             raise NotImplementedError

@@ -435,8 +437,6 @@ class Archive:

     def __init__(
         self,
-        repository,
-        key,
         manifest,
         name,
         cache=None,

@@ -458,10 +458,12 @@ class Archive:
         iec=False,
     ):
         self.cwd = os.getcwd()
-        self.key = key
-        self.repository = repository
-        self.cache = cache
+        assert isinstance(manifest, Manifest)
         self.manifest = manifest
+        self.key = manifest.repo_objs.key
+        self.repo_objs = manifest.repo_objs
+        self.repository = manifest.repository
+        self.cache = cache
         self.stats = Statistics(output_json=log_json, iec=iec)
         self.iec = iec
         self.show_progress = progress

@@ -488,7 +490,7 @@ class Archive:
         end = datetime.now().astimezone() # local time with local timezone
         self.end = end
         self.consider_part_files = consider_part_files
-        self.pipeline = DownloadPipeline(self.repository, self.key)
+        self.pipeline = DownloadPipeline(self.repository, self.repo_objs)
         self.create = create
         if self.create:
             self.items_buffer = CacheChunkBuffer(self.cache, self.key, self.stats)

@@ -507,12 +509,13 @@ class Archive:
         self.load(info.id)

     def _load_meta(self, id):
-        data = self.key.decrypt(id, self.repository.get(id))
+        cdata = self.repository.get(id)
+        _, data = self.repo_objs.parse(id, cdata)
         metadata = ArchiveItem(internal_dict=msgpack.unpackb(data))
         if metadata.version not in (1, 2): # legacy: still need to read v1 archives
             raise Exception("Unknown archive metadata version")
         # note: metadata.items must not get written to disk!
-        metadata.items = archive_get_items(metadata, self.key, self.repository)
+        metadata.items = archive_get_items(metadata, repo_objs=self.repo_objs, repository=self.repository)
         return metadata

     def load(self, id):

@@ -626,7 +629,9 @@ Duration: {0.duration}
         if name in self.manifest.archives:
             raise self.AlreadyExists(name)
         self.items_buffer.flush(flush=True)
-        item_ptrs = archive_put_items(self.items_buffer.chunks, key=self.key, cache=self.cache, stats=self.stats)
+        item_ptrs = archive_put_items(
+            self.items_buffer.chunks, repo_objs=self.repo_objs, cache=self.cache, stats=self.stats
+        )
         duration = timedelta(seconds=time.monotonic() - self.start_monotonic)
         if timestamp is None:
             end = datetime.now().astimezone() # local time with local timezone

@@ -660,9 +665,9 @@ Duration: {0.duration}
         metadata.update(additional_metadata or {})
         metadata = ArchiveItem(metadata)
         data = self.key.pack_and_authenticate_metadata(metadata.as_dict(), context=b"archive")
-        self.id = self.key.id_hash(data)
+        self.id = self.repo_objs.id_hash(data)
         try:
-            self.cache.add_chunk(self.id, data, self.stats)
+            self.cache.add_chunk(self.id, {}, data, stats=self.stats)
         except IntegrityError as err:
             err_msg = str(err)
             # hack to avoid changing the RPC protocol by introducing new (more specific) exception class
@@ -699,7 +704,7 @@ Duration: {0.duration}
         for id, chunk in zip(self.metadata.items, self.repository.get_many(self.metadata.items)):
             pi.show(increase=1)
             add(id)
-            data = self.key.decrypt(id, chunk)
+            _, data = self.repo_objs.parse(id, chunk)
             sync.feed(data)
         unique_size = archive_index.stats_against(cache.chunks)[1]
         pi.finish()

@@ -962,7 +967,7 @@ Duration: {0.duration}
         del metadata.items
         data = msgpack.packb(metadata.as_dict())
         new_id = self.key.id_hash(data)
-        self.cache.add_chunk(new_id, data, self.stats)
+        self.cache.add_chunk(new_id, {}, data, stats=self.stats)
         self.manifest.archives[self.name] = (new_id, metadata.time)
         self.cache.chunk_decref(self.id, self.stats)
         self.id = new_id

@@ -1011,7 +1016,7 @@ Duration: {0.duration}
         for (i, (items_id, data)) in enumerate(zip(items_ids, self.repository.get_many(items_ids))):
             if progress:
                 pi.show(i)
-            data = self.key.decrypt(items_id, data)
+            _, data = self.repo_objs.parse(items_id, data)
             unpacker.feed(data)
             chunk_decref(items_id, stats)
             try:

@@ -1228,7 +1233,7 @@ class ChunksProcessor:

         def chunk_processor(chunk):
             chunk_id, data = cached_hash(chunk, self.key.id_hash)
-            chunk_entry = cache.add_chunk(chunk_id, data, stats, wait=False)
+            chunk_entry = cache.add_chunk(chunk_id, {}, data, stats=stats, wait=False)
             self.cache.repository.async_response(wait=False)
             return chunk_entry

@@ -1666,6 +1671,7 @@ class ArchiveChecker:
             logger.error("Repository contains no apparent data at all, cannot continue check/repair.")
             return False
         self.key = self.make_key(repository)
+        self.repo_objs = RepoObj(self.key)
         if verify_data:
             self.verify_data()
         if Manifest.MANIFEST_ID not in self.chunks:

@@ -1674,7 +1680,7 @@ class ArchiveChecker:
             self.manifest = self.rebuild_manifest()
         else:
             try:
-                self.manifest, _ = Manifest.load(repository, (Manifest.Operation.CHECK,), key=self.key)
+                self.manifest = Manifest.load(repository, (Manifest.Operation.CHECK,), key=self.key)
             except IntegrityErrorBase as exc:
                 logger.error("Repository manifest is corrupted: %s", exc)
                 self.error_found = True

@@ -1765,7 +1771,7 @@ class ArchiveChecker:
                 chunk_data_iter = self.repository.get_many(chunk_ids)
             else:
                 try:
-                    self.key.decrypt(chunk_id, encrypted_data, decompress=decompress)
+                    self.repo_objs.parse(chunk_id, encrypted_data, decompress=decompress)
                 except IntegrityErrorBase as integrity_error:
                     self.error_found = True
                     errors += 1

@@ -1796,7 +1802,7 @@ class ArchiveChecker:
                 # from the underlying media.
                 try:
                     encrypted_data = self.repository.get(defect_chunk)
-                    self.key.decrypt(defect_chunk, encrypted_data, decompress=decompress)
+                    self.repo_objs.parse(defect_chunk, encrypted_data, decompress=decompress)
                 except IntegrityErrorBase:
                     # failed twice -> get rid of this chunk
                     del self.chunks[defect_chunk]

@@ -1844,7 +1850,7 @@ class ArchiveChecker:
             pi.show()
             cdata = self.repository.get(chunk_id)
             try:
-                data = self.key.decrypt(chunk_id, cdata)
+                _, data = self.repo_objs.parse(chunk_id, cdata)
             except IntegrityErrorBase as exc:
                 logger.error("Skipping corrupted chunk: %s", exc)
                 self.error_found = True

@@ -1890,7 +1896,7 @@ class ArchiveChecker:

         def add_callback(chunk):
             id_ = self.key.id_hash(chunk)
-            cdata = self.key.encrypt(id_, chunk)
+            cdata = self.repo_objs.format(id_, {}, chunk)
             add_reference(id_, len(chunk), cdata)
             return id_

@@ -1913,7 +1919,7 @@ class ArchiveChecker:
         def replacement_chunk(size):
             chunk = Chunk(None, allocation=CH_ALLOC, size=size)
             chunk_id, data = cached_hash(chunk, self.key.id_hash)
-            cdata = self.key.encrypt(chunk_id, data)
+            cdata = self.repo_objs.format(chunk_id, {}, data)
             return chunk_id, size, cdata

         offset = 0

@@ -2032,7 +2038,7 @@ class ArchiveChecker:
                 return True, ""

             i = 0
-            archive_items = archive_get_items(archive, self.key, repository)
+            archive_items = archive_get_items(archive, repo_objs=self.repo_objs, repository=repository)
             for state, items in groupby(archive_items, missing_chunk_detector):
                 items = list(items)
                 if state % 2:

@@ -2044,7 +2050,7 @@ class ArchiveChecker:
                 unpacker.resync()
                 for chunk_id, cdata in zip(items, repository.get_many(items)):
                     try:
-                        data = self.key.decrypt(chunk_id, cdata)
+                        _, data = self.repo_objs.parse(chunk_id, cdata)
                         unpacker.feed(data)
                         for item in unpacker:
                             valid, reason = valid_item(item)

@@ -2057,7 +2063,7 @@ class ArchiveChecker:
                             i,
                         )
                     except IntegrityError as integrity_error:
-                        # key.decrypt() detected integrity issues.
+                        # repo_objs.parse() detected integrity issues.
                         # maybe the repo gave us a valid cdata, but not for the chunk_id we wanted.
                         # or the authentication of cdata failed, meaning the encrypted data was corrupted.
                         report(str(integrity_error), chunk_id, i)

@@ -2098,7 +2104,7 @@ class ArchiveChecker:
             mark_as_possibly_superseded(archive_id)
             cdata = self.repository.get(archive_id)
             try:
-                data = self.key.decrypt(archive_id, cdata)
+                _, data = self.repo_objs.parse(archive_id, cdata)
             except IntegrityError as integrity_error:
                 logger.error("Archive metadata block %s is corrupted: %s", bin_to_hex(archive_id), integrity_error)
                 self.error_found = True

@@ -2114,14 +2120,18 @@ class ArchiveChecker:
                 verify_file_chunks(info.name, item)
                 items_buffer.add(item)
             items_buffer.flush(flush=True)
-            for previous_item_id in archive_get_items(archive, self.key, self.repository):
+            for previous_item_id in archive_get_items(
+                archive, repo_objs=self.repo_objs, repository=self.repository
+            ):
                 mark_as_possibly_superseded(previous_item_id)
             for previous_item_ptr in archive.item_ptrs:
                 mark_as_possibly_superseded(previous_item_ptr)
-            archive.item_ptrs = archive_put_items(items_buffer.chunks, key=self.key, add_reference=add_reference)
+            archive.item_ptrs = archive_put_items(
+                items_buffer.chunks, repo_objs=self.repo_objs, add_reference=add_reference
+            )
             data = msgpack.packb(archive.as_dict())
             new_archive_id = self.key.id_hash(data)
-            cdata = self.key.encrypt(new_archive_id, data)
+            cdata = self.repo_objs.format(new_archive_id, {}, data)
             add_reference(new_archive_id, len(data), cdata)
             self.manifest.archives[info.name] = (new_archive_id, info.ts)
         pi.finish()
@@ -2162,9 +2172,7 @@ class ArchiveRecreater:

     def __init__(
         self,
-        repository,
         manifest,
-        key,
         cache,
         matcher,
         exclude_caches=False,

@@ -2181,9 +2189,10 @@ class ArchiveRecreater:
         timestamp=None,
         checkpoint_interval=1800,
     ):
-        self.repository = repository
-        self.key = key
         self.manifest = manifest
+        self.repository = manifest.repository
+        self.key = manifest.key
+        self.repo_objs = manifest.repo_objs
         self.cache = cache

         self.matcher = matcher

@@ -2260,12 +2269,16 @@ class ArchiveRecreater:
         overwrite = self.recompress
         if self.recompress and not self.always_recompress and chunk_id in self.cache.chunks:
             # Check if this chunk is already compressed the way we want it
-            old_chunk = self.key.decrypt(chunk_id, self.repository.get(chunk_id), decompress=False)
-            compressor_cls, level = Compressor.detect(old_chunk)
-            if compressor_cls.name == self.key.compressor.decide(data).name and level == self.key.compressor.level:
+            old_meta = self.repo_objs.parse_meta(chunk_id, self.repository.get(chunk_id, read_data=False))
+            compr_hdr = bytes((old_meta["ctype"], old_meta["clevel"]))
+            compressor_cls, level = Compressor.detect(compr_hdr)
+            if (
+                compressor_cls.name == self.repo_objs.compressor.decide({}, data).name
+                and level == self.repo_objs.compressor.level
+            ):
                 # Stored chunk has the same compression method and level as we wanted
                 overwrite = False
-        chunk_entry = self.cache.add_chunk(chunk_id, data, target.stats, overwrite=overwrite, wait=False)
+        chunk_entry = self.cache.add_chunk(chunk_id, {}, data, stats=target.stats, overwrite=overwrite, wait=False)
         self.cache.repository.async_response(wait=False)
         self.seen_chunks.add(chunk_entry.id)
         return chunk_entry

@@ -2371,8 +2384,6 @@ class ArchiveRecreater:

     def create_target_archive(self, name):
         target = Archive(
-            self.repository,
-            self.key,
             self.manifest,
             name,
             create=True,

@@ -2384,4 +2395,4 @@ class ArchiveRecreater:
         return target

     def open_archive(self, name, **kwargs):
-        return Archive(self.repository, self.key, self.manifest, name, cache=self.cache, **kwargs)
+        return Archive(self.manifest, name, cache=self.cache, **kwargs)
@@ -14,6 +14,7 @@ from ..manifest import Manifest, AI_HUMAN_SORT_KEYS
 from ..patterns import PatternMatcher
 from ..remote import RemoteRepository
 from ..repository import Repository
+from ..repoobj import RepoObj, RepoObj1
 from ..patterns import (
     ArgparsePatternAction,
     ArgparseExcludeFileAction,

@@ -80,7 +81,7 @@ def with_repository(
     :param create: create repository
     :param lock: lock repository
     :param exclusive: (bool) lock repository exclusively (for writing)
-    :param manifest: load manifest and key, pass them as keyword arguments
+    :param manifest: load manifest and repo_objs (key), pass them as keyword arguments
     :param cache: open cache, pass it as keyword argument (implies manifest)
     :param secure: do assert_secure after loading manifest
     :param compatibility: mandatory if not create and (manifest or cache), specifies mandatory feature categories to check

@@ -135,16 +136,16 @@ def with_repository(
                         "You can use 'borg transfer' to copy archives from old to new repos."
                     )
                 if manifest or cache:
-                    kwargs["manifest"], kwargs["key"] = Manifest.load(repository, compatibility)
+                    manifest_ = Manifest.load(repository, compatibility)
+                    kwargs["manifest"] = manifest_
                     if "compression" in args:
-                        kwargs["key"].compressor = args.compression.compressor
+                        manifest_.repo_objs.compressor = args.compression.compressor
                     if secure:
-                        assert_secure(repository, kwargs["manifest"], self.lock_wait)
+                        assert_secure(repository, manifest_, self.lock_wait)
                 if cache:
                     with Cache(
                         repository,
-                        kwargs["key"],
-                        kwargs["manifest"],
+                        manifest_,
                         progress=getattr(args, "progress", False),
                         lock_wait=self.lock_wait,
                         cache_mode=getattr(args, "files_cache_mode", FILES_CACHE_MODE_DISABLED),

@@ -160,7 +161,7 @@ def with_repository(
     return decorator


-def with_other_repository(manifest=False, key=False, cache=False, compatibility=None):
+def with_other_repository(manifest=False, cache=False, compatibility=None):
     """
     this is a simplified version of "with_repository", just for the "other location".

@@ -170,7 +171,7 @@ def with_other_repository(manifest=False, key=False, cache=False, compatibility=
     compatibility = compat_check(
         create=False,
         manifest=manifest,
-        key=key,
+        key=manifest,
         cache=cache,
         compatibility=compatibility,
         decorator_name="with_other_repository",

@@ -199,17 +200,16 @@ def with_other_repository(manifest=False, key=False, cache=False, compatibility=
             if repository.version not in (1, 2):
                 raise Error("This borg version only accepts version 1 or 2 repos for --other-repo.")
             kwargs["other_repository"] = repository
-            if manifest or key or cache:
-                manifest_, key_ = Manifest.load(repository, compatibility)
+            if manifest or cache:
+                manifest_ = Manifest.load(
+                    repository, compatibility, ro_cls=RepoObj if repository.version > 1 else RepoObj1
+                )
                 assert_secure(repository, manifest_, self.lock_wait)
                 if manifest:
                     kwargs["other_manifest"] = manifest_
-                if key:
-                    kwargs["other_key"] = key_
             if cache:
                 with Cache(
                     repository,
-                    key_,
                     manifest_,
                     progress=False,
                     lock_wait=self.lock_wait,

@@ -229,12 +229,10 @@ def with_other_repository(manifest=False, key=False, cache=False, compatibility=

 def with_archive(method):
     @functools.wraps(method)
-    def wrapper(self, args, repository, key, manifest, **kwargs):
+    def wrapper(self, args, repository, manifest, **kwargs):
         archive_name = getattr(args, "name", None)
         assert archive_name is not None
         archive = Archive(
-            repository,
-            key,
             manifest,
             archive_name,
             numeric_ids=getattr(args, "numeric_ids", False),

@@ -246,7 +244,7 @@ def with_archive(method):
             log_json=args.log_json,
             iec=args.iec,
         )
-        return method(self, args, repository=repository, manifest=manifest, key=key, archive=archive, **kwargs)
+        return method(self, args, repository=repository, manifest=manifest, archive=archive, **kwargs)

     return wrapper
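Taken together, these decorator changes establish a new calling convention for
command methods: they receive only ``manifest`` and derive the key and
repo_objs from it. A minimal hypothetical mixin method following that
convention (names are illustrative, not from this commit)::

    class ExampleMixIn:
        @with_repository(compatibility=(Manifest.Operation.READ,))
        def do_example(self, args, repository, manifest):
            key = manifest.key              # no longer a separate argument
            repo_objs = manifest.repo_objs  # parse()/format() for repo objects
            archive = Archive(manifest, args.name)
            ...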
@@ -109,9 +109,9 @@ class ConfigMixIn:
             name = args.name

         if args.cache:
-            manifest, key = Manifest.load(repository, (Manifest.Operation.WRITE,))
+            manifest = Manifest.load(repository, (Manifest.Operation.WRITE,))
             assert_secure(repository, manifest, self.lock_wait)
-            cache = Cache(repository, key, manifest, lock_wait=self.lock_wait)
+            cache = Cache(repository, manifest, lock_wait=self.lock_wait)

         try:
             if args.cache:

@@ -39,8 +39,9 @@ logger = create_logger()

 class CreateMixIn:
     @with_repository(exclusive=True, compatibility=(Manifest.Operation.WRITE,))
-    def do_create(self, args, repository, manifest=None, key=None):
+    def do_create(self, args, repository, manifest):
         """Create new archive"""
+        key = manifest.key
         matcher = PatternMatcher(fallback=True)
         matcher.add_inclexcl(args.patterns)

@@ -210,7 +211,6 @@ class CreateMixIn:
             if not dry_run:
                 with Cache(
                     repository,
-                    key,
                     manifest,
                     progress=args.progress,
                     lock_wait=self.lock_wait,

@@ -219,8 +219,6 @@ class CreateMixIn:
                     iec=args.iec,
                 ) as cache:
                     archive = Archive(
-                        repository,
-                        key,
                         manifest,
                         args.name,
                         cache=cache,

@@ -16,6 +16,7 @@ from ..helpers import positive_int_validator, NameSpec
 from ..manifest import Manifest
 from ..platform import get_process_id
 from ..repository import Repository, LIST_SCAN_LIMIT, TAG_PUT, TAG_DELETE, TAG_COMMIT
+from ..repoobj import RepoObj

 from ._common import with_repository
 from ._common import process_epilog

@@ -29,11 +30,12 @@ class DebugMixIn:
         return EXIT_SUCCESS

     @with_repository(compatibility=Manifest.NO_OPERATION_CHECK)
-    def do_debug_dump_archive_items(self, args, repository, manifest, key):
+    def do_debug_dump_archive_items(self, args, repository, manifest):
         """dump (decrypted, decompressed) archive items metadata (not: data)"""
-        archive = Archive(repository, key, manifest, args.name, consider_part_files=args.consider_part_files)
+        repo_objs = manifest.repo_objs
+        archive = Archive(manifest, args.name, consider_part_files=args.consider_part_files)
         for i, item_id in enumerate(archive.metadata.items):
-            data = key.decrypt(item_id, repository.get(item_id))
+            _, data = repo_objs.parse(item_id, repository.get(item_id))
             filename = "%06d_%s.items" % (i, bin_to_hex(item_id))
             print("Dumping", filename)
             with open(filename, "wb") as fd:

@@ -42,8 +44,9 @@ class DebugMixIn:
         return EXIT_SUCCESS

     @with_repository(compatibility=Manifest.NO_OPERATION_CHECK)
-    def do_debug_dump_archive(self, args, repository, manifest, key):
+    def do_debug_dump_archive(self, args, repository, manifest):
         """dump decoded archive metadata (not: data)"""
+        repo_objs = manifest.repo_objs
         try:
             archive_meta_orig = manifest.archives.get_raw_dict()[args.name]
         except KeyError:

@@ -62,7 +65,7 @@ class DebugMixIn:
             fd.write(do_indent(prepare_dump_dict(archive_meta_orig)))
             fd.write(",\n")

-            data = key.decrypt(archive_meta_orig["id"], repository.get(archive_meta_orig["id"]))
+            _, data = repo_objs.parse(archive_meta_orig["id"], repository.get(archive_meta_orig["id"]))
             archive_org_dict = msgpack.unpackb(data, object_hook=StableDict)

             fd.write(' "_meta":\n')

@@ -74,10 +77,10 @@ class DebugMixIn:
             first = True
             items = []
             for chunk_id in archive_org_dict["item_ptrs"]:
-                data = key.decrypt(chunk_id, repository.get(chunk_id))
+                _, data = repo_objs.parse(chunk_id, repository.get(chunk_id))
                 items.extend(msgpack.unpackb(data))
             for item_id in items:
-                data = key.decrypt(item_id, repository.get(item_id))
+                _, data = repo_objs.parse(item_id, repository.get(item_id))
                 unpacker.feed(data)
                 for item in unpacker:
                     item = prepare_dump_dict(item)

@@ -95,10 +98,10 @@ class DebugMixIn:
         return EXIT_SUCCESS

     @with_repository(compatibility=Manifest.NO_OPERATION_CHECK)
-    def do_debug_dump_manifest(self, args, repository, manifest, key):
+    def do_debug_dump_manifest(self, args, repository, manifest):
         """dump decoded repository manifest"""
-
-        data = key.decrypt(manifest.MANIFEST_ID, repository.get(manifest.MANIFEST_ID))
+        repo_objs = manifest.repo_objs
+        _, data = repo_objs.parse(manifest.MANIFEST_ID, repository.get(manifest.MANIFEST_ID))

         meta = prepare_dump_dict(msgpack.unpackb(data, object_hook=StableDict))

@@ -113,9 +116,9 @@ class DebugMixIn:

         def decrypt_dump(i, id, cdata, tag=None, segment=None, offset=None):
             if cdata is not None:
-                data = key.decrypt(id, cdata)
+                _, data = repo_objs.parse(id, cdata)
             else:
-                data = b""
+                _, data = {}, b""
             tag_str = "" if tag is None else "_" + tag
             segment_str = "_" + str(segment) if segment is not None else ""
             offset_str = "_" + str(offset) if offset is not None else ""

@@ -132,6 +135,7 @@ class DebugMixIn:
         for id, cdata, tag, segment, offset in repository.scan_low_level():
             if tag == TAG_PUT:
                 key = key_factory(repository, cdata)
+                repo_objs = RepoObj(key)
                 break
         i = 0
         for id, cdata, tag, segment, offset in repository.scan_low_level(segment=args.segment, offset=args.offset):

@@ -147,6 +151,7 @@ class DebugMixIn:
         ids = repository.list(limit=1, marker=None)
         cdata = repository.get(ids[0])
         key = key_factory(repository, cdata)
+        repo_objs = RepoObj(key)
         marker = None
         i = 0
         while True:

@@ -195,6 +200,7 @@ class DebugMixIn:
         ids = repository.list(limit=1, marker=None)
         cdata = repository.get(ids[0])
         key = key_factory(repository, cdata)
+        repo_objs = RepoObj(key)

         marker = None
         last_data = b""

@@ -207,7 +213,7 @@ class DebugMixIn:
             marker = result[-1]
             for id in result:
                 cdata = repository.get(id)
-                data = key.decrypt(id, cdata)
+                _, data = repo_objs.parse(id, cdata)

                 # try to locate wanted sequence crossing the border of last_data and data
                 boundary_data = last_data[-(len(wanted) - 1) :] + data[: len(wanted) - 1]

@@ -284,7 +290,7 @@ class DebugMixIn:
         return EXIT_SUCCESS

     @with_repository(manifest=False, exclusive=True, cache=True, compatibility=Manifest.NO_OPERATION_CHECK)
-    def do_debug_refcount_obj(self, args, repository, manifest, key, cache):
+    def do_debug_refcount_obj(self, args, repository, manifest, cache):
         """display refcounts for the objects with the given IDs"""
         for hex_id in args.ids:
             try:
@@ -19,7 +19,7 @@ class DeleteMixIn:
         """Delete archives"""
         self.output_list = args.output_list
         dry_run = args.dry_run
-        manifest, key = Manifest.load(repository, (Manifest.Operation.DELETE,))
+        manifest = Manifest.load(repository, (Manifest.Operation.DELETE,))
         archive_names = tuple(x.name for x in manifest.archives.list_considering(args))
         if not archive_names:
             return self.exit_code

@@ -56,7 +56,7 @@ class DeleteMixIn:
             return self.exit_code

         stats = Statistics(iec=args.iec)
-        with Cache(repository, key, manifest, progress=args.progress, lock_wait=self.lock_wait, iec=args.iec) as cache:
+        with Cache(repository, manifest, progress=args.progress, lock_wait=self.lock_wait, iec=args.iec) as cache:

             def checkpoint_func():
                 manifest.write()

@@ -80,12 +80,7 @@ class DeleteMixIn:

             if not dry_run:
                 archive = Archive(
-                    repository,
-                    key,
-                    manifest,
-                    archive_name,
-                    cache=cache,
-                    consider_part_files=args.consider_part_files,
+                    manifest, archive_name, cache=cache, consider_part_files=args.consider_part_files
                 )
                 archive.delete(stats, progress=args.progress, forced=args.forced)
             checkpointed = self.maybe_checkpoint(

@@ -15,7 +15,7 @@ logger = create_logger()
 class DiffMixIn:
     @with_repository(compatibility=(Manifest.Operation.READ,))
     @with_archive
-    def do_diff(self, args, repository, manifest, key, archive):
+    def do_diff(self, args, repository, manifest, archive):
         """Diff contents of two archives"""

         def print_json_output(diff, path):

@@ -27,7 +27,7 @@ class DiffMixIn:
         print_output = print_json_output if args.json_lines else print_text_output

         archive1 = archive
-        archive2 = Archive(repository, key, manifest, args.other_name, consider_part_files=args.consider_part_files)
+        archive2 = Archive(manifest, args.other_name, consider_part_files=args.consider_part_files)

         can_compare_chunk_ids = (
             archive1.metadata.get("chunker_params", False) == archive2.metadata.get("chunker_params", True)

@@ -22,7 +22,7 @@ logger = create_logger()
 class ExtractMixIn:
     @with_repository(compatibility=(Manifest.Operation.READ,))
     @with_archive
-    def do_extract(self, args, repository, manifest, key, archive):
+    def do_extract(self, args, repository, manifest, archive):
         """Extract archive contents"""
         # be restrictive when restoring files, restore permissions later
         if sys.getfilesystemencoding() == "ascii":

@@ -16,7 +16,7 @@ logger = create_logger()

 class InfoMixIn:
     @with_repository(cache=True, compatibility=(Manifest.Operation.READ,))
-    def do_info(self, args, repository, manifest, key, cache):
+    def do_info(self, args, repository, manifest, cache):
         """Show archive details such as disk space used"""

         def format_cmdline(cmdline):

@@ -29,13 +29,7 @@ class InfoMixIn:

         for i, archive_name in enumerate(archive_names, 1):
             archive = Archive(
-                repository,
-                key,
-                manifest,
-                archive_name,
-                cache=cache,
-                consider_part_files=args.consider_part_files,
-                iec=args.iec,
+                manifest, archive_name, cache=cache, consider_part_files=args.consider_part_files, iec=args.iec
             )
             info = archive.info()
             if args.json:

@@ -17,8 +17,9 @@ logger = create_logger(__name__)

 class KeysMixIn:
     @with_repository(compatibility=(Manifest.Operation.CHECK,))
-    def do_change_passphrase(self, args, repository, manifest, key):
+    def do_change_passphrase(self, args, repository, manifest):
         """Change repository key file passphrase"""
+        key = manifest.key
         if not hasattr(key, "change_passphrase"):
             print("This repository is not encrypted, cannot change the passphrase.")
             return EXIT_ERROR

@@ -30,8 +31,9 @@ class KeysMixIn:
         return EXIT_SUCCESS

     @with_repository(exclusive=True, manifest=True, cache=True, compatibility=(Manifest.Operation.CHECK,))
-    def do_change_location(self, args, repository, manifest, key, cache):
+    def do_change_location(self, args, repository, manifest, cache):
         """Change repository key location"""
+        key = manifest.key
         if not hasattr(key, "change_passphrase"):
             print("This repository is not encrypted, cannot change the key location.")
             return EXIT_ERROR

@@ -71,6 +73,7 @@ class KeysMixIn:

         # rewrite the manifest with the new key, so that the key-type byte of the manifest changes
         manifest.key = key_new
+        manifest.repo_objs.key = key_new
         manifest.write()
         repository.commit(compact=False)

@@ -16,7 +16,7 @@ logger = create_logger()

 class ListMixIn:
     @with_repository(compatibility=(Manifest.Operation.READ,))
-    def do_list(self, args, repository, manifest, key):
+    def do_list(self, args, repository, manifest):
         """List archive contents"""
         matcher = build_matcher(args.patterns, args.paths)
         if args.format is not None:

@@ -27,9 +27,7 @@ class ListMixIn:
             format = "{mode} {user:6} {group:6} {size:8} {mtime} {path}{extra}{NL}"

         def _list_inner(cache):
-            archive = Archive(
-                repository, key, manifest, args.name, cache=cache, consider_part_files=args.consider_part_files
-            )
+            archive = Archive(manifest, args.name, cache=cache, consider_part_files=args.consider_part_files)

             formatter = ItemFormatter(archive, format, json_lines=args.json_lines)
             for item in archive.iter_items(lambda item: matcher.match(item.path)):

@@ -37,7 +35,7 @@ class ListMixIn:

         # Only load the cache if it will be used
         if ItemFormatter.format_needs_cache(format):
-            with Cache(repository, key, manifest, lock_wait=self.lock_wait) as cache:
+            with Cache(repository, manifest, lock_wait=self.lock_wait) as cache:
                 _list_inner(cache)
         else:
             _list_inner(cache=None)

@@ -31,11 +31,11 @@ class MountMixIn:
         return self._do_mount(args)

     @with_repository(compatibility=(Manifest.Operation.READ,))
-    def _do_mount(self, args, repository, manifest, key):
+    def _do_mount(self, args, repository, manifest):
         from ..fuse import FuseOperations

-        with cache_if_remote(repository, decrypted_cache=key) as cached_repo:
-            operations = FuseOperations(key, repository, manifest, args, cached_repo)
+        with cache_if_remote(repository, decrypted_cache=manifest.repo_objs) as cached_repo:
+            operations = FuseOperations(manifest, args, cached_repo)
             logger.info("Mounting filesystem")
             try:
                 operations.mount(args.mountpoint, args.options, args.foreground)

@@ -71,7 +71,7 @@ def prune_split(archives, rule, n, kept_because=None):

 class PruneMixIn:
     @with_repository(exclusive=True, compatibility=(Manifest.Operation.DELETE,))
-    def do_prune(self, args, repository, manifest, key):
+    def do_prune(self, args, repository, manifest):
         """Prune repository archives according to specified rules"""
         if not any(
             (args.secondly, args.minutely, args.hourly, args.daily, args.weekly, args.monthly, args.yearly, args.within)

@@ -119,7 +119,7 @@ class PruneMixIn:

         to_delete = (set(archives) | checkpoints) - (set(keep) | set(keep_checkpoints))
         stats = Statistics(iec=args.iec)
-        with Cache(repository, key, manifest, lock_wait=self.lock_wait, iec=args.iec) as cache:
+        with Cache(repository, manifest, lock_wait=self.lock_wait, iec=args.iec) as cache:

             def checkpoint_func():
                 manifest.write()

@@ -142,9 +142,7 @@ class PruneMixIn:
                 else:
                     archives_deleted += 1
                     log_message = "Pruning archive (%d/%d):" % (archives_deleted, to_delete_len)
-                    archive = Archive(
-                        repository, key, manifest, archive.name, cache, consider_part_files=args.consider_part_files
-                    )
+                    archive = Archive(manifest, archive.name, cache, consider_part_files=args.consider_part_files)
                     archive.delete(stats, forced=args.forced)
                     checkpointed = self.maybe_checkpoint(
                         checkpoint_func=checkpoint_func, checkpoint_interval=args.checkpoint_interval

@@ -16,9 +16,10 @@ logger = create_logger()

 class RCreateMixIn:
     @with_repository(create=True, exclusive=True, manifest=False)
-    @with_other_repository(key=True, compatibility=(Manifest.Operation.READ,))
-    def do_rcreate(self, args, repository, *, other_repository=None, other_key=None):
+    @with_other_repository(manifest=True, compatibility=(Manifest.Operation.READ,))
+    def do_rcreate(self, args, repository, *, other_repository=None, other_manifest=None):
         """Create a new, empty repository"""
+        other_key = other_manifest.key if other_manifest is not None else None
         path = args.location.canonical_path()
         logger.info('Initializing repository at "%s"' % path)
         if other_key is not None:

@@ -32,7 +33,7 @@ class RCreateMixIn:
         manifest.key = key
         manifest.write()
         repository.commit(compact=False)
-        with Cache(repository, key, manifest, warn_if_unencrypted=False):
+        with Cache(repository, manifest, warn_if_unencrypted=False):
             pass
         if key.tam_required:
             tam_file = tam_required_file(repository)

@@ -28,7 +28,7 @@ class RDeleteMixIn:
         location = repository._location.canonical_path()
         msg = []
         try:
-            manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
+            manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
             n_archives = len(manifest.archives)
             msg.append(
                 f"You requested to completely DELETE the following repository "
@@ -17,7 +17,7 @@ logger = create_logger()

 class RecreateMixIn:
     @with_repository(cache=True, exclusive=True, compatibility=(Manifest.Operation.CHECK,))
-    def do_recreate(self, args, repository, manifest, key, cache):
+    def do_recreate(self, args, repository, manifest, cache):
         """Re-create archives"""
         matcher = build_matcher(args.patterns, args.paths)
         self.output_list = args.output_list

@@ -26,9 +26,7 @@ class RecreateMixIn:
         always_recompress = args.recompress == "always"

         recreater = ArchiveRecreater(
-            repository,
             manifest,
-            key,
             cache,
             matcher,
             exclude_caches=args.exclude_caches,

@@ -13,7 +13,7 @@ logger = create_logger()
 class RenameMixIn:
     @with_repository(exclusive=True, cache=True, compatibility=(Manifest.Operation.CHECK,))
     @with_archive
-    def do_rename(self, args, repository, manifest, key, cache, archive):
+    def do_rename(self, args, repository, manifest, cache, archive):
         """Rename an existing archive"""
         archive.rename(args.newname)
         manifest.write()

@@ -13,8 +13,9 @@ logger = create_logger()

 class RInfoMixIn:
     @with_repository(cache=True, compatibility=(Manifest.Operation.READ,))
-    def do_rinfo(self, args, repository, manifest, key, cache):
+    def do_rinfo(self, args, repository, manifest, cache):
         """Show repository infos"""
+        key = manifest.key
         info = basic_json_data(manifest, cache=cache, extra={"security_dir": cache.security_manager.dir})

         if args.json:

@@ -14,7 +14,7 @@ logger = create_logger()

 class RListMixIn:
     @with_repository(compatibility=(Manifest.Operation.READ,))
-    def do_rlist(self, args, repository, manifest, key):
+    def do_rlist(self, args, repository, manifest):
         """List the archives contained in a repository"""
         if args.format is not None:
             format = args.format

@@ -22,7 +22,7 @@ class RListMixIn:
             format = "{archive}{NL}"
         else:
             format = "{archive:<36} {time} [{id}]{NL}"
-        formatter = ArchiveFormatter(format, repository, manifest, key, json=args.json, iec=args.iec)
+        formatter = ArchiveFormatter(format, repository, manifest, manifest.key, json=args.json, iec=args.iec)

         output_data = []

@@ -53,7 +53,7 @@ def get_tar_filter(fname, decompress):
 class TarMixIn:
     @with_repository(compatibility=(Manifest.Operation.READ,))
     @with_archive
-    def do_export_tar(self, args, repository, manifest, key, archive):
+    def do_export_tar(self, args, repository, manifest, archive):
         """Export archive contents as a tarball"""
         self.output_list = args.output_list

@@ -239,7 +239,7 @@ class TarMixIn:
         return self.exit_code

     @with_repository(cache=True, exclusive=True, compatibility=(Manifest.Operation.WRITE,))
-    def do_import_tar(self, args, repository, manifest, key, cache):
+    def do_import_tar(self, args, repository, manifest, cache):
         """Create a backup archive from a tarball"""
         self.output_filter = args.output_filter
         self.output_list = args.output_list

@@ -250,7 +250,7 @@ class TarMixIn:
         tarstream_close = args.tarfile != "-"

         with create_filter_process(filter, stream=tarstream, stream_close=tarstream_close, inbound=True) as _stream:
-            self._import_tar(args, repository, manifest, key, cache, _stream)
+            self._import_tar(args, repository, manifest, manifest.key, cache, _stream)

         return self.exit_code

@@ -259,8 +259,6 @@ class TarMixIn:
         t0_monotonic = time.monotonic()

         archive = Archive(
-            repository,
-            key,
             manifest,
             args.name,
             cache=cache,

@@ -15,12 +15,12 @@ logger = create_logger()


 class TransferMixIn:
-    @with_other_repository(manifest=True, key=True, compatibility=(Manifest.Operation.READ,))
+    @with_other_repository(manifest=True, compatibility=(Manifest.Operation.READ,))
     @with_repository(exclusive=True, manifest=True, cache=True, compatibility=(Manifest.Operation.WRITE,))
-    def do_transfer(
-        self, args, *, repository, manifest, key, cache, other_repository=None, other_manifest=None, other_key=None
-    ):
+    def do_transfer(self, args, *, repository, manifest, cache, other_repository=None, other_manifest=None):
         """archives transfer from other repository, optionally upgrade data format"""
+        key = manifest.key
+        other_key = other_manifest.key
         if not uses_same_id_hash(other_key, key):
             self.print_error(
                 "You must keep the same ID hash ([HMAC-]SHA256 or BLAKE2b) or deduplication will break. "

@@ -57,8 +57,8 @@ class TransferMixIn:
                 else:
                     if not dry_run:
                         print(f"{name}: copying archive to destination repo...")
-                    other_archive = Archive(other_repository, other_key, other_manifest, name)
-                    archive = Archive(repository, key, manifest, name, cache=cache, create=True) if not dry_run else None
+                    other_archive = Archive(other_manifest, name)
+                    archive = Archive(manifest, name, cache=cache, create=True) if not dry_run else None
                     upgrader.new_archive(archive=archive)
                     for item in other_archive.iter_items():
                         if "chunks" in item:

@@ -69,10 +69,18 @@ class TransferMixIn:
                                 if not dry_run:
                                     cdata = other_repository.get(chunk_id)
                                     # keep compressed payload same, avoid decompression / recompression
-                                    data = other_key.decrypt(chunk_id, cdata, decompress=False)
-                                    data = upgrader.upgrade_compressed_chunk(chunk=data)
+                                    meta, data = other_manifest.repo_objs.parse(chunk_id, cdata, decompress=False)
+                                    meta, data = upgrader.upgrade_compressed_chunk(meta, data)
                                     chunk_entry = cache.add_chunk(
-                                        chunk_id, data, archive.stats, wait=False, compress=False, size=size
+                                        chunk_id,
+                                        meta,
+                                        data,
+                                        stats=archive.stats,
+                                        wait=False,
+                                        compress=False,
+                                        size=size,
+                                        ctype=meta["ctype"],
+                                        clevel=meta["clevel"],
                                     )
                                     cache.repository.async_response(wait=False)
                                     chunks.append(chunk_entry)
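Note the constraint made explicit by the new add_chunk() signature (see the
cache changes below): when storing data that is already compressed
(``compress=False``), the caller must also supply the uncompressed ``size``
plus the ``ctype``/``clevel`` from the parsed metadata, since none of these
can be recomputed from the compressed bytes alone. A hypothetical call::

    # meta comes from repo_objs.parse(..., decompress=False); values hypothetical
    entry = cache.add_chunk(
        chunk_id, meta, data,
        stats=archive.stats,
        compress=False,        # data is already compressed
        size=size,             # uncompressed size, required with compress=False
        ctype=meta["ctype"],   # so the stored metadata records the right type
        clevel=meta["clevel"],
    )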
@@ -396,7 +396,6 @@ class Cache:
     def __new__(
         cls,
         repository,
-        key,
         manifest,
         path=None,
         sync=True,

@@ -410,8 +409,6 @@ class Cache:
    ):
        def local():
            return LocalCache(
-                repository=repository,
-                key=key,
                manifest=manifest,
                path=path,
                sync=sync,

@@ -424,14 +421,7 @@ class Cache:
            )

        def adhoc():
-            return AdHocCache(
-                repository=repository,
-                key=key,
-                manifest=manifest,
-                lock_wait=lock_wait,
-                iec=iec,
-                consider_part_files=consider_part_files,
-            )
+            return AdHocCache(manifest=manifest, lock_wait=lock_wait, iec=iec, consider_part_files=consider_part_files)

        if not permit_adhoc_cache:
            return local()

@@ -481,9 +471,7 @@ Total chunks: {0.total_chunks}
         # so we can just sum up all archives to get the "all archives" stats:
         total_size = 0
         for archive_name in self.manifest.archives:
-            archive = Archive(
-                self.repository, self.key, self.manifest, archive_name, consider_part_files=self.consider_part_files
-            )
+            archive = Archive(self.manifest, archive_name, consider_part_files=self.consider_part_files)
             stats = archive.calc_stats(self, want_unique=False)
             total_size += stats.osize
         stats = self.Summary(total_size, unique_size, total_unique_chunks, total_chunks)._asdict()

@@ -503,8 +491,6 @@ class LocalCache(CacheStatsMixin):

     def __init__(
         self,
-        repository,
-        key,
         manifest,
         path=None,
         sync=True,

@@ -522,27 +508,29 @@ class LocalCache(CacheStatsMixin):
         :param cache_mode: what shall be compared in the file stat infos vs. cached stat infos comparison
         """
         CacheStatsMixin.__init__(self, iec=iec)
-        self.repository = repository
-        self.key = key
+        assert isinstance(manifest, Manifest)
         self.manifest = manifest
+        self.repository = manifest.repository
+        self.key = manifest.key
+        self.repo_objs = manifest.repo_objs
         self.progress = progress
         self.cache_mode = cache_mode
         self.consider_part_files = consider_part_files
         self.timestamp = None
         self.txn_active = False

-        self.path = cache_dir(repository, path)
-        self.security_manager = SecurityManager(repository)
+        self.path = cache_dir(self.repository, path)
+        self.security_manager = SecurityManager(self.repository)
         self.cache_config = CacheConfig(self.repository, self.path, lock_wait)

         # Warn user before sending data to a never seen before unencrypted repository
         if not os.path.exists(self.path):
-            self.security_manager.assert_access_unknown(warn_if_unencrypted, manifest, key)
+            self.security_manager.assert_access_unknown(warn_if_unencrypted, manifest, self.key)
             self.create()

         self.open()
         try:
-            self.security_manager.assert_secure(manifest, key, cache_config=self.cache_config)
+            self.security_manager.assert_secure(manifest, self.key, cache_config=self.cache_config)

             if not self.check_cache_compatibility():
                 self.wipe_cache()

@@ -912,7 +900,7 @@ class LocalCache(CacheStatsMixin):
         self.manifest.check_repository_compatibility((Manifest.Operation.READ,))

         self.begin_txn()
-        with cache_if_remote(self.repository, decrypted_cache=self.key) as decrypted_repository:
+        with cache_if_remote(self.repository, decrypted_cache=self.repo_objs) as decrypted_repository:
             # TEMPORARY HACK: to avoid archive index caching, create a FILE named ~/.cache/borg/REPOID/chunks.archive.d -
             # this is only recommended if you have a fast, low latency connection to your repo (e.g. if repo is local disk)
             self.do_cache = os.path.isdir(archive_path)

@@ -955,18 +943,20 @@ class LocalCache(CacheStatsMixin):
         self.cache_config.ignored_features.update(repo_features - my_features)
         self.cache_config.mandatory_features.update(repo_features & my_features)

-    def add_chunk(self, id, chunk, stats, *, overwrite=False, wait=True, compress=True, size=None):
+    def add_chunk(
+        self, id, meta, data, *, stats, overwrite=False, wait=True, compress=True, size=None, ctype=None, clevel=None
+    ):
         if not self.txn_active:
             self.begin_txn()
         if size is None and compress:
-            size = len(chunk) # chunk is still uncompressed
+            size = len(data) # data is still uncompressed
         refcount = self.seen_chunk(id, size)
         if refcount and not overwrite:
             return self.chunk_incref(id, stats)
         if size is None:
             raise ValueError("when giving compressed data for a new chunk, the uncompressed size must be given also")
-        data = self.key.encrypt(id, chunk, compress=compress)
-        self.repository.put(id, data, wait=wait)
+        cdata = self.repo_objs.format(id, meta, data, compress=compress, size=size, ctype=ctype, clevel=clevel)
+        self.repository.put(id, cdata, wait=wait)
         self.chunks.add(id, 1, size)
         stats.update(size, not refcount)
         return ChunkListEntry(id, size)

@@ -1094,18 +1084,18 @@ All archives: unknown unknown unknown
                        Unique chunks         Total chunks
 Chunk index:    {0.total_unique_chunks:20d}             unknown"""

-    def __init__(
-        self, repository, key, manifest, warn_if_unencrypted=True, lock_wait=None, consider_part_files=False, iec=False
-    ):
+    def __init__(self, manifest, warn_if_unencrypted=True, lock_wait=None, consider_part_files=False, iec=False):
         CacheStatsMixin.__init__(self, iec=iec)
-        self.repository = repository
-        self.key = key
+        assert isinstance(manifest, Manifest)
         self.manifest = manifest
+        self.repository = manifest.repository
+        self.key = manifest.key
+        self.repo_objs = manifest.repo_objs
         self.consider_part_files = consider_part_files
         self._txn_active = False

-        self.security_manager = SecurityManager(repository)
-        self.security_manager.assert_secure(manifest, key, lock_wait=lock_wait)
+        self.security_manager = SecurityManager(self.repository)
+        self.security_manager.assert_secure(manifest, self.key, lock_wait=lock_wait)

         logger.warning("Note: --no-cache-sync is an experimental feature.")

@@ -1127,19 +1117,19 @@ Chunk index: {0.total_unique_chunks:20d} unknown"""
     def memorize_file(self, hashed_path, path_hash, st, ids):
         pass

-    def add_chunk(self, id, chunk, stats, *, overwrite=False, wait=True, compress=True, size=None):
+    def add_chunk(self, id, meta, data, *, stats, overwrite=False, wait=True, compress=True, size=None):
         assert not overwrite, "AdHocCache does not permit overwrites — trying to use it for recreate?"
         if not self._txn_active:
             self.begin_txn()
         if size is None and compress:
-            size = len(chunk) # chunk is still uncompressed
+            size = len(data) # data is still uncompressed
         if size is None:
             raise ValueError("when giving compressed data for a chunk, the uncompressed size must be given also")
         refcount = self.seen_chunk(id, size)
         if refcount:
             return self.chunk_incref(id, stats, size=size)
-        data = self.key.encrypt(id, chunk, compress=compress)
-        self.repository.put(id, data, wait=wait)
+        cdata = self.repo_objs.format(id, meta, data, compress=compress)
+        self.repository.put(id, cdata, wait=wait)
         self.chunks.add(id, 1, size)
         stats.update(size, not refcount)
         return ChunkListEntry(id, size)

@@ -56,22 +56,18 @@ cdef class CompressorBase:
    also handles compression format auto detection and
    adding/stripping the ID header (which enable auto detection).
    """
    ID = b'\xFF'  # reserved and not used
    # overwrite with a unique 1-byte bytestring in child classes
    ID = 0xFF  # reserved and not used
    # overwrite with a unique 1-byte bytestring in child classes
    name = 'baseclass'

    @classmethod
    def detect(cls, data):
        return data.startswith(cls.ID)
        return data and data[0] == cls.ID

    def __init__(self, level=255, **kwargs):
    def __init__(self, level=255, legacy_mode=False, **kwargs):
        assert 0 <= level <= 255
        self.level = level
        if self.ID is not None:
            self.id_level = self.ID + bytes((level, ))  # level 255 means "unknown level"
            assert len(self.id_level) == 2
        else:
            self.id_level = None
        self.legacy_mode = legacy_mode  # True: support prefixed ctype/clevel bytes

    def decide(self, data):
        """

@@ -86,24 +82,48 @@ cdef class CompressorBase:
        """
        return self

    def compress(self, data):
    def compress(self, meta, data):
        """
        Compress *data* (bytes) and return bytes result. Prepend the ID bytes of this compressor,
        which is needed so that the correct decompressor can be used for decompression.
        Compress *data* (bytes) and return compression metadata and compressed bytes.
        """
        # add id_level bytes
        return self.id_level + data
        if self.legacy_mode:
            return None, bytes((self.ID, self.level)) + data
        else:
            meta["ctype"] = self.ID
            meta["clevel"] = self.level
            meta["csize"] = len(data)
            return meta, data

    def decompress(self, data):
    def decompress(self, meta, data):
        """
        Decompress *data* (preferably a memoryview, bytes also acceptable) and return bytes result.
        The leading Compressor ID bytes need to be present.

        Legacy mode: The leading Compressor ID bytes need to be present.

        Only handles input generated by _this_ Compressor - for a general purpose
        decompression method see *Compressor.decompress*.
        """
        # strip id_level bytes
        return data[2:]
        if self.legacy_mode:
            assert meta is None
            meta = {}
            meta["ctype"] = data[0]
            meta["clevel"] = data[1]
            meta["csize"] = len(data)
            return meta, data[2:]
        else:
            assert isinstance(meta, dict)
            assert "ctype" in meta
            assert "clevel" in meta
            return meta, data

    def check_fix_size(self, meta, data):
        if "size" in meta:
            assert meta["size"] == len(data)
        elif self.legacy_mode:
            meta["size"] = len(data)
        else:
            pass  # raise ValueError("size not present and not in legacy mode")


cdef class DecidingCompressor(CompressorBase):
    """
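The new contract in one round trip: metadata rides in a dict instead of a 2-byte prefix glued onto the data. A sketch under the assumption that ``get_compressor`` is importable from this module (non-legacy mode)::

    from borg.compress import get_compressor  # assumed import path

    c = get_compressor("zstd", level=3)
    meta, cdata = c.compress({}, b"hello " * 1000)
    # meta now holds: size (uncompressed), ctype, clevel, csize (compressed)
    meta, data = c.decompress(dict(meta), cdata)
    assert data == b"hello " * 1000  # check_fix_size() verified meta["size"]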

@@ -112,12 +132,12 @@ cdef class DecidingCompressor(CompressorBase):
    """
    name = 'decidebaseclass'

    def __init__(self, level=255, **kwargs):
        super().__init__(level=level, **kwargs)
    def __init__(self, level=255, legacy_mode=False, **kwargs):
        super().__init__(level=level, legacy_mode=legacy_mode, **kwargs)

    def _decide(self, data):
    def _decide(self, meta, data):
        """
        Decides what to do with *data*. Returns (compressor, compressed_data).
        Decides what to do with *data*. Returns (compressor, meta, compressed_data).

        *compressed_data* can be the result of *data* being processed by *compressor*,
        if that is generated as a side-effect of the decision process, or None otherwise.

@@ -127,47 +147,50 @@ cdef class DecidingCompressor(CompressorBase):
        """
        raise NotImplementedError

    def decide(self, data):
        return self._decide(data)[0]
    def decide(self, meta, data):
        return self._decide(meta, data)[0]

    def decide_compress(self, data):
    def decide_compress(self, meta, data):
        """
        Decides what to do with *data* and handle accordingly. Returns (compressor, compressed_data).

        *compressed_data* is the result of *data* being processed by *compressor*.
        """
        compressor, compressed_data = self._decide(data)
        compressor, (meta, compressed_data) = self._decide(meta, data)

        if compressed_data is None:
            compressed_data = compressor.compress(data)
            meta, compressed_data = compressor.compress(meta, data)

        if compressor is self:
            # call super class to add ID bytes
            return self, super().compress(compressed_data)
            return self, super().compress(meta, compressed_data)

        return compressor, compressed_data
        return compressor, (meta, compressed_data)

    def compress(self, data):
        return self.decide_compress(data)[1]
    def compress(self, meta, data):
        meta["size"] = len(data)
        return self.decide_compress(meta, data)[1]


class CNONE(CompressorBase):
    """
    none - no compression, just pass through data
    """
    ID = b'\x00'
    ID = 0x00
    name = 'none'

    def __init__(self, level=255, **kwargs):
        super().__init__(level=level, **kwargs)  # no defined levels for CNONE, so just say "unknown"
    def __init__(self, level=255, legacy_mode=False, **kwargs):
        super().__init__(level=level, legacy_mode=legacy_mode, **kwargs)  # no defined levels for CNONE, so just say "unknown"

    def compress(self, data):
        return super().compress(data)
    def compress(self, meta, data):
        meta["size"] = len(data)
        return super().compress(meta, data)

    def decompress(self, data):
        data = super().decompress(data)
    def decompress(self, meta, data):
        meta, data = super().decompress(meta, data)
        if not isinstance(data, bytes):
            data = bytes(data)
        return data
        self.check_fix_size(meta, data)
        return meta, data

class LZ4(DecidingCompressor):

@@ -179,13 +202,13 @@ class LZ4(DecidingCompressor):
    - wrapper releases CPython's GIL to support multithreaded code
    - uses safe lz4 methods that never go beyond the end of the output buffer
    """
    ID = b'\x01'
    ID = 0x01
    name = 'lz4'

    def __init__(self, level=255, **kwargs):
        super().__init__(level=level, **kwargs)  # no defined levels for LZ4, so just say "unknown"
    def __init__(self, level=255, legacy_mode=False, **kwargs):
        super().__init__(level=level, legacy_mode=legacy_mode, **kwargs)  # no defined levels for LZ4, so just say "unknown"

    def _decide(self, idata):
    def _decide(self, meta, idata):
        """
        Decides what to do with *data*. Returns (compressor, lz4_data).

@@ -206,12 +229,12 @@ class LZ4(DecidingCompressor):
            raise Exception('lz4 compress failed')
        # only compress if the result actually is smaller
        if osize < isize:
            return self, dest[:osize]
            return self, (meta, dest[:osize])
        else:
            return NONE_COMPRESSOR, None
            return NONE_COMPRESSOR, (meta, None)

    def decompress(self, idata):
        idata = super().decompress(idata)
    def decompress(self, meta, data):
        meta, idata = super().decompress(meta, data)
        if not isinstance(idata, bytes):
            idata = bytes(idata)  # code below does not work with memoryview
        cdef int isize = len(idata)

@@ -237,23 +260,25 @@ class LZ4(DecidingCompressor):
                raise DecompressionError('lz4 decompress failed')
            # likely the buffer was too small, get a bigger one:
            osize = int(1.5 * osize)
        return dest[:rsize]
        data = dest[:rsize]
        self.check_fix_size(meta, data)
        return meta, data


class LZMA(DecidingCompressor):
    """
    lzma compression / decompression
    """
    ID = b'\x02'
    ID = 0x02
    name = 'lzma'

    def __init__(self, level=6, **kwargs):
        super().__init__(level=level, **kwargs)
    def __init__(self, level=6, legacy_mode=False, **kwargs):
        super().__init__(level=level, legacy_mode=legacy_mode, **kwargs)
        self.level = level
        if lzma is None:
            raise ValueError('No lzma support found.')

    def _decide(self, data):
    def _decide(self, meta, data):
        """
        Decides what to do with *data*. Returns (compressor, lzma_data).

@@ -262,14 +287,16 @@ class LZMA(DecidingCompressor):
        # we do not need integrity checks in lzma, we do that already
        lzma_data = lzma.compress(data, preset=self.level, check=lzma.CHECK_NONE)
        if len(lzma_data) < len(data):
            return self, lzma_data
            return self, (meta, lzma_data)
        else:
            return NONE_COMPRESSOR, None
            return NONE_COMPRESSOR, (meta, None)

    def decompress(self, data):
        data = super().decompress(data)
    def decompress(self, meta, data):
        meta, data = super().decompress(meta, data)
        try:
            return lzma.decompress(data)
            data = lzma.decompress(data)
            self.check_fix_size(meta, data)
            return meta, data
        except lzma.LZMAError as e:
            raise DecompressionError(str(e)) from None

@@ -279,14 +306,14 @@ class ZSTD(DecidingCompressor):
    # This is a NOT THREAD SAFE implementation.
    # Only ONE python context must be created at a time.
    # It should work flawlessly as long as borg will call ONLY ONE compression job at time.
    ID = b'\x03'
    ID = 0x03
    name = 'zstd'

    def __init__(self, level=3, **kwargs):
        super().__init__(level=level, **kwargs)
    def __init__(self, level=3, legacy_mode=False, **kwargs):
        super().__init__(level=level, legacy_mode=legacy_mode, **kwargs)
        self.level = level

    def _decide(self, idata):
    def _decide(self, meta, idata):
        """
        Decides what to do with *data*. Returns (compressor, zstd_data).

@@ -308,12 +335,12 @@ class ZSTD(DecidingCompressor):
            raise Exception('zstd compress failed: %s' % ZSTD_getErrorName(osize))
        # only compress if the result actually is smaller
        if osize < isize:
            return self, dest[:osize]
            return self, (meta, dest[:osize])
        else:
            return NONE_COMPRESSOR, None
            return NONE_COMPRESSOR, (meta, None)

    def decompress(self, idata):
        idata = super().decompress(idata)
    def decompress(self, meta, data):
        meta, idata = super().decompress(meta, data)
        if not isinstance(idata, bytes):
            idata = bytes(idata)  # code below does not work with memoryview
        cdef int isize = len(idata)

@@ -337,21 +364,23 @@ class ZSTD(DecidingCompressor):
            raise DecompressionError('zstd decompress failed: %s' % ZSTD_getErrorName(rsize))
        if rsize != osize:
            raise DecompressionError('zstd decompress failed: size mismatch')
        return dest[:osize]
        data = dest[:osize]
        self.check_fix_size(meta, data)
        return meta, data


class ZLIB(DecidingCompressor):
    """
    zlib compression / decompression (python stdlib)
    """
    ID = b'\x05'
    ID = 0x05
    name = 'zlib'

    def __init__(self, level=6, **kwargs):
        super().__init__(level=level, **kwargs)
    def __init__(self, level=6, legacy_mode=False, **kwargs):
        super().__init__(level=level, legacy_mode=legacy_mode, **kwargs)
        self.level = level

    def _decide(self, data):
    def _decide(self, meta, data):
        """
        Decides what to do with *data*. Returns (compressor, zlib_data).

@@ -359,14 +388,16 @@ class ZLIB(DecidingCompressor):
        """
        zlib_data = zlib.compress(data, self.level)
        if len(zlib_data) < len(data):
            return self, zlib_data
            return self, (meta, zlib_data)
        else:
            return NONE_COMPRESSOR, None
            return NONE_COMPRESSOR, (meta, None)

    def decompress(self, data):
        data = super().decompress(data)
    def decompress(self, meta, data):
        meta, data = super().decompress(meta, data)
        try:
            return zlib.decompress(data)
            data = zlib.decompress(data)
            self.check_fix_size(meta, data)
            return meta, data
        except zlib.error as e:
            raise DecompressionError(str(e)) from None


@@ -382,7 +413,7 @@ class ZLIB_legacy(CompressorBase):
    Newer borg uses the ZLIB class that has separate ID bytes (as all the other
    compressors) and does not need this hack.
    """
    ID = b'\x08'  # not used here, see detect()
    ID = 0x08  # not used here, see detect()
    # avoid all 0x.8 IDs elsewhere!
    name = 'zlib_legacy'

@@ -398,14 +429,14 @@ class ZLIB_legacy(CompressorBase):
        super().__init__(level=level, **kwargs)
        self.level = level

    def compress(self, data):
    def compress(self, meta, data):
        # note: for compatibility no super call, do not add ID bytes
        return zlib.compress(data, self.level)
        return None, zlib.compress(data, self.level)

    def decompress(self, data):
    def decompress(self, meta, data):
        # note: for compatibility no super call, do not strip ID bytes
        try:
            return zlib.decompress(data)
            return meta, zlib.decompress(data)
        except zlib.error as e:
            raise DecompressionError(str(e)) from None

@@ -425,7 +456,7 @@ class Auto(CompressorBase):
        super().__init__()
        self.compressor = compressor

    def _decide(self, data):
    def _decide(self, meta, data):
        """
        Decides what to do with *data*. Returns (compressor, compressed_data).

@@ -448,33 +479,33 @@ class Auto(CompressorBase):
        Note: While it makes no sense, the expensive compressor may well be set
        to the LZ4 compressor.
        """
        compressor, compressed_data = LZ4_COMPRESSOR.decide_compress(data)
        compressor, (meta, compressed_data) = LZ4_COMPRESSOR.decide_compress(meta, data)
        # compressed_data includes the compression type header, while data does not yet
        ratio = len(compressed_data) / (len(data) + 2)
        if ratio < 0.97:
            return self.compressor, compressed_data
            return self.compressor, (meta, compressed_data)
        else:
            return compressor, compressed_data
            return compressor, (meta, compressed_data)

    def decide(self, data):
        return self._decide(data)[0]
    def decide(self, meta, data):
        return self._decide(meta, data)[0]

    def compress(self, data):
        compressor, cheap_compressed_data = self._decide(data)
    def compress(self, meta, data):
        compressor, (cheap_meta, cheap_compressed_data) = self._decide(dict(meta), data)
        if compressor in (LZ4_COMPRESSOR, NONE_COMPRESSOR):
            # we know that trying to compress with expensive compressor is likely pointless,
            # so we fallback to return the cheap compressed data.
            return cheap_compressed_data
            return cheap_meta, cheap_compressed_data
        # if we get here, the decider decided to try the expensive compressor.
        # we also know that the compressed data returned by the decider is lz4 compressed.
        expensive_compressed_data = compressor.compress(data)
        expensive_meta, expensive_compressed_data = compressor.compress(dict(meta), data)
        ratio = len(expensive_compressed_data) / len(cheap_compressed_data)
        if ratio < 0.99:
            # the expensive compressor managed to squeeze the data significantly better than lz4.
            return expensive_compressed_data
            return expensive_meta, expensive_compressed_data
        else:
            # otherwise let's just store the lz4 data, which decompresses extremely fast.
            return cheap_compressed_data
            return cheap_meta, cheap_compressed_data

    def decompress(self, data):
        raise NotImplementedError
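The decision logic above is unchanged in spirit; only the plumbing now carries meta dicts. The two thresholds, re-stated as a standalone sketch (illustrative, not the module's code)::

    def auto_choice(len_data, len_lz4, len_expensive):
        # stage 1: cheap lz4 probe; >= 0.97 means "barely compressible"
        if len_lz4 / (len_data + 2) >= 0.97:
            return "cheap"
        # stage 2: the expensive compressor must beat lz4 by more than 1%
        if len_expensive / len_lz4 < 0.99:
            return "expensive"
        return "cheap"  # tie: prefer lz4, it decompresses extremely fast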

@@ -487,14 +518,14 @@ class ObfuscateSize(CompressorBase):
    """
    Meta-Compressor that obfuscates the compressed data size.
    """
    ID = b'\x04'
    ID = 0x04
    name = 'obfuscate'

    header_fmt = Struct('<I')
    header_len = len(header_fmt.pack(0))

    def __init__(self, level=None, compressor=None):
        super().__init__(level=level)  # data will be encrypted, so we can tell the level
    def __init__(self, level=None, compressor=None, legacy_mode=False):
        super().__init__(level=level, legacy_mode=legacy_mode)  # data will be encrypted, so we can tell the level
        self.compressor = compressor
        if level is None:
            pass  # decompression

@@ -524,25 +555,30 @@ class ObfuscateSize(CompressorBase):
    def _random_padding_obfuscate(self, compr_size):
        return int(self.max_padding_size * random.random())

    def compress(self, data):
        compressed_data = self.compressor.compress(data)  # compress data
    def compress(self, meta, data):
        assert not self.legacy_mode  # we never call this in legacy mode
        meta = dict(meta)  # make a copy, do not modify caller's dict
        meta, compressed_data = self.compressor.compress(meta, data)  # compress data
        compr_size = len(compressed_data)
        header = self.header_fmt.pack(compr_size)
        assert "csize" in meta, repr(meta)
        meta["psize"] = meta["csize"]  # psize (payload size) is the csize (compressed size) of the inner compressor
        addtl_size = self._obfuscate(compr_size)
        addtl_size = max(0, addtl_size)  # we can only make it longer, not shorter!
        addtl_size = min(MAX_DATA_SIZE - 1024 - compr_size, addtl_size)  # stay away from MAX_DATA_SIZE
        trailer = bytes(addtl_size)
        obfuscated_data = b''.join([header, compressed_data, trailer])
        return super().compress(obfuscated_data)  # add ID header
        obfuscated_data = compressed_data + trailer
        meta["csize"] = len(obfuscated_data)  # csize is the overall output size of this "obfuscation compressor"
        return meta, obfuscated_data  # for borg2 it is enough that we have the payload size in meta["psize"]

    def decompress(self, data):
        obfuscated_data = super().decompress(data)  # remove obfuscator ID header
    def decompress(self, meta, data):
        assert self.legacy_mode  # borg2 never dispatches to this, only used for legacy mode
        meta, obfuscated_data = super().decompress(meta, data)  # remove obfuscator ID header
        compr_size = self.header_fmt.unpack(obfuscated_data[0:self.header_len])[0]
        compressed_data = obfuscated_data[self.header_len:self.header_len+compr_size]
        if self.compressor is None:
            compressor_cls = Compressor.detect(compressed_data)[0]
            self.compressor = compressor_cls()
        return self.compressor.decompress(compressed_data)  # decompress data
        return self.compressor.decompress(meta, compressed_data)  # decompress data
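Worked example of the ``psize``/``csize`` bookkeeping introduced above (illustrative numbers): if the inner compressor emits 1000 bytes and the obfuscator appends 24 zero bytes, then::

    meta = {"ctype": 1, "clevel": 255, "csize": 1000, "size": 4096}  # after inner compress
    trailer = bytes(24)
    meta["psize"] = meta["csize"]                 # 1000 = real (payload) size
    meta["csize"] = meta["psize"] + len(trailer)  # 1024 = size as stored
    # at parse time, only the first psize bytes are fed to the inner decompressor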

# Maps valid compressor names to their class

@@ -576,12 +612,18 @@ class Compressor:
        self.params = kwargs
        self.compressor = get_compressor(name, **self.params)

    def compress(self, data):
        return self.compressor.compress(data)
    def compress(self, meta, data):
        return self.compressor.compress(meta, data)

    def decompress(self, data):
        compressor_cls = self.detect(data)[0]
        return compressor_cls(**self.params).decompress(data)
    def decompress(self, meta, data):
        if self.compressor.legacy_mode:
            hdr = data[:2]
        else:
            ctype = meta["ctype"]
            clevel = meta["clevel"]
            hdr = bytes((ctype, clevel))
        compressor_cls = self.detect(hdr)[0]
        return compressor_cls(**self.params).decompress(meta, data)

    @staticmethod
    def detect(data):
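Dispatch no longer sniffs the stored data: for new-style objects, the 2-byte header that ``detect()`` expects is synthesized from the decrypted metadata. Sketch::

    meta = {"ctype": 3, "clevel": 3, "csize": 123}  # e.g. zstd, level 3
    hdr = bytes((meta["ctype"], meta["clevel"]))    # fabricated 2-byte header
    compressor_cls, level = Compressor.detect(hdr)  # same detect() as legacy mode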

@@ -12,7 +12,6 @@ logger = create_logger()
import argon2.low_level

from ..constants import *  # NOQA
from ..compress import Compressor
from ..helpers import StableDict
from ..helpers import Error, IntegrityError
from ..helpers import get_keys_dir, get_security_dir

@@ -23,6 +22,8 @@ from ..helpers import msgpack
from ..item import Key, EncryptedKey, want_bytes
from ..manifest import Manifest
from ..platform import SaveFile
from ..repoobj import RepoObj


from .nonces import NonceManager
from .low_level import AES, bytes_to_int, num_cipher_blocks, hmac_sha256, blake2b_256, hkdf_hmac_sha512

@@ -107,7 +108,8 @@ def identify_key(manifest_data):
        raise UnsupportedPayloadError(key_type)


def key_factory(repository, manifest_data):
def key_factory(repository, manifest_chunk, *, ro_cls=RepoObj):
    manifest_data = ro_cls.extract_crypted_data(manifest_chunk)
    return identify_key(manifest_data).detect(repository, manifest_data)
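Since the stored chunk now starts with the encrypted-metadata block, the key-type byte must be taken from the *data* part, not from offset 0; that is exactly what ``extract_crypted_data()`` skips to. Sketch of the call chain (assumes a ``repository`` in scope)::

    manifest_chunk = repository.get(Manifest.MANIFEST_ID)
    manifest_data = RepoObj.extract_crypted_data(manifest_chunk)  # skip hdr + encrypted meta
    key = identify_key(manifest_data).detect(repository, manifest_data)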
|
@ -186,10 +188,6 @@ class KeyBase:
|
|||
self.TYPE_STR = bytes([self.TYPE])
|
||||
self.repository = repository
|
||||
self.target = None # key location file path / repo obj
|
||||
# Some commands write new chunks (e.g. rename) but don't take a --compression argument. This duplicates
|
||||
# the default used by those commands who do take a --compression argument.
|
||||
self.compressor = Compressor("lz4")
|
||||
self.decompress = self.compressor.decompress
|
||||
self.tam_required = True
|
||||
self.copy_crypt_key = False
|
||||
|
||||
|
@ -197,10 +195,10 @@ class KeyBase:
|
|||
"""Return HMAC hash using the "id" HMAC key"""
|
||||
raise NotImplementedError
|
||||
|
||||
def encrypt(self, id, data, compress=True):
|
||||
def encrypt(self, id, data):
|
||||
pass
|
||||
|
||||
def decrypt(self, id, data, decompress=True):
|
||||
def decrypt(self, id, data):
|
||||
pass
|
||||
|
||||
def assert_id(self, id, data):
|
||||
|
@ -301,19 +299,12 @@ class PlaintextKey(KeyBase):
|
|||
def id_hash(self, data):
|
||||
return sha256(data).digest()
|
||||
|
||||
def encrypt(self, id, data, compress=True):
|
||||
if compress:
|
||||
data = self.compressor.compress(data)
|
||||
def encrypt(self, id, data):
|
||||
return b"".join([self.TYPE_STR, data])
|
||||
|
||||
def decrypt(self, id, data, decompress=True):
|
||||
def decrypt(self, id, data):
|
||||
self.assert_type(data[0], id)
|
||||
payload = memoryview(data)[1:]
|
||||
if not decompress:
|
||||
return payload
|
||||
data = self.decompress(payload)
|
||||
self.assert_id(id, data)
|
||||
return data
|
||||
return memoryview(data)[1:]
|
||||
|
||||
def _tam_key(self, salt, context):
|
||||
return salt + context
|
||||
|

@@ -380,23 +371,16 @@ class AESKeyBase(KeyBase):

    logically_encrypted = True

    def encrypt(self, id, data, compress=True):
        if compress:
            data = self.compressor.compress(data)
    def encrypt(self, id, data):
        next_iv = self.nonce_manager.ensure_reservation(self.cipher.next_iv(), self.cipher.block_count(len(data)))
        return self.cipher.encrypt(data, header=self.TYPE_STR, iv=next_iv)

    def decrypt(self, id, data, decompress=True):
    def decrypt(self, id, data):
        self.assert_type(data[0], id)
        try:
            payload = self.cipher.decrypt(data)
            return self.cipher.decrypt(data)
        except IntegrityError as e:
            raise IntegrityError(f"Chunk {bin_to_hex(id)}: Could not decrypt [{str(e)}]")
        if not decompress:
            return payload
        data = self.decompress(memoryview(payload))
        self.assert_id(id, data)
        return data

    def init_from_given_data(self, *, crypt_key, id_key, chunk_seed):
        assert len(crypt_key) in (32 + 32, 32 + 128)

@@ -804,19 +788,12 @@ class AuthenticatedKeyBase(AESKeyBase, FlexiKey):
        if manifest_data is not None:
            self.assert_type(manifest_data[0])

    def encrypt(self, id, data, compress=True):
        if compress:
            data = self.compressor.compress(data)
    def encrypt(self, id, data):
        return b"".join([self.TYPE_STR, data])

    def decrypt(self, id, data, decompress=True):
    def decrypt(self, id, data):
        self.assert_type(data[0], id)
        payload = memoryview(data)[1:]
        if not decompress:
            return payload
        data = self.decompress(payload)
        self.assert_id(id, data)
        return data
        return memoryview(data)[1:]


class AuthenticatedKey(ID_HMAC_SHA_256, AuthenticatedKeyBase):

@@ -861,10 +838,15 @@ class AEADKeyBase(KeyBase):

    MAX_IV = 2**48 - 1

    def encrypt(self, id, data, compress=True):
    def assert_id(self, id, data):
        # note: assert_id(id, data) is not needed any more for the new AEAD crypto.
        # we put the id into AAD when storing the chunk, so it gets into the authentication tag computation.
        # when decrypting, we provide the id we **want** as AAD for the auth tag verification, so
        # decrypting only succeeds if we got the ciphertext we wrote **for that chunk id**.
        pass

    def encrypt(self, id, data):
        # to encrypt new data in this session we use always self.cipher and self.sessionid
        if compress:
            data = self.compressor.compress(data)
        reserved = b"\0"
        iv = self.cipher.next_iv()
        if iv > self.MAX_IV:  # see the data-structures docs about why the IV range is enough

@@ -873,7 +855,7 @@ class AEADKeyBase(KeyBase):
        header = self.TYPE_STR + reserved + iv_48bit + self.sessionid
        return self.cipher.encrypt(data, header=header, iv=iv, aad=id)

    def decrypt(self, id, data, decompress=True):
    def decrypt(self, id, data):
        # to decrypt existing data, we need to get a cipher configured for the sessionid and iv from header
        self.assert_type(data[0], id)
        iv_48bit = data[2:8]

@@ -881,17 +863,9 @@ class AEADKeyBase(KeyBase):
        iv = int.from_bytes(iv_48bit, "big")
        cipher = self._get_cipher(sessionid, iv)
        try:
            payload = cipher.decrypt(data, aad=id)
            return cipher.decrypt(data, aad=id)
        except IntegrityError as e:
            raise IntegrityError(f"Chunk {bin_to_hex(id)}: Could not decrypt [{str(e)}]")
        if not decompress:
            return payload
        data = self.decompress(memoryview(payload))
        # note: calling self.assert_id(id, data) is not needed any more for the new AEAD crypto.
        # we put the id into AAD when storing the chunk, so it gets into the authentication tag computation.
        # when decrypting, we provide the id we **want** as AAD for the auth tag verification, so
        # decrypting only succeeds if we got the ciphertext we wrote **for that chunk id**.
        return data

    def init_from_given_data(self, *, crypt_key, id_key, chunk_seed):
        assert len(crypt_key) in (32 + 32, 32 + 128)
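Why ``assert_id()`` can degrade to a no-op here: binding the chunk id as AAD makes the AEAD tag check itself fail on any id mismatch. A self-contained illustration using ``cryptography``'s ChaCha20-Poly1305 as a stand-in AEAD (not borg's own cipher wrapper)::

    from os import urandom
    from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

    key, nonce, chunk_id = ChaCha20Poly1305.generate_key(), urandom(12), urandom(32)
    ct = ChaCha20Poly1305(key).encrypt(nonce, b"payload", chunk_id)  # id bound as AAD
    ChaCha20Poly1305(key).decrypt(nonce, ct, chunk_id)               # ok
    # ChaCha20Poly1305(key).decrypt(nonce, ct, urandom(32))          # raises InvalidTag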

@@ -7,6 +7,8 @@ from hashlib import sha256
from ..helpers import Error, yes, bin_to_hex, dash_open
from ..manifest import Manifest, NoManifestError
from ..repository import Repository
from ..repoobj import RepoObj


from .key import CHPOKeyfileKey, RepoKeyNotFoundError, KeyBlobStorage, identify_key

@@ -40,10 +42,11 @@ class KeyManager:
        self.keyblob_storage = None

        try:
            manifest_data = self.repository.get(Manifest.MANIFEST_ID)
            manifest_chunk = self.repository.get(Manifest.MANIFEST_ID)
        except Repository.ObjectNotFound:
            raise NoManifestError

        manifest_data = RepoObj.extract_crypted_data(manifest_chunk)
        key = identify_key(manifest_data)
        self.keyblob_storage = key.STORAGE
        if self.keyblob_storage == KeyBlobStorage.NO_STORAGE:

@@ -241,12 +241,12 @@ class ItemCache:
class FuseBackend:
    """Virtual filesystem based on archive(s) to provide information to fuse"""

    def __init__(self, key, manifest, repository, args, decrypted_repository):
        self.repository_uncached = repository
    def __init__(self, manifest, args, decrypted_repository):
        self._args = args
        self.numeric_ids = args.numeric_ids
        self._manifest = manifest
        self.key = key
        self.repo_objs = manifest.repo_objs
        self.repository_uncached = manifest.repository
        # Maps inode numbers to Item instances. This is used for synthetic inodes, i.e. file-system objects that are
        # made up and are not contained in the archives. For example archive directories or intermediate directories
        # not contained in archives.

@@ -330,13 +330,7 @@ class FuseBackend:
        """Build FUSE inode hierarchy from archive metadata"""
        self.file_versions = {}  # for versions mode: original path -> version
        t0 = time.perf_counter()
        archive = Archive(
            self.repository_uncached,
            self.key,
            self._manifest,
            archive_name,
            consider_part_files=self._args.consider_part_files,
        )
        archive = Archive(self._manifest, archive_name, consider_part_files=self._args.consider_part_files)
        strip_components = self._args.strip_components
        matcher = build_matcher(self._args.patterns, self._args.paths)
        hlm = HardLinkManager(id_type=bytes, info_type=str)  # hlid -> path

@@ -447,9 +441,9 @@ class FuseBackend:
class FuseOperations(llfuse.Operations, FuseBackend):
    """Export archive as a FUSE filesystem"""

    def __init__(self, key, repository, manifest, args, decrypted_repository):
    def __init__(self, manifest, args, decrypted_repository):
        llfuse.Operations.__init__(self)
        FuseBackend.__init__(self, key, manifest, repository, args, decrypted_repository)
        FuseBackend.__init__(self, manifest, args, decrypted_repository)
        self.decrypted_repository = decrypted_repository
        data_cache_capacity = int(os.environ.get("BORG_MOUNT_DATA_CACHE_ENTRIES", os.cpu_count() or 1))
        logger.debug("mount data cache capacity: %d chunks", data_cache_capacity)

@@ -688,7 +682,7 @@ class FuseOperations(llfuse.Operations, FuseBackend):
                # evict fully read chunk from cache
                del self.data_cache[id]
        else:
            data = self.key.decrypt(id, self.repository_uncached.get(id))
            _, data = self.repo_objs.parse(id, self.repository_uncached.get(id))
            if offset + n < len(data):
                # chunk was only partially read, cache it
                self.data_cache[id] = data

@@ -673,7 +673,7 @@ class ArchiveFormatter(BaseFormatter):
        if self._archive is None or self._archive.id != self.id:
            from ..archive import Archive

            self._archive = Archive(self.repository, self.key, self.manifest, self.name, iec=self.iec)
            self._archive = Archive(self.manifest, self.name, iec=self.iec)
        return self._archive

    def get_meta(self, key, rs):

@@ -17,6 +17,7 @@ from .helpers.datastruct import StableDict
from .helpers.parseformat import bin_to_hex
from .helpers.time import parse_timestamp
from .helpers.errors import Error
from .repoobj import RepoObj


class NoManifestError(Error):

@@ -164,10 +165,11 @@ class Manifest:

    MANIFEST_ID = b"\0" * 32

    def __init__(self, key, repository, item_keys=None):
    def __init__(self, key, repository, item_keys=None, ro_cls=RepoObj):
        self.archives = Archives()
        self.config = {}
        self.key = key
        self.repo_objs = ro_cls(key)
        self.repository = repository
        self.item_keys = frozenset(item_keys) if item_keys is not None else ITEM_KEYS
        self.tam_verified = False

@@ -182,7 +184,7 @@ class Manifest:
        return parse_timestamp(self.timestamp)

    @classmethod
    def load(cls, repository, operations, key=None, force_tam_not_required=False):
    def load(cls, repository, operations, key=None, force_tam_not_required=False, *, ro_cls=RepoObj):
        from .item import ManifestItem
        from .crypto.key import key_factory, tam_required_file, tam_required
        from .repository import Repository

@@ -192,14 +194,14 @@ class Manifest:
        except Repository.ObjectNotFound:
            raise NoManifestError
        if not key:
            key = key_factory(repository, cdata)
        manifest = cls(key, repository)
        data = key.decrypt(cls.MANIFEST_ID, cdata)
            key = key_factory(repository, cdata, ro_cls=ro_cls)
        manifest = cls(key, repository, ro_cls=ro_cls)
        _, data = manifest.repo_objs.parse(cls.MANIFEST_ID, cdata)
        manifest_dict, manifest.tam_verified = key.unpack_and_verify_manifest(
            data, force_tam_not_required=force_tam_not_required
        )
        m = ManifestItem(internal_dict=manifest_dict)
        manifest.id = key.id_hash(data)
        manifest.id = manifest.repo_objs.id_hash(data)
        if m.get("version") not in (1, 2):
            raise ValueError("Invalid manifest version")
        manifest.archives.set_raw_dict(m.archives)

@@ -219,7 +221,7 @@ class Manifest:
            logger.debug("Manifest is TAM verified and says TAM is *not* required, updating security database...")
            os.unlink(tam_required_file(repository))
        manifest.check_repository_compatibility(operations)
        return manifest, key
        return manifest

    def check_repository_compatibility(self, operations):
        for operation in operations:

@@ -272,5 +274,5 @@ class Manifest:
        )
        self.tam_verified = True
        data = self.key.pack_and_authenticate_metadata(manifest.as_dict())
        self.id = self.key.id_hash(data)
        self.repository.put(self.MANIFEST_ID, self.key.encrypt(self.MANIFEST_ID, data))
        self.id = self.repo_objs.id_hash(data)
        self.repository.put(self.MANIFEST_ID, self.repo_objs.format(self.MANIFEST_ID, {}, data))
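Call-site consequence of the return-type change (sketch): ``load()`` now returns just the manifest; the key and the repo object layer hang off it::

    manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
    key, repo_objs = manifest.key, manifest.repo_objs  # no separate key return any more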

@@ -1001,12 +1001,12 @@ This problem will go away as soon as the server has been upgraded to 1.0.7+.
    def flags_many(self, ids, mask=0xFFFFFFFF, value=None):
        """actual remoting is done via self.call in the @api decorator"""

    def get(self, id):
        for resp in self.get_many([id]):
    def get(self, id, read_data=True):
        for resp in self.get_many([id], read_data=read_data):
            return resp

    def get_many(self, ids, is_preloaded=False):
        yield from self.call_many("get", [{"id": id} for id in ids], is_preloaded=is_preloaded)
    def get_many(self, ids, read_data=True, is_preloaded=False):
        yield from self.call_many("get", [{"id": id, "read_data": read_data} for id in ids], is_preloaded=is_preloaded)

    @api(since=parse_version("1.0.0"))
    def put(self, id, data, wait=True):

@@ -1148,11 +1148,11 @@ class RepositoryNoCache:
    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()

    def get(self, key):
        return next(self.get_many([key], cache=False))
    def get(self, key, read_data=True):
        return next(self.get_many([key], read_data=read_data, cache=False))

    def get_many(self, keys, cache=True):
        for key, data in zip(keys, self.repository.get_many(keys)):
    def get_many(self, keys, read_data=True, cache=True):
        for key, data in zip(keys, self.repository.get_many(keys, read_data=read_data)):
            yield self.transform(key, data)

    def log_instrumentation(self):

@@ -1191,6 +1191,12 @@ class RepositoryCache(RepositoryNoCache):
        available_space = shutil.disk_usage(self.basedir).free
        self.size_limit = int(min(available_space * 0.25, 2**31))

    def prefixed_key(self, key, complete):
        # just prefix another byte telling whether this key refers to a complete chunk
        # or a without-data-metadata-only chunk (see also read_data param).
        prefix = b"\x01" if complete else b"\x00"
        return prefix + key

    def key_filename(self, key):
        return os.path.join(self.basedir, bin_to_hex(key))

@@ -1204,12 +1210,13 @@ class RepositoryCache(RepositoryNoCache):
            os.unlink(file)
            self.evictions += 1

    def add_entry(self, key, data, cache):
    def add_entry(self, key, data, cache, complete):
        transformed = self.transform(key, data)
        if not cache:
            return transformed
        packed = self.pack(transformed)
        file = self.key_filename(key)
        pkey = self.prefixed_key(key, complete=complete)
        file = self.key_filename(pkey)
        try:
            with open(file, "wb") as fd:
                fd.write(packed)

@@ -1225,7 +1232,7 @@ class RepositoryCache(RepositoryNoCache):
            raise
        else:
            self.size += len(packed)
            self.cache.add(key)
            self.cache.add(pkey)
            if self.size > self.size_limit:
                self.backoff()
        return transformed

@@ -1250,28 +1257,30 @@ class RepositoryCache(RepositoryNoCache):
        self.cache.clear()
        shutil.rmtree(self.basedir)

    def get_many(self, keys, cache=True):
        unknown_keys = [key for key in keys if key not in self.cache]
        repository_iterator = zip(unknown_keys, self.repository.get_many(unknown_keys))
    def get_many(self, keys, read_data=True, cache=True):
        # It could use different cache keys depending on read_data and cache full vs. meta-only chunks.
        unknown_keys = [key for key in keys if self.prefixed_key(key, complete=read_data) not in self.cache]
        repository_iterator = zip(unknown_keys, self.repository.get_many(unknown_keys, read_data=read_data))
        for key in keys:
            if key in self.cache:
                file = self.key_filename(key)
            pkey = self.prefixed_key(key, complete=read_data)
            if pkey in self.cache:
                file = self.key_filename(pkey)
                with open(file, "rb") as fd:
                    self.hits += 1
                    yield self.unpack(fd.read())
            else:
                for key_, data in repository_iterator:
                    if key_ == key:
                        transformed = self.add_entry(key, data, cache)
                        transformed = self.add_entry(key, data, cache, complete=read_data)
                        self.misses += 1
                        yield transformed
                        break
                else:
                    # slow path: eviction during this get_many removed this key from the cache
                    t0 = time.perf_counter()
                    data = self.repository.get(key)
                    data = self.repository.get(key, read_data=read_data)
                    self.slow_lat += time.perf_counter() - t0
                    transformed = self.add_entry(key, data, cache)
                    transformed = self.add_entry(key, data, cache, complete=read_data)
                    self.slow_misses += 1
                    yield transformed
        # Consume any pending requests
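The prefix byte keeps a metadata-only read from ever being served as a full chunk (and vice versa), because the same chunk id now maps to two distinct cache entries. Sketch::

    chunk_id = bytes(32)
    full_key = b"\x01" + chunk_id  # complete chunk    (read_data=True)
    meta_key = b"\x00" + chunk_id  # metadata-only     (read_data=False)
    assert full_key != meta_key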

@@ -1283,7 +1292,7 @@ def cache_if_remote(repository, *, decrypted_cache=False, pack=None, unpack=None
    """
    Return a Repository(No)Cache for *repository*.

    If *decrypted_cache* is a key object, then get and get_many will return a tuple
    If *decrypted_cache* is a repo_objs object, then get and get_many will return a tuple
    (csize, plaintext) instead of the actual data in the repository. The cache will
    store decrypted data, which increases CPU efficiency (by avoiding repeatedly decrypting
    and more importantly MAC and ID checking cached objects).

@@ -1292,27 +1301,29 @@ def cache_if_remote(repository, *, decrypted_cache=False, pack=None, unpack=None
    if decrypted_cache and (pack or unpack or transform):
        raise ValueError("decrypted_cache and pack/unpack/transform are incompatible")
    elif decrypted_cache:
        key = decrypted_cache
        # 32 bit csize, 64 bit (8 byte) xxh64
        cache_struct = struct.Struct("=I8s")
        repo_objs = decrypted_cache
        # 32 bit csize, 64 bit (8 byte) xxh64, 1 byte ctype, 1 byte clevel
        cache_struct = struct.Struct("=I8sBB")
        compressor = Compressor("lz4")

        def pack(data):
            csize, decrypted = data
            compressed = compressor.compress(decrypted)
            return cache_struct.pack(csize, xxh64(compressed)) + compressed
            meta, compressed = compressor.compress({}, decrypted)
            return cache_struct.pack(csize, xxh64(compressed), meta["ctype"], meta["clevel"]) + compressed

        def unpack(data):
            data = memoryview(data)
            csize, checksum = cache_struct.unpack(data[: cache_struct.size])
            csize, checksum, ctype, clevel = cache_struct.unpack(data[: cache_struct.size])
            compressed = data[cache_struct.size :]
            if checksum != xxh64(compressed):
                raise IntegrityError("detected corrupted data in metadata cache")
            return csize, compressor.decompress(compressed)
            meta = dict(ctype=ctype, clevel=clevel, csize=len(compressed))
            _, decrypted = compressor.decompress(meta, compressed)
            return csize, decrypted

        def transform(id_, data):
            csize = len(data)
            decrypted = key.decrypt(id_, data)
            meta, decrypted = repo_objs.parse(id_, data)
            csize = meta.get("csize", len(data))
            return csize, decrypted

    if isinstance(repository, RemoteRepository) or force_cache:
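The widened cache record layout, shown as a runnable round trip (values illustrative)::

    import struct

    cache_struct = struct.Struct("=I8sBB")  # csize, xxh64 digest, ctype, clevel
    rec = cache_struct.pack(1024, b"\x00" * 8, 1, 255)
    csize, digest, ctype, clevel = cache_struct.unpack(rec)
    assert cache_struct.size == 14  # 4 + 8 + 1 + 1; "=" disables padding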

@@ -0,0 +1,148 @@
from struct import Struct

from .helpers import msgpack
from .compress import Compressor, LZ4_COMPRESSOR, get_compressor


class RepoObj:
    meta_len_hdr = Struct("<H")  # 16bit unsigned int

    @classmethod
    def extract_crypted_data(cls, data: bytes) -> bytes:
        # used for crypto type detection
        offs = cls.meta_len_hdr.size
        meta_len = cls.meta_len_hdr.unpack(data[:offs])[0]
        return data[offs + meta_len :]

    def __init__(self, key):
        self.key = key
        # Some commands write new chunks (e.g. rename) but don't take a --compression argument. This duplicates
        # the default used by those commands who do take a --compression argument.
        self.compressor = LZ4_COMPRESSOR

    def id_hash(self, data: bytes) -> bytes:
        return self.key.id_hash(data)

    def format(
        self,
        id: bytes,
        meta: dict,
        data: bytes,
        compress: bool = True,
        size: int = None,
        ctype: int = None,
        clevel: int = None,
    ) -> bytes:
        assert isinstance(id, bytes)
        assert isinstance(meta, dict)
        meta = dict(meta)  # make a copy, so call arg is not modified
        assert isinstance(data, (bytes, memoryview))
        assert compress or size is not None and ctype is not None and clevel is not None
        if compress:
            assert size is None or size == len(data)
            meta, data_compressed = self.compressor.compress(meta, data)
        else:
            assert isinstance(size, int)
            meta["size"] = size
            assert isinstance(ctype, int)
            meta["ctype"] = ctype
            assert isinstance(clevel, int)
            meta["clevel"] = clevel
            data_compressed = data  # is already compressed, is NOT prefixed by type/level bytes
        meta["csize"] = len(data_compressed)
        data_encrypted = self.key.encrypt(id, data_compressed)
        meta_packed = msgpack.packb(meta)
        meta_encrypted = self.key.encrypt(id, meta_packed)
        hdr = self.meta_len_hdr.pack(len(meta_encrypted))
        return hdr + meta_encrypted + data_encrypted

    def parse_meta(self, id: bytes, cdata: bytes) -> dict:
        # when calling parse_meta, enough cdata needs to be supplied to completely contain the
        # meta_len_hdr and the encrypted, packed metadata. it is allowed to provide more cdata.
        assert isinstance(id, bytes)
        assert isinstance(cdata, bytes)
        obj = memoryview(cdata)
        offs = self.meta_len_hdr.size
        hdr = obj[:offs]
        len_meta_encrypted = self.meta_len_hdr.unpack(hdr)[0]
        assert offs + len_meta_encrypted <= len(obj)
        meta_encrypted = obj[offs : offs + len_meta_encrypted]
        meta_packed = self.key.decrypt(id, meta_encrypted)
        meta = msgpack.unpackb(meta_packed)
        return meta

    def parse(self, id: bytes, cdata: bytes, decompress: bool = True) -> tuple[dict, bytes]:
        assert isinstance(id, bytes)
        assert isinstance(cdata, bytes)
        obj = memoryview(cdata)
        offs = self.meta_len_hdr.size
        hdr = obj[:offs]
        len_meta_encrypted = self.meta_len_hdr.unpack(hdr)[0]
        assert offs + len_meta_encrypted <= len(obj)
        meta_encrypted = obj[offs : offs + len_meta_encrypted]
        offs += len_meta_encrypted
        meta_packed = self.key.decrypt(id, meta_encrypted)
        meta = msgpack.unpackb(meta_packed)
        data_encrypted = obj[offs:]
        data_compressed = self.key.decrypt(id, data_encrypted)
        if decompress:
            ctype = meta["ctype"]
            clevel = meta["clevel"]
            csize = meta["csize"]  # always the overall size
            assert csize == len(data_compressed)
            psize = meta.get("psize", csize)  # obfuscation: psize (payload size) is potentially less than csize.
            assert psize <= csize
            compr_hdr = bytes((ctype, clevel))
            compressor_cls, compression_level = Compressor.detect(compr_hdr)
            compressor = compressor_cls(level=compression_level)
            meta, data = compressor.decompress(meta, data_compressed[:psize])
            self.key.assert_id(id, data)
        else:
            data = data_compressed  # does not include the type/level bytes
        return meta, data
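Putting the new class to work (a sketch; assumes a constructed ``RepoObj`` instance ``repo_objs`` and a chunk ``id`` as in the code above). The wire layout produced by ``format()`` is ``[16-bit meta length][encrypted packed meta][encrypted compressed data]``::

    cdata = repo_objs.format(id, {}, b"plaintext data")
    meta = repo_objs.parse_meta(id, cdata)   # cheap: decrypts only the metadata part
    meta, data = repo_objs.parse(id, cdata)  # full: also decrypts and decompresses data
    assert data == b"plaintext data"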

class RepoObj1:  # legacy
    @classmethod
    def extract_crypted_data(cls, data: bytes) -> bytes:
        # used for crypto type detection
        return data

    def __init__(self, key):
        self.key = key
        self.compressor = get_compressor("lz4", legacy_mode=True)

    def id_hash(self, data: bytes) -> bytes:
        return self.key.id_hash(data)

    def format(self, id: bytes, meta: dict, data: bytes, compress: bool = True, size: int = None) -> bytes:
        assert isinstance(id, bytes)
        assert meta == {}
        assert isinstance(data, (bytes, memoryview))
        assert compress or size is not None
        if compress:
            assert size is None
            meta, data_compressed = self.compressor.compress(meta, data)
        else:
            assert isinstance(size, int)
            data_compressed = data  # is already compressed, must include type/level bytes
        data_encrypted = self.key.encrypt(id, data_compressed)
        return data_encrypted

    def parse(self, id: bytes, cdata: bytes, decompress: bool = True) -> tuple[dict, bytes]:
        assert isinstance(id, bytes)
        assert isinstance(cdata, bytes)
        data_compressed = self.key.decrypt(id, cdata)
        compressor_cls, compression_level = Compressor.detect(data_compressed[:2])
        compressor = compressor_cls(level=compression_level, legacy_mode=True)
        if decompress:
            meta, data = compressor.decompress(None, data_compressed)
            self.key.assert_id(id, data)
        else:
            meta = {}
            meta["ctype"] = compressor.ID
            meta["clevel"] = compressor.level
            data = data_compressed
        meta["csize"] = len(data_compressed)
        return meta, data

@@ -25,6 +25,7 @@ from .locking import Lock, LockError, LockErrorT
from .logger import create_logger
from .manifest import Manifest
from .platform import SaveFile, SyncFile, sync_dir, safe_fadvise
from .repoobj import RepoObj
from .checksums import crc32, StreamingXXH64
from .crypto.file_integrity import IntegrityCheckedFile, FileIntegrityError

@@ -830,7 +831,7 @@ class Repository:
                freeable_ratio * 100.0,
                freeable_space,
            )
            for tag, key, offset, data in self.io.iter_objects(segment, include_data=True):
            for tag, key, offset, _, data in self.io.iter_objects(segment):
                if tag == TAG_COMMIT:
                    continue
                in_index = self.index.get(key)

@@ -961,7 +962,7 @@ class Repository:
    def _update_index(self, segment, objects, report=None):
        """some code shared between replay_segments and check"""
        self.segments[segment] = 0
        for tag, key, offset, size in objects:
        for tag, key, offset, size, _ in objects:
            if tag in (TAG_PUT2, TAG_PUT):
                try:
                    # If this PUT supersedes an older PUT, mark the old segment for compaction and count the free space

@@ -1011,7 +1012,7 @@ class Repository:
            return

        self.compact[segment] = 0
        for tag, key, offset, size in self.io.iter_objects(segment, read_data=False):
        for tag, key, offset, size, _ in self.io.iter_objects(segment, read_data=False):
            if tag in (TAG_PUT2, TAG_PUT):
                in_index = self.index.get(key)
                if not in_index or (in_index.segment, in_index.offset) != (segment, offset):

@@ -1165,8 +1166,8 @@ class Repository:
            if segment is not None and current_segment > segment:
                break
            try:
                for tag, key, current_offset, data in self.io.iter_objects(
                    segment=current_segment, offset=offset or 0, include_data=True
                for tag, key, current_offset, _, data in self.io.iter_objects(
                    segment=current_segment, offset=offset or 0
                ):
                    if offset is not None and current_offset > offset:
                        break

@@ -1229,10 +1230,10 @@ class Repository:
        start_segment, start_offset, _ = (0, 0, 0) if at_start else self.index[marker]
        result = []
        for segment, filename in self.io.segment_iterator(start_segment):
            obj_iterator = self.io.iter_objects(segment, start_offset, read_data=False, include_data=False)
            obj_iterator = self.io.iter_objects(segment, start_offset, read_data=False)
            while True:
                try:
                    tag, id, offset, size = next(obj_iterator)
                    tag, id, offset, size, _ = next(obj_iterator)
                except (StopIteration, IntegrityError):
                    # either end-of-segment or an error - we can not seek to objects at
                    # higher offsets than one that has an error in the header fields.

@@ -1268,18 +1269,18 @@ class Repository:
    def flags_many(self, ids, mask=0xFFFFFFFF, value=None):
        return [self.flags(id_, mask, value) for id_ in ids]

    def get(self, id):
    def get(self, id, read_data=True):
        if not self.index:
            self.index = self.open_index(self.get_transaction_id())
        try:
            in_index = NSIndexEntry(*((self.index[id] + (None,))[:3]))  # legacy: index entries have no size element
            return self.io.read(in_index.segment, in_index.offset, id, expected_size=in_index.size)
            return self.io.read(in_index.segment, in_index.offset, id, expected_size=in_index.size, read_data=read_data)
        except KeyError:
            raise self.ObjectNotFound(id, self.path) from None

    def get_many(self, ids, is_preloaded=False):
    def get_many(self, ids, read_data=True, is_preloaded=False):
        for id_ in ids:
            yield self.get(id_)
            yield self.get(id_, read_data=read_data)

    def put(self, id, data, wait=True):
        """put a repo object

@@ -1458,7 +1459,7 @@ class LoggedIO:
        seen_commit = False
        while True:
            try:
                tag, key, offset, _ = next(iterator)
                tag, key, offset, _, _ = next(iterator)
            except IntegrityError:
                return False
            except StopIteration:

@@ -1560,15 +1561,13 @@ class LoggedIO:
        fd.seek(0)
        return fd.read(MAGIC_LEN)

    def iter_objects(self, segment, offset=0, include_data=False, read_data=True):
    def iter_objects(self, segment, offset=0, read_data=True):
        """
        Return object iterator for *segment*.

        If read_data is False then include_data must be False as well.

        See the _read() docstring about confidence in the returned data.

        The iterator returns four-tuples of (tag, key, offset, data|size).
        The iterator returns five-tuples of (tag, key, offset, size, data).
        """
        fd = self.get_fd(segment)
        fd.seek(offset)

@@ -1584,10 +1583,9 @@ class LoggedIO:
            size, tag, key, data = self._read(
                fd, header, segment, offset, (TAG_PUT2, TAG_DELETE, TAG_COMMIT, TAG_PUT), read_data=read_data
            )
            if include_data:
                yield tag, key, offset, data
            else:
                yield tag, key, offset, size - header_size(tag)  # corresponds to len(data)
            # tuple[3]: corresponds to len(data) == length of the full chunk payload (meta_len+enc_meta+enc_data)
            # tuple[4]: data will be None if read_data is False.
            yield tag, key, offset, size - header_size(tag), data
            assert size >= 0
            offset += size
            # we must get the fd via get_fd() here again as we yielded to our caller and it might
@ -1656,10 +1654,9 @@ class LoggedIO:
|
|||
h.update(d)
|
||||
return h.digest()
|
||||
|
||||
def read(self, segment, offset, id, read_data=True, *, expected_size=None):
|
||||
def read(self, segment, offset, id, *, read_data=True, expected_size=None):
|
||||
"""
|
||||
Read entry from *segment* at *offset* with *id*.
|
||||
If read_data is False the size of the entry is returned instead.
|
||||
|
||||
See the _read() docstring about confidence in the returned data.
|
||||
"""
|
||||
|
@ -1668,7 +1665,7 @@ class LoggedIO:
|
|||
fd = self.get_fd(segment)
|
||||
fd.seek(offset)
|
||||
header = fd.read(self.header_fmt.size)
|
||||
size, tag, key, data = self._read(fd, header, segment, offset, (TAG_PUT2, TAG_PUT), read_data)
|
||||
size, tag, key, data = self._read(fd, header, segment, offset, (TAG_PUT2, TAG_PUT), read_data=read_data)
|
||||
if id != key:
|
||||
raise IntegrityError(
|
||||
"Invalid segment entry header, is not for wanted id [segment {}, offset {}]".format(segment, offset)
|
||||
|
@ -1678,7 +1675,7 @@ class LoggedIO:
|
|||
raise IntegrityError(
|
||||
f"size from repository index: {expected_size} != " f"size from entry header: {data_size_from_header}"
|
||||
)
|
||||
return data if read_data else data_size_from_header
|
||||
return data
|
||||
|
||||
def _read(self, fd, header, segment, offset, acceptable_tags, read_data=True):
|
||||
"""
|
||||
|
@ -1689,6 +1686,11 @@ class LoggedIO:
|
|||
PUT2 tags, read_data == False: crc32 check (header)
|
||||
PUT tags, read_data == True: crc32 check (header+data)
|
||||
PUT tags, read_data == False: crc32 check can not be done, all data obtained must be considered informational
|
||||
|
||||
read_data == False behaviour:
|
||||
PUT2 tags: return enough of the chunk so that the client is able to decrypt the metadata,
|
||||
do not read, but just seek over the data.
|
||||
PUT tags: return None and just seek over the data.
|
||||
"""

def check_crc32(wanted, header, *data):

@@ -1749,7 +1751,31 @@ class LoggedIO:
f"expected {self.ENTRY_HASH_SIZE}, got {len(entry_hash)} bytes"
)
check_crc32(crc, header, key, entry_hash)
if not read_data: # seek over data
if not read_data:
if tag == TAG_PUT2:
# PUT2 is only used in new repos and they also have different RepoObj layout,
# supporting separately encrypted metadata and data.
# In this case, we return enough bytes so the client can decrypt the metadata
# and seek over the rest (over the encrypted data).
meta_len_size = RepoObj.meta_len_hdr.size
meta_len = fd.read(meta_len_size)
length -= meta_len_size
if len(meta_len) != meta_len_size:
raise IntegrityError(
f"Segment entry meta length short read [segment {segment}, offset {offset}]: "
f"expected {meta_len_size}, got {len(meta_len)} bytes"
)
ml = RepoObj.meta_len_hdr.unpack(meta_len)[0]
meta = fd.read(ml)
length -= ml
if len(meta) != ml:
raise IntegrityError(
f"Segment entry meta short read [segment {segment}, offset {offset}]: "
f"expected {ml}, got {len(meta)} bytes"
)
data = meta_len + meta # shortened chunk - enough so the client can decrypt the metadata
# we do not have a checksum for this data, but the client's AEAD crypto will check it.
# in any case, we seek over the remainder of the chunk
oldpos = fd.tell()
seeked = fd.seek(length, os.SEEK_CUR) - oldpos
if seeked != length:
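
To make the PUT2 payload layout concrete: it is ``meta_len (16-bit unsigned) + encrypted metadata + encrypted data``. A minimal sketch of splitting such a raw payload, assuming a little-endian 16-bit length prefix (the byte order of ``RepoObj.meta_len_hdr`` is an assumption here)::

    import struct

    meta_len_hdr = struct.Struct("<H")  # assumption: matches RepoObj.meta_len_hdr

    def split_put2_payload(chunk: bytes):
        """Split a raw PUT2 payload into (encrypted_meta, encrypted_data)."""
        hdr_size = meta_len_hdr.size
        (ml,) = meta_len_hdr.unpack(chunk[:hdr_size])
        return chunk[hdr_size : hdr_size + ml], chunk[hdr_size + ml :]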

@@ -101,17 +101,17 @@ class MockCache:
self.objects = {}
self.repository = self.MockRepo()

def add_chunk(self, id, chunk, stats=None, wait=True):
self.objects[id] = chunk
return id, len(chunk)
def add_chunk(self, id, meta, data, stats=None, wait=True):
self.objects[id] = data
return id, len(data)


class ArchiveTimestampTestCase(BaseTestCase):
def _test_timestamp_parsing(self, isoformat, expected):
repository = Mock()
key = PlaintextKey(repository)
manifest = Manifest(repository, key)
a = Archive(repository, key, manifest, "test", create=True)
manifest = Manifest(key, repository)
a = Archive(manifest, "test", create=True)
a.metadata = ArchiveItem(time=isoformat)
self.assert_equal(a.ts, expected)
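
The cache API now takes metadata and data separately. A sketch of the call-site migration, using the names from these tests::

    # before: one data blob, stats positional
    cache.add_chunk(H(1), b"5678", Statistics())
    # after: a meta dict goes between id and data, stats is a keyword argument
    cache.add_chunk(H(1), {}, b"5678", stats=Statistics())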

@@ -314,8 +314,8 @@ class ArchiverTestCaseBase(BaseTestCase):
def open_archive(self, name):
repository = Repository(self.repository_path, exclusive=True)
with repository:
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
archive = Archive(repository, key, manifest, name)
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
archive = Archive(manifest, name)
return archive, repository

def open_repository(self):
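
This pattern recurs throughout the test suite: Manifest.load() now returns just the manifest, and the key and repo-object helpers hang off it. A hedged sketch of the migration (``manifest.repo_objs`` appears elsewhere in this diff; treat ``manifest.key`` as an assumption)::

    # before:
    manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
    # after:
    manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
    key = manifest.key              # assumption: the key is reachable via the manifest
    repo_objs = manifest.repo_objs  # used below, e.g. manifest.repo_objs.format(...)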

@@ -1660,7 +1660,7 @@ class ArchiverTestCase(ArchiverTestCaseBase):
self.cmd(f"--repo={self.repository_location}", "extract", "test.4", "--dry-run")
# Make sure both archives have been renamed
with Repository(self.repository_path) as repository:
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
self.assert_equal(len(manifest.archives), 2)
self.assert_in("test.3", manifest.archives)
self.assert_in("test.4", manifest.archives)

@@ -1784,8 +1784,8 @@ class ArchiverTestCase(ArchiverTestCaseBase):
self.cmd(f"--repo={self.repository_location}", "rcreate", "--encryption=none")
self.create_src_archive("test")
with Repository(self.repository_path, exclusive=True) as repository:
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
archive = Archive(repository, key, manifest, "test")
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
archive = Archive(manifest, "test")
for item in archive.iter_items():
if item.path.endswith("testsuite/archiver.py"):
repository.delete(item.chunks[-1].id)

@@ -1803,8 +1803,8 @@ class ArchiverTestCase(ArchiverTestCaseBase):
self.cmd(f"--repo={self.repository_location}", "rcreate", "--encryption=none")
self.create_src_archive("test")
with Repository(self.repository_path, exclusive=True) as repository:
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
archive = Archive(repository, key, manifest, "test")
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
archive = Archive(manifest, "test")
id = archive.metadata.items[0]
repository.put(id, b"corrupted items metadata stream chunk")
repository.commit(compact=False)

@@ -1952,12 +1952,12 @@ class ArchiverTestCase(ArchiverTestCaseBase):
self.cmd(f"--repo={self.repository_location}", "create", "--dry-run", "test", "input")
# Make sure no archive has been created
with Repository(self.repository_path) as repository:
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
self.assert_equal(len(manifest.archives), 0)

def add_unknown_feature(self, operation):
with Repository(self.repository_path, exclusive=True) as repository:
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
manifest.config["feature_flags"] = {operation.value: {"mandatory": ["unknown-feature"]}}
manifest.write()
repository.commit(compact=False)

@@ -2034,8 +2034,8 @@ class ArchiverTestCase(ArchiverTestCaseBase):
with Repository(self.repository_path, exclusive=True) as repository:
if path_prefix:
repository._location = Location(self.repository_location)
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
with Cache(repository, key, manifest) as cache:
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
with Cache(repository, manifest) as cache:
cache.begin_txn()
cache.cache_config.mandatory_features = {"unknown-feature"}
cache.commit()

@@ -2059,8 +2059,8 @@ class ArchiverTestCase(ArchiverTestCaseBase):
with Repository(self.repository_path, exclusive=True) as repository:
if path_prefix:
repository._location = Location(self.repository_location)
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
with Cache(repository, key, manifest) as cache:
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
with Cache(repository, manifest) as cache:
assert cache.cache_config.mandatory_features == set()

def test_progress_on(self):
@@ -3060,11 +3060,11 @@ class ArchiverTestCase(ArchiverTestCaseBase):
self.cmd(f"--repo={self.repository_location}", "check")
# Then check that the cache on disk matches exactly what's in the repo.
with self.open_repository() as repository:
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
with Cache(repository, key, manifest, sync=False) as cache:
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
with Cache(repository, manifest, sync=False) as cache:
original_chunks = cache.chunks
Cache.destroy(repository)
with Cache(repository, key, manifest) as cache:
with Cache(repository, manifest) as cache:
correct_chunks = cache.chunks
assert original_chunks is not correct_chunks
seen = set()

@@ -3080,8 +3080,8 @@ class ArchiverTestCase(ArchiverTestCaseBase):
self.cmd(f"--repo={self.repository_location}", "rcreate", RK_ENCRYPTION)
self.cmd(f"--repo={self.repository_location}", "create", "test", "input")
with self.open_repository() as repository:
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
with Cache(repository, key, manifest, sync=False) as cache:
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
with Cache(repository, manifest, sync=False) as cache:
cache.begin_txn()
cache.chunks.incref(list(cache.chunks.iteritems())[0][0])
cache.commit()

@@ -3765,6 +3765,32 @@ id: 2 / e29442 3506da 4e1ea7 / 25f62a 5a3d41 - 02
key = msgpack.unpackb(a2b_base64(repository.load_key()))
assert key["algorithm"] == "argon2 chacha20-poly1305"

def test_transfer(self):
def check_repo(repo_option):
listing = self.cmd(repo_option, "rlist", "--short")
assert "arch1" in listing
assert "arch2" in listing
listing = self.cmd(repo_option, "list", "--short", "arch1")
assert "file1" in listing
assert "dir2/file2" in listing
self.cmd(repo_option, "check")

self.create_test_files()
repo1 = f"--repo={self.repository_location}1"
repo2 = f"--repo={self.repository_location}2"
other_repo1 = f"--other-repo={self.repository_location}1"

self.cmd(repo1, "rcreate", RK_ENCRYPTION)
self.cmd(repo1, "create", "arch1", "input")
self.cmd(repo1, "create", "arch2", "input")
check_repo(repo1)

self.cmd(repo2, "rcreate", RK_ENCRYPTION, other_repo1)
self.cmd(repo2, "transfer", other_repo1, "--dry-run")
self.cmd(repo2, "transfer", other_repo1)
self.cmd(repo2, "transfer", other_repo1, "--dry-run")
check_repo(repo2)


@unittest.skipUnless("binary" in BORG_EXES, "no borg.exe available")
class ArchiverTestCaseBinary(ArchiverTestCase):

@@ -3966,7 +3992,8 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):

def test_manifest_rebuild_duplicate_archive(self):
archive, repository = self.open_archive("archive1")
key = archive.key
repo_objs = archive.repo_objs

with repository:
manifest = repository.get(Manifest.MANIFEST_ID)
corrupted_manifest = manifest + b"corrupted!"

@@ -3983,8 +4010,8 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
"version": 2,
}
)
archive_id = key.id_hash(archive)
repository.put(archive_id, key.encrypt(archive_id, archive))
archive_id = repo_objs.id_hash(archive)
repository.put(archive_id, repo_objs.format(archive_id, {}, archive))
repository.commit(compact=False)
self.cmd(f"--repo={self.repository_location}", "check", exit_code=1)
self.cmd(f"--repo={self.repository_location}", "check", "--repair", exit_code=0)

@@ -4013,7 +4040,8 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
for item in archive.iter_items():
if item.path.endswith("testsuite/archiver.py"):
chunk = item.chunks[-1]
data = repository.get(chunk.id) + b"1234"
data = repository.get(chunk.id)
data = data[0:100] + b"x" + data[101:]
repository.put(chunk.id, data)
break
repository.commit(compact=False)

@@ -4042,45 +4070,43 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
class ManifestAuthenticationTest(ArchiverTestCaseBase):
def spoof_manifest(self, repository):
with repository:
_, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
repository.put(
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
cdata = manifest.repo_objs.format(
Manifest.MANIFEST_ID,
key.encrypt(
Manifest.MANIFEST_ID,
msgpack.packb(
{
"version": 1,
"archives": {},
"config": {},
"timestamp": (datetime.now(tz=timezone.utc) + timedelta(days=1)).isoformat(
timespec="microseconds"
),
}
),
{},
msgpack.packb(
{
"version": 1,
"archives": {},
"config": {},
"timestamp": (datetime.now(tz=timezone.utc) + timedelta(days=1)).isoformat(
timespec="microseconds"
),
}
),
)
repository.put(Manifest.MANIFEST_ID, cdata)
repository.commit(compact=False)

def test_fresh_init_tam_required(self):
self.cmd(f"--repo={self.repository_location}", "rcreate", RK_ENCRYPTION)
repository = Repository(self.repository_path, exclusive=True)
with repository:
manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
repository.put(
manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
cdata = manifest.repo_objs.format(
Manifest.MANIFEST_ID,
key.encrypt(
Manifest.MANIFEST_ID,
msgpack.packb(
{
"version": 1,
"archives": {},
"timestamp": (datetime.now(tz=timezone.utc) + timedelta(days=1)).isoformat(
timespec="microseconds"
),
}
),
{},
msgpack.packb(
{
"version": 1,
"archives": {},
"timestamp": (datetime.now(tz=timezone.utc) + timedelta(days=1)).isoformat(
timespec="microseconds"
),
}
),
)
repository.put(Manifest.MANIFEST_ID, cdata)
repository.commit(compact=False)

with pytest.raises(TAMRequiredError):
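
The spoofing helpers above show the new write path in miniature: objects are no longer produced via ``key.encrypt()`` but through the RepoObj layer. A condensed sketch with names taken from this diff::

    manifest = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
    payload = msgpack.packb({"version": 1, "archives": {}, "config": {}})
    cdata = manifest.repo_objs.format(Manifest.MANIFEST_ID, {}, payload)  # (id, meta, data)
    repository.put(Manifest.MANIFEST_ID, cdata)
    repository.commit(compact=False)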

@@ -9,7 +9,6 @@ from .hashindex import H
from .key import TestKey
from ..archive import Statistics
from ..cache import AdHocCache
from ..compress import CompressionSpec
from ..crypto.key import AESOCBRepoKey
from ..hashindex import ChunkIndex, CacheSynchronizer
from ..manifest import Manifest

@@ -167,17 +166,16 @@ class TestAdHocCache:
def key(self, repository, monkeypatch):
monkeypatch.setenv("BORG_PASSPHRASE", "test")
key = AESOCBRepoKey.create(repository, TestKey.MockArgs())
key.compressor = CompressionSpec("none").compressor
return key

@pytest.fixture
def manifest(self, repository, key):
Manifest(key, repository).write()
return Manifest.load(repository, key=key, operations=Manifest.NO_OPERATION_CHECK)[0]
return Manifest.load(repository, key=key, operations=Manifest.NO_OPERATION_CHECK)

@pytest.fixture
def cache(self, repository, key, manifest):
return AdHocCache(repository, key, manifest)
return AdHocCache(manifest)

def test_does_not_contain_manifest(self, cache):
assert not cache.seen_chunk(Manifest.MANIFEST_ID)

@@ -189,14 +187,14 @@ class TestAdHocCache:

def test_does_not_overwrite(self, cache):
with pytest.raises(AssertionError):
cache.add_chunk(H(1), b"5678", Statistics(), overwrite=True)
cache.add_chunk(H(1), {}, b"5678", stats=Statistics(), overwrite=True)

def test_seen_chunk_add_chunk_size(self, cache):
assert cache.add_chunk(H(1), b"5678", Statistics()) == (H(1), 4)
assert cache.add_chunk(H(1), {}, b"5678", stats=Statistics()) == (H(1), 4)

def test_deletes_chunks_during_lifetime(self, cache, repository):
"""E.g. checkpoint archives"""
cache.add_chunk(H(5), b"1010", Statistics())
cache.add_chunk(H(5), {}, b"1010", stats=Statistics())
assert cache.seen_chunk(H(5)) == 1
cache.chunk_decref(H(5), Statistics())
assert not cache.seen_chunk(H(5))

@@ -218,10 +216,10 @@ class TestAdHocCache:
assert not hasattr(cache, "chunks")

def test_incref_after_add_chunk(self, cache):
assert cache.add_chunk(H(3), b"5678", Statistics()) == (H(3), 4)
assert cache.add_chunk(H(3), {}, b"5678", stats=Statistics()) == (H(3), 4)
assert cache.chunk_incref(H(3), Statistics()) == (H(3), 4)

def test_existing_incref_after_add_chunk(self, cache):
"""This case occurs with part files, see Archive.chunk_file."""
assert cache.add_chunk(H(1), b"5678", Statistics()) == (H(1), 4)
assert cache.add_chunk(H(1), {}, b"5678", stats=Statistics()) == (H(1), 4)
assert cache.chunk_incref(H(1), Statistics()) == (H(1), 4)

@@ -29,19 +29,19 @@ def test_get_compressor():

def test_cnull():
c = get_compressor(name="none")
cdata = c.compress(data)
assert len(cdata) > len(data)
meta, cdata = c.compress({}, data)
assert len(cdata) >= len(data)
assert data in cdata # it's not compressed and just in there 1:1
assert data == c.decompress(cdata)
assert data == Compressor(**params).decompress(cdata) # autodetect
assert data == c.decompress(meta, cdata)[1]
assert data == Compressor(**params).decompress(meta, cdata)[1] # autodetect


def test_lz4():
c = get_compressor(name="lz4")
cdata = c.compress(data)
meta, cdata = c.compress({}, data)
assert len(cdata) < len(data)
assert data == c.decompress(cdata)
assert data == Compressor(**params).decompress(cdata) # autodetect
assert data == c.decompress(meta, cdata)[1]
assert data == Compressor(**params).decompress(meta, cdata)[1] # autodetect


def test_lz4_buffer_allocation(monkeypatch):

@@ -51,56 +51,56 @@ def test_lz4_buffer_allocation(monkeypatch):
data = os.urandom(5 * 2**20) * 10 # 50MiB badly compressible data
assert len(data) == 50 * 2**20
c = Compressor("lz4")
cdata = c.compress(data)
assert len(cdata) > len(data)
assert data == c.decompress(cdata)
meta, cdata = c.compress({}, data)
assert len(cdata) >= len(data)
assert data == c.decompress(meta, cdata)[1]


def test_zlib():
c = get_compressor(name="zlib")
cdata = c.compress(data)
meta, cdata = c.compress({}, data)
assert len(cdata) < len(data)
assert data == c.decompress(cdata)
assert data == Compressor(**params).decompress(cdata) # autodetect
assert data == c.decompress(meta, cdata)[1]
assert data == Compressor(**params).decompress(meta, cdata)[1] # autodetect


def test_lzma():
if lzma is None:
pytest.skip("No lzma support found.")
c = get_compressor(name="lzma")
cdata = c.compress(data)
meta, cdata = c.compress({}, data)
assert len(cdata) < len(data)
assert data == c.decompress(cdata)
assert data == Compressor(**params).decompress(cdata) # autodetect
assert data == c.decompress(meta, cdata)[1]
assert data == Compressor(**params).decompress(meta, cdata)[1] # autodetect


def test_zstd():
c = get_compressor(name="zstd")
cdata = c.compress(data)
meta, cdata = c.compress({}, data)
assert len(cdata) < len(data)
assert data == c.decompress(cdata)
assert data == Compressor(**params).decompress(cdata) # autodetect
assert data == c.decompress(meta, cdata)[1]
assert data == Compressor(**params).decompress(meta, cdata)[1] # autodetect


def test_autodetect_invalid():
with pytest.raises(ValueError):
Compressor(**params).decompress(b"\xff\xfftotalcrap")
Compressor(**params, legacy_mode=True).decompress({}, b"\xff\xfftotalcrap")
with pytest.raises(ValueError):
Compressor(**params).decompress(b"\x08\x00notreallyzlib")
Compressor(**params, legacy_mode=True).decompress({}, b"\x08\x00notreallyzlib")


def test_zlib_legacy_compat():
# for compatibility reasons, we do not add an extra header for zlib,
# nor do we expect one when decompressing / autodetecting
for level in range(10):
c = get_compressor(name="zlib_legacy", level=level)
cdata1 = c.compress(data)
c = get_compressor(name="zlib_legacy", level=level, legacy_mode=True)
meta1, cdata1 = c.compress({}, data)
cdata2 = zlib.compress(data, level)
assert cdata1 == cdata2
data2 = c.decompress(cdata2)
assert data == data2
data2 = Compressor(**params).decompress(cdata2)
meta2, data2 = c.decompress({}, cdata2)
assert data == data2
# _, data2 = Compressor(**params).decompress({}, cdata2)
# assert data == data2


def test_compressor():

@@ -122,7 +122,17 @@ def test_compressor():
]
for params in params_list:
c = Compressor(**params)
assert data == c.decompress(c.compress(data))
meta_c, data_compressed = c.compress({}, data)
assert "ctype" in meta_c
assert "clevel" in meta_c
assert meta_c["csize"] == len(data_compressed)
assert meta_c["size"] == len(data)
meta_d, data_decompressed = c.decompress(meta_c, data_compressed)
assert data == data_decompressed
assert "ctype" in meta_d
assert "clevel" in meta_d
assert meta_d["csize"] == len(data_compressed)
assert meta_d["size"] == len(data)


def test_auto():
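
A minimal sketch of the new metadata-carrying compressor API exercised above (assuming a borg2 checkout on the import path)::

    from borg.compress import CompressionSpec

    c = CompressionSpec("zstd,3").compressor
    meta, cdata = c.compress({}, b"hello " * 1000)
    # meta now records the compression type/level and both sizes:
    assert meta["csize"] == len(cdata)
    assert meta["size"] == len(b"hello " * 1000)
    meta2, data = c.decompress(meta, cdata)
    assert data == b"hello " * 1000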

@@ -130,72 +140,89 @@ def test_auto():
compressor_lz4 = CompressionSpec("lz4").compressor
compressor_zlib = CompressionSpec("zlib,9").compressor
data = bytes(500)
compressed_auto_zlib = compressor_auto_zlib.compress(data)
compressed_lz4 = compressor_lz4.compress(data)
compressed_zlib = compressor_zlib.compress(data)
meta, compressed_auto_zlib = compressor_auto_zlib.compress({}, data)
_, compressed_lz4 = compressor_lz4.compress({}, data)
_, compressed_zlib = compressor_zlib.compress({}, data)
ratio = len(compressed_zlib) / len(compressed_lz4)
assert Compressor.detect(compressed_auto_zlib)[0] == ZLIB if ratio < 0.99 else LZ4
assert meta["ctype"] == ZLIB.ID if ratio < 0.99 else LZ4.ID
assert meta["clevel"] == 9 if ratio < 0.99 else 255
assert meta["csize"] == len(compressed_auto_zlib)

data = b"\x00\xb8\xa3\xa2-O\xe1i\xb6\x12\x03\xc21\xf3\x8a\xf78\\\x01\xa5b\x07\x95\xbeE\xf8\xa3\x9ahm\xb1~"
compressed = compressor_auto_zlib.compress(data)
assert Compressor.detect(compressed)[0] == CNONE
meta, compressed = compressor_auto_zlib.compress(dict(meta), data)
assert meta["ctype"] == CNONE.ID
assert meta["clevel"] == 255
assert meta["csize"] == len(compressed)


def test_obfuscate():
compressor = CompressionSpec("obfuscate,1,none").compressor
data = bytes(10000)
compressed = compressor.compress(data)
# 2 id bytes compression, 2 id bytes obfuscator. 4 length bytes
assert len(data) + 8 <= len(compressed) <= len(data) * 101 + 8
_, compressed = compressor.compress({}, data)
assert len(data) <= len(compressed) <= len(data) * 101
# compressing 100 times the same data should give at least 50 different result sizes
assert len({len(compressor.compress(data)) for i in range(100)}) > 50
assert len({len(compressor.compress({}, data)[1]) for i in range(100)}) > 50

cs = CompressionSpec("obfuscate,2,lz4")
assert isinstance(cs.inner.compressor, LZ4)
compressor = cs.compressor
data = bytes(10000)
compressed = compressor.compress(data)
# 2 id bytes compression, 2 id bytes obfuscator. 4 length bytes
_, compressed = compressor.compress({}, data)
min_compress, max_compress = 0.2, 0.001 # estimate compression factor outer boundaries
assert max_compress * len(data) + 8 <= len(compressed) <= min_compress * len(data) * 1001 + 8
assert max_compress * len(data) <= len(compressed) <= min_compress * len(data) * 1001
# compressing 100 times the same data should give multiple different result sizes
assert len({len(compressor.compress(data)) for i in range(100)}) > 10
assert len({len(compressor.compress({}, data)[1]) for i in range(100)}) > 10

cs = CompressionSpec("obfuscate,6,zstd,3")
assert isinstance(cs.inner.compressor, ZSTD)
compressor = cs.compressor
data = bytes(10000)
compressed = compressor.compress(data)
# 2 id bytes compression, 2 id bytes obfuscator. 4 length bytes
_, compressed = compressor.compress({}, data)
min_compress, max_compress = 0.2, 0.001 # estimate compression factor outer boundaries
assert max_compress * len(data) + 8 <= len(compressed) <= min_compress * len(data) * 10000001 + 8
assert max_compress * len(data) <= len(compressed) <= min_compress * len(data) * 10000001
# compressing 100 times the same data should give multiple different result sizes
assert len({len(compressor.compress(data)) for i in range(100)}) > 90
assert len({len(compressor.compress({}, data)[1]) for i in range(100)}) > 90

cs = CompressionSpec("obfuscate,2,auto,zstd,10")
assert isinstance(cs.inner.compressor, Auto)
compressor = cs.compressor
data = bytes(10000)
compressed = compressor.compress(data)
# 2 id bytes compression, 2 id bytes obfuscator. 4 length bytes
_, compressed = compressor.compress({}, data)
min_compress, max_compress = 0.2, 0.001 # estimate compression factor outer boundaries
assert max_compress * len(data) + 8 <= len(compressed) <= min_compress * len(data) * 1001 + 8
assert max_compress * len(data) <= len(compressed) <= min_compress * len(data) * 1001
# compressing 100 times the same data should give multiple different result sizes
assert len({len(compressor.compress(data)) for i in range(100)}) > 10
assert len({len(compressor.compress({}, data)[1]) for i in range(100)}) > 10

cs = CompressionSpec("obfuscate,110,none")
assert isinstance(cs.inner.compressor, CNONE)
compressor = cs.compressor
data = bytes(1000)
compressed = compressor.compress(data)
# N blocks + 2 id bytes obfuscator. 4 length bytes
# The 'none' compressor also adds 2 id bytes
assert 6 + 2 + 1000 <= len(compressed) <= 6 + 2 + 1000 + 1024
_, compressed = compressor.compress({}, data)
assert 1000 <= len(compressed) <= 1000 + 1024
data = bytes(1100)
compressed = compressor.compress(data)
# N blocks + 2 id bytes obfuscator. 4 length bytes
# The 'none' compressor also adds 2 id bytes
assert 6 + 2 + 1100 <= len(compressed) <= 6 + 2 + 1100 + 1024
_, compressed = compressor.compress({}, data)
assert 1100 <= len(compressed) <= 1100 + 1024


def test_obfuscate_meta():
compressor = CompressionSpec("obfuscate,3,lz4").compressor
meta_in = {}
data = bytes(10000)
meta_out, compressed = compressor.compress(meta_in, data)
assert "ctype" not in meta_in # do not modify dict of caller
assert "ctype" in meta_out
assert meta_out["ctype"] == LZ4.ID
assert "clevel" in meta_out
assert meta_out["clevel"] == 0xFF
assert "csize" in meta_out
csize = meta_out["csize"]
assert csize == len(compressed) # this is the overall size
assert "psize" in meta_out
psize = meta_out["psize"]
assert 0 < psize < 100
assert csize - psize >= 0 # there is an obfuscation trailer
trailer = compressed[psize:]
assert not trailer or set(trailer) == {0} # trailer is all-zero-bytes


def test_compression_specs():
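
In short: an obfuscated payload is the compressed data (``psize`` bytes) followed by an all-zero trailer, and ``csize`` covers both. A sketch mirroring the assertions above::

    comp = CompressionSpec("obfuscate,3,lz4").compressor
    meta, cdata = comp.compress({}, bytes(10000))
    payload, trailer = cdata[: meta["psize"]], cdata[meta["psize"] :]
    assert meta["csize"] == len(cdata)                      # overall obfuscated size
    assert trailer == bytes(meta["csize"] - meta["psize"])  # zero padding (may be empty)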

@@ -8,6 +8,7 @@ import pytest
from ..crypto.key import bin_to_hex
from ..crypto.key import PlaintextKey, AuthenticatedKey, Blake2AuthenticatedKey
from ..crypto.key import RepoKey, KeyfileKey, Blake2RepoKey, Blake2KeyfileKey
from ..crypto.key import AEADKeyBase
from ..crypto.key import AESOCBRepoKey, AESOCBKeyfileKey, CHPORepoKey, CHPOKeyfileKey
from ..crypto.key import Blake2AESOCBRepoKey, Blake2AESOCBKeyfileKey, Blake2CHPORepoKey, Blake2CHPOKeyfileKey
from ..crypto.key import ID_HMAC_SHA_256, ID_BLAKE2b_256

@@ -42,15 +43,8 @@ class TestKey:
F84MsMMiqpbz4KVICeBZhfAaTPs4W7BC63qml0ZXJhdGlvbnPOAAGGoKRzYWx02gAgLENQ
2uVCoR7EnAoiRzn8J+orbojKtJlNCnQ31SSC8rendmVyc2lvbgE=""".strip()

keyfile2_cdata = unhexlify(
re.sub(
r"\W",
"",
"""
0055f161493fcfc16276e8c31493c4641e1eb19a79d0326fad0291e5a9c98e5933
00000000000003e8d21eaf9b86c297a8cd56432e1915bb
""",
)
keyfile2_cdata = bytes.fromhex(
"003be7d57280d1a42add9f3f36ea363bbc5e9349ad01ddec0634a54dd02959e70500000000000003ec063d2cbcacba6b"
)
keyfile2_id = unhexlify("c3fbf14bc001ebcc3cd86e696c13482ed071740927cd7cbe1b01b4bfcee49314")

@@ -69,7 +63,7 @@ class TestKey:
qkPqtDDxs2j/T7+ndmVyc2lvbgE=""".strip()

keyfile_blake2_cdata = bytes.fromhex(
"04fdf9475cf2323c0ba7a99ddc011064f2e7d039f539f2e448" "0e6f5fc6ff9993d604040404040404098c8cee1c6db8c28947"
"04d6040f5ef80e0a8ac92badcbe3dee83b7a6b53d5c9a58c4eed14964cb10ef591040404040404040d1e65cc1f435027"
)
# Verified against b2sum. Entire string passed to BLAKE2, including the padded 64 byte key contained in
# keyfile_blake2_key_file above is

@@ -224,7 +218,8 @@ class TestKey:
data = bytearray(self.keyfile2_cdata)
id = bytearray(key.id_hash(data)) # corrupt chunk id
id[12] = 0
key.decrypt(id, data)
plaintext = key.decrypt(id, data)
key.assert_id(id, plaintext)

def test_roundtrip(self, key):
repository = key.repository

@@ -237,45 +232,18 @@ class TestKey:
decrypted = loaded_key.decrypt(id, encrypted)
assert decrypted == plaintext

def test_decrypt_decompress(self, key):
plaintext = b"123456789"
id = key.id_hash(plaintext)
encrypted = key.encrypt(id, plaintext)
assert key.decrypt(id, encrypted, decompress=False) != plaintext
assert key.decrypt(id, encrypted) == plaintext

def test_assert_id(self, key):
plaintext = b"123456789"
id = key.id_hash(plaintext)
key.assert_id(id, plaintext)
id_changed = bytearray(id)
id_changed[0] ^= 1
with pytest.raises(IntegrityError):
key.assert_id(id_changed, plaintext)
plaintext_changed = plaintext + b"1"
with pytest.raises(IntegrityError):
key.assert_id(id, plaintext_changed)

def test_getting_wrong_chunk_fails(self, key):
# for the new AEAD crypto, we provide the chunk id as AAD when encrypting/authenticating,
# we provide the id **we want** as AAD when authenticating/decrypting the data we got from the repo.
# only if the id used for encrypting matches the id we want, the AEAD crypto authentication will succeed.
# thus, there is no need any more for calling self._assert_id() for the new crypto.
# the old crypto as well as plaintext and authenticated modes still need to call self._assert_id().
plaintext_wanted = b"123456789"
id_wanted = key.id_hash(plaintext_wanted)
ciphertext_wanted = key.encrypt(id_wanted, plaintext_wanted)
plaintext_other = b"xxxxxxxxx"
id_other = key.id_hash(plaintext_other)
ciphertext_other = key.encrypt(id_other, plaintext_other)
# both ciphertexts are authentic and decrypting them should succeed:
key.decrypt(id_wanted, ciphertext_wanted)
key.decrypt(id_other, ciphertext_other)
# but if we wanted the one and got the other, it must fail.
# the new crypto will fail due to AEAD auth failure,
# the old crypto and plaintext, authenticated modes will fail due to ._assert_id() check failing:
with pytest.raises(IntegrityErrorBase):
key.decrypt(id_wanted, ciphertext_other)
if not isinstance(key, AEADKeyBase):
with pytest.raises(IntegrityError):
key.assert_id(id_changed, plaintext)
plaintext_changed = plaintext + b"1"
with pytest.raises(IntegrityError):
key.assert_id(id, plaintext_changed)

def test_authenticated_encrypt(self, monkeypatch):
monkeypatch.setenv("BORG_PASSPHRASE", "test")
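
The recurring change in these key tests: decrypt() no longer verifies the chunk id internally, so callers pair it with assert_id() where the mode requires it. A sketch of the new call pattern, grounded in the hunks above::

    plaintext = key.decrypt(id, cdata)   # AEAD modes authenticate `id` via AAD here
    if not isinstance(key, AEADKeyBase):
        key.assert_id(id, plaintext)     # legacy/plaintext/authenticated modes still need this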

@@ -285,8 +253,8 @@ class TestKey:
plaintext = b"123456789"
id = key.id_hash(plaintext)
authenticated = key.encrypt(id, plaintext)
# 0x07 is the key TYPE, \x00ff identifies no compression / unknown level.
assert authenticated == b"\x07\x00\xff" + plaintext
# 0x07 is the key TYPE.
assert authenticated == b"\x07" + plaintext

def test_blake2_authenticated_encrypt(self, monkeypatch):
monkeypatch.setenv("BORG_PASSPHRASE", "test")

@@ -296,8 +264,8 @@ class TestKey:
plaintext = b"123456789"
id = key.id_hash(plaintext)
authenticated = key.encrypt(id, plaintext)
# 0x06 is the key TYPE, 0x00ff identifies no compression / unknown level.
assert authenticated == b"\x06\x00\xff" + plaintext
# 0x06 is the key TYPE.
assert authenticated == b"\x06" + plaintext


class TestTAM:

@@ -9,9 +9,10 @@ import pytest
from ..remote import SleepingBandwidthLimiter, RepositoryCache, cache_if_remote
from ..repository import Repository
from ..crypto.key import PlaintextKey
from ..compress import CompressionSpec
from ..helpers import IntegrityError
from ..repoobj import RepoObj
from .hashindex import H
from .repository import fchunk, pdchunk
from .key import TestKey


@@ -74,9 +75,9 @@ class TestRepositoryCache:
def repository(self, tmpdir):
self.repository_location = os.path.join(str(tmpdir), "repository")
with Repository(self.repository_location, exclusive=True, create=True) as repository:
repository.put(H(1), b"1234")
repository.put(H(2), b"5678")
repository.put(H(3), bytes(100))
repository.put(H(1), fchunk(b"1234"))
repository.put(H(2), fchunk(b"5678"))
repository.put(H(3), fchunk(bytes(100)))
yield repository

@pytest.fixture

@@ -85,19 +86,55 @@ class TestRepositoryCache:

def test_simple(self, cache: RepositoryCache):
# Single get()s are not cached, since they are used for unique objects like archives.
assert cache.get(H(1)) == b"1234"
assert pdchunk(cache.get(H(1))) == b"1234"
assert cache.misses == 1
assert cache.hits == 0

assert list(cache.get_many([H(1)])) == [b"1234"]
assert [pdchunk(ch) for ch in cache.get_many([H(1)])] == [b"1234"]
assert cache.misses == 2
assert cache.hits == 0

assert list(cache.get_many([H(1)])) == [b"1234"]
assert [pdchunk(ch) for ch in cache.get_many([H(1)])] == [b"1234"]
assert cache.misses == 2
assert cache.hits == 1

assert cache.get(H(1)) == b"1234"
assert pdchunk(cache.get(H(1))) == b"1234"
assert cache.misses == 2
assert cache.hits == 2

def test_meta(self, cache: RepositoryCache):
# same as test_simple, but not reading the chunk data (metadata only).
# Single get()s are not cached, since they are used for unique objects like archives.
assert pdchunk(cache.get(H(1), read_data=False)) == b""
assert cache.misses == 1
assert cache.hits == 0

assert [pdchunk(ch) for ch in cache.get_many([H(1)], read_data=False)] == [b""]
assert cache.misses == 2
assert cache.hits == 0

assert [pdchunk(ch) for ch in cache.get_many([H(1)], read_data=False)] == [b""]
assert cache.misses == 2
assert cache.hits == 1

assert pdchunk(cache.get(H(1), read_data=False)) == b""
assert cache.misses == 2
assert cache.hits == 2

def test_mixed(self, cache: RepositoryCache):
assert [pdchunk(ch) for ch in cache.get_many([H(1)], read_data=False)] == [b""]
assert cache.misses == 1
assert cache.hits == 0

assert [pdchunk(ch) for ch in cache.get_many([H(1)], read_data=True)] == [b"1234"]
assert cache.misses == 2
assert cache.hits == 0

assert [pdchunk(ch) for ch in cache.get_many([H(1)], read_data=False)] == [b""]
assert cache.misses == 2
assert cache.hits == 1

assert [pdchunk(ch) for ch in cache.get_many([H(1)], read_data=True)] == [b"1234"]
assert cache.misses == 2
assert cache.hits == 2


@@ -105,11 +142,11 @@ class TestRepositoryCache:
def query_size_limit():
cache.size_limit = 0

assert list(cache.get_many([H(1), H(2)])) == [b"1234", b"5678"]
assert [pdchunk(ch) for ch in cache.get_many([H(1), H(2)])] == [b"1234", b"5678"]
assert cache.misses == 2
assert cache.evictions == 0
iterator = cache.get_many([H(1), H(3), H(2)])
assert next(iterator) == b"1234"
assert pdchunk(next(iterator)) == b"1234"

# Force cache to back off
qsl = cache.query_size_limit

@@ -120,11 +157,11 @@ class TestRepositoryCache:
assert cache.evictions == 2
assert H(1) not in cache.cache
assert H(2) not in cache.cache
assert next(iterator) == bytes(100)
assert pdchunk(next(iterator)) == bytes(100)
assert cache.slow_misses == 0
# Since H(2) was in the cache when we called get_many(), but has
# been evicted during iterating the generator, it will be a slow miss.
assert next(iterator) == b"5678"
assert pdchunk(next(iterator)) == b"5678"
assert cache.slow_misses == 1

def test_enospc(self, cache: RepositoryCache):

@@ -145,52 +182,56 @@ class TestRepositoryCache:
pass

iterator = cache.get_many([H(1), H(2), H(3)])
assert next(iterator) == b"1234"
assert pdchunk(next(iterator)) == b"1234"

with patch("builtins.open", enospc_open):
assert next(iterator) == b"5678"
assert pdchunk(next(iterator)) == b"5678"
assert cache.enospc == 1
# We didn't patch query_size_limit which would set size_limit to some low
# value, so nothing was actually evicted.
assert cache.evictions == 0

assert next(iterator) == bytes(100)
assert pdchunk(next(iterator)) == bytes(100)

@pytest.fixture
def key(self, repository, monkeypatch):
monkeypatch.setenv("BORG_PASSPHRASE", "test")
key = PlaintextKey.create(repository, TestKey.MockArgs())
key.compressor = CompressionSpec("none").compressor
return key

def _put_encrypted_object(self, key, repository, data):
id_ = key.id_hash(data)
repository.put(id_, key.encrypt(id_, data))
@pytest.fixture
def repo_objs(self, key):
return RepoObj(key)

def _put_encrypted_object(self, repo_objs, repository, data):
id_ = repo_objs.id_hash(data)
repository.put(id_, repo_objs.format(id_, {}, data))
return id_

@pytest.fixture
def H1(self, key, repository):
return self._put_encrypted_object(key, repository, b"1234")
def H1(self, repo_objs, repository):
return self._put_encrypted_object(repo_objs, repository, b"1234")

@pytest.fixture
def H2(self, key, repository):
return self._put_encrypted_object(key, repository, b"5678")
def H2(self, repo_objs, repository):
return self._put_encrypted_object(repo_objs, repository, b"5678")

@pytest.fixture
def H3(self, key, repository):
return self._put_encrypted_object(key, repository, bytes(100))
def H3(self, repo_objs, repository):
return self._put_encrypted_object(repo_objs, repository, bytes(100))

@pytest.fixture
def decrypted_cache(self, key, repository):
return cache_if_remote(repository, decrypted_cache=key, force_cache=True)
def decrypted_cache(self, repo_objs, repository):
return cache_if_remote(repository, decrypted_cache=repo_objs, force_cache=True)

def test_cache_corruption(self, decrypted_cache: RepositoryCache, H1, H2, H3):
list(decrypted_cache.get_many([H1, H2, H3]))

iterator = decrypted_cache.get_many([H1, H2, H3])
assert next(iterator) == (7, b"1234")
assert next(iterator) == (4, b"1234")

with open(decrypted_cache.key_filename(H2), "a+b") as fd:
pkey = decrypted_cache.prefixed_key(H2, complete=True)
with open(decrypted_cache.key_filename(pkey), "a+b") as fd:
fd.seek(-1, io.SEEK_END)
corrupted = (int.from_bytes(fd.read(), "little") ^ 2).to_bytes(1, "little")
fd.seek(-1, io.SEEK_END)

@@ -198,4 +239,4 @@ class TestRepositoryCache:
fd.truncate()

with pytest.raises(IntegrityError):
assert next(iterator) == (7, b"5678")
assert next(iterator) == (4, b"5678")
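
A small usage sketch of the decrypted cache wrapper exercised above (fixture names from this module); note it now yields (size, data) tuples where size is the plaintext length::

    cache = cache_if_remote(repository, decrypted_cache=repo_objs, force_cache=True)
    size, data = next(cache.get_many([H1]))
    assert (size, data) == (4, b"1234")  # 4 == len(b"1234"), the decrypted size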

@@ -0,0 +1,95 @@
import pytest

from ..crypto.key import PlaintextKey
from ..repository import Repository
from ..repoobj import RepoObj, RepoObj1
from ..compress import LZ4


@pytest.fixture
def repository(tmpdir):
return Repository(tmpdir, create=True)


@pytest.fixture
def key(repository):
return PlaintextKey(repository)


def test_format_parse_roundtrip(key):
repo_objs = RepoObj(key)
data = b"foobar" * 10
id = repo_objs.id_hash(data)
meta = {"custom": "something"} # size and csize are computed automatically
cdata = repo_objs.format(id, meta, data)

got_meta = repo_objs.parse_meta(id, cdata)
assert got_meta["size"] == len(data)
assert got_meta["csize"] < len(data)
assert got_meta["custom"] == "something"

got_meta, got_data = repo_objs.parse(id, cdata)
assert got_meta["size"] == len(data)
assert got_meta["csize"] < len(data)
assert got_meta["custom"] == "something"
assert data == got_data

edata = repo_objs.extract_crypted_data(cdata)
key = repo_objs.key
assert edata.startswith(bytes((key.TYPE,)))


def test_format_parse_roundtrip_borg1(key): # legacy
repo_objs = RepoObj1(key)
data = b"foobar" * 10
id = repo_objs.id_hash(data)
meta = {} # borg1 does not support this kind of metadata
cdata = repo_objs.format(id, meta, data)

# borg1 does not support separate metadata and borg2 does not invoke parse_meta for borg1 repos

got_meta, got_data = repo_objs.parse(id, cdata)
assert got_meta["size"] == len(data)
assert got_meta["csize"] < len(data)
assert data == got_data

edata = repo_objs.extract_crypted_data(cdata)
compressor = repo_objs.compressor
key = repo_objs.key
assert edata.startswith(bytes((key.TYPE, compressor.ID, compressor.level)))


def test_borg1_borg2_transition(key):
# borg transfer reads borg 1.x repo objects (without decompressing them),
# writes borg 2 repo objects (giving already compressed data to avoid compression).
meta = {} # borg1 does not support this kind of metadata
data = b"foobar" * 10
len_data = len(data)
repo_objs1 = RepoObj1(key)
id = repo_objs1.id_hash(data)
borg1_cdata = repo_objs1.format(id, meta, data)
meta1, compr_data1 = repo_objs1.parse(id, borg1_cdata, decompress=False) # borg transfer avoids (de)compression
# in borg 1, we can only get this metadata after decrypting the whole chunk (and we do not have "size" here):
assert meta1["ctype"] == LZ4.ID # default compression
assert meta1["clevel"] == 0xFF # lz4 does not know levels (yet?)
assert meta1["csize"] < len_data # lz4 should make it smaller

repo_objs2 = RepoObj(key)
# note: as we did not decompress, we do not have "size" and we need to get it from somewhere else.
# here, we just use len_data. for borg transfer, we also know the size from another metadata source.
borg2_cdata = repo_objs2.format(
id, meta1, compr_data1[2:], compress=False, size=len_data, ctype=meta1["ctype"], clevel=meta1["clevel"]
)
meta2, data2 = repo_objs2.parse(id, borg2_cdata)
assert data2 == data
assert meta2["ctype"] == LZ4.ID
assert meta2["clevel"] == 0xFF
assert meta2["csize"] == meta1["csize"] - 2 # borg2 does not store the type/level bytes there
assert meta2["size"] == len_data

meta2 = repo_objs2.parse_meta(id, borg2_cdata)
# now, in borg 2, we have nice and separately decrypted metadata (no need to decrypt the whole chunk):
assert meta2["ctype"] == LZ4.ID
assert meta2["clevel"] == 0xFF
assert meta2["csize"] == meta1["csize"] - 2 # borg2 does not store the type/level bytes there
assert meta2["size"] == len_data

@@ -15,6 +15,7 @@ from ..helpers import msgpack
from ..locking import Lock, LockFailed
from ..remote import RemoteRepository, InvalidRPCMethod, PathNotAllowed, handle_remote_line
from ..repository import Repository, LoggedIO, MAGIC, MAX_DATA_SIZE, TAG_DELETE, TAG_PUT2, TAG_PUT, TAG_COMMIT
from ..repoobj import RepoObj
from . import BaseTestCase
from .hashindex import H

@@ -22,6 +23,29 @@ from .hashindex import H
UNSPECIFIED = object() # for default values where we can't use None


def fchunk(data, meta=b""):
# create a raw chunk that has a valid RepoObj layout, but does not use encryption or compression.
meta_len = RepoObj.meta_len_hdr.pack(len(meta))
assert isinstance(data, bytes)
chunk = meta_len + meta + data
return chunk


def pchunk(chunk):
# parse data and meta from a raw chunk made by fchunk
meta_len_size = RepoObj.meta_len_hdr.size
meta_len = chunk[:meta_len_size]
meta_len = RepoObj.meta_len_hdr.unpack(meta_len)[0]
meta = chunk[meta_len_size : meta_len_size + meta_len]
data = chunk[meta_len_size + meta_len :]
return data, meta


def pdchunk(chunk):
# parse only data from a raw chunk made by fchunk
return pchunk(chunk)[0]
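
A quick sanity check of these helpers (pchunk returns (data, meta), in that order)::

    chunk = fchunk(b"payload", meta=b"{}")
    assert pchunk(chunk) == (b"payload", b"{}")
    assert pdchunk(chunk) == b"payload"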

class RepositoryTestCaseBase(BaseTestCase):
key_size = 32
exclusive = True

@@ -46,12 +70,12 @@ class RepositoryTestCaseBase(BaseTestCase):
self.repository = self.open(exclusive=exclusive)

def add_keys(self):
self.repository.put(H(0), b"foo")
self.repository.put(H(1), b"bar")
self.repository.put(H(3), b"bar")
self.repository.put(H(0), fchunk(b"foo"))
self.repository.put(H(1), fchunk(b"bar"))
self.repository.put(H(3), fchunk(b"bar"))
self.repository.commit(compact=False)
self.repository.put(H(1), b"bar2")
self.repository.put(H(2), b"boo")
self.repository.put(H(1), fchunk(b"bar2"))
self.repository.put(H(2), fchunk(b"boo"))
self.repository.delete(H(3))

def repo_dump(self, label=None):

@@ -60,7 +84,7 @@ class RepositoryTestCaseBase(BaseTestCase):
H_trans[None] = -1 # key == None appears in commits
tag_trans = {TAG_PUT2: "put2", TAG_PUT: "put", TAG_DELETE: "del", TAG_COMMIT: "comm"}
for segment, fn in self.repository.io.segment_iterator():
for tag, key, offset, size in self.repository.io.iter_objects(segment):
for tag, key, offset, size, _ in self.repository.io.iter_objects(segment):
print("%s%s H(%d) -> %s[%d..+%d]" % (label, tag_trans[tag], H_trans[key], fn, offset, size))
print()

@@ -68,9 +92,9 @@ class RepositoryTestCaseBase(BaseTestCase):
class RepositoryTestCase(RepositoryTestCaseBase):
def test1(self):
for x in range(100):
self.repository.put(H(x), b"SOMEDATA")
self.repository.put(H(x), fchunk(b"SOMEDATA"))
key50 = H(50)
self.assert_equal(self.repository.get(key50), b"SOMEDATA")
self.assert_equal(pdchunk(self.repository.get(key50)), b"SOMEDATA")
self.repository.delete(key50)
self.assert_raises(Repository.ObjectNotFound, lambda: self.repository.get(key50))
self.repository.commit(compact=False)

@@ -80,55 +104,66 @@ class RepositoryTestCase(RepositoryTestCaseBase):
for x in range(100):
if x == 50:
continue
self.assert_equal(repository2.get(H(x)), b"SOMEDATA")
self.assert_equal(pdchunk(repository2.get(H(x))), b"SOMEDATA")

def test2(self):
"""Test multiple sequential transactions"""
self.repository.put(H(0), b"foo")
self.repository.put(H(1), b"foo")
self.repository.put(H(0), fchunk(b"foo"))
self.repository.put(H(1), fchunk(b"foo"))
self.repository.commit(compact=False)
self.repository.delete(H(0))
self.repository.put(H(1), b"bar")
self.repository.put(H(1), fchunk(b"bar"))
self.repository.commit(compact=False)
self.assert_equal(self.repository.get(H(1)), b"bar")
self.assert_equal(pdchunk(self.repository.get(H(1))), b"bar")

def test_read_data(self):
meta, data = b"meta", b"data"
meta_len = RepoObj.meta_len_hdr.pack(len(meta))
chunk_complete = meta_len + meta + data
chunk_short = meta_len + meta
self.repository.put(H(0), chunk_complete)
self.repository.commit(compact=False)
self.assert_equal(self.repository.get(H(0)), chunk_complete)
self.assert_equal(self.repository.get(H(0), read_data=True), chunk_complete)
self.assert_equal(self.repository.get(H(0), read_data=False), chunk_short)

def test_consistency(self):
"""Test cache consistency"""
self.repository.put(H(0), b"foo")
self.assert_equal(self.repository.get(H(0)), b"foo")
self.repository.put(H(0), b"foo2")
self.assert_equal(self.repository.get(H(0)), b"foo2")
self.repository.put(H(0), b"bar")
self.assert_equal(self.repository.get(H(0)), b"bar")
self.repository.put(H(0), fchunk(b"foo"))
self.assert_equal(pdchunk(self.repository.get(H(0))), b"foo")
self.repository.put(H(0), fchunk(b"foo2"))
self.assert_equal(pdchunk(self.repository.get(H(0))), b"foo2")
self.repository.put(H(0), fchunk(b"bar"))
self.assert_equal(pdchunk(self.repository.get(H(0))), b"bar")
self.repository.delete(H(0))
self.assert_raises(Repository.ObjectNotFound, lambda: self.repository.get(H(0)))

def test_consistency2(self):
"""Test cache consistency2"""
self.repository.put(H(0), b"foo")
self.assert_equal(self.repository.get(H(0)), b"foo")
self.repository.put(H(0), fchunk(b"foo"))
self.assert_equal(pdchunk(self.repository.get(H(0))), b"foo")
self.repository.commit(compact=False)
self.repository.put(H(0), b"foo2")
self.assert_equal(self.repository.get(H(0)), b"foo2")
self.repository.put(H(0), fchunk(b"foo2"))
self.assert_equal(pdchunk(self.repository.get(H(0))), b"foo2")
self.repository.rollback()
self.assert_equal(self.repository.get(H(0)), b"foo")
self.assert_equal(pdchunk(self.repository.get(H(0))), b"foo")

def test_overwrite_in_same_transaction(self):
"""Test cache consistency2"""
self.repository.put(H(0), b"foo")
self.repository.put(H(0), b"foo2")
self.repository.put(H(0), fchunk(b"foo"))
self.repository.put(H(0), fchunk(b"foo2"))
self.repository.commit(compact=False)
self.assert_equal(self.repository.get(H(0)), b"foo2")
self.assert_equal(pdchunk(self.repository.get(H(0))), b"foo2")

def test_single_kind_transactions(self):
# put
self.repository.put(H(0), b"foo")
self.repository.put(H(0), fchunk(b"foo"))
self.repository.commit(compact=False)
self.repository.close()
# replace
self.repository = self.open()
with self.repository:
self.repository.put(H(0), b"bar")
self.repository.put(H(0), fchunk(b"bar"))
self.repository.commit(compact=False)
# delete
self.repository = self.open()
@ -138,7 +173,7 @@ class RepositoryTestCase(RepositoryTestCaseBase):
|
|||
|
||||
def test_list(self):
|
||||
for x in range(100):
|
||||
self.repository.put(H(x), b"SOMEDATA")
|
||||
self.repository.put(H(x), fchunk(b"SOMEDATA"))
|
||||
self.repository.commit(compact=False)
|
||||
all = self.repository.list()
|
||||
self.assert_equal(len(all), 100)
|
||||
|
@ -152,7 +187,7 @@ class RepositoryTestCase(RepositoryTestCaseBase):
|
|||
|
||||
def test_scan(self):
|
||||
for x in range(100):
|
||||
self.repository.put(H(x), b"SOMEDATA")
|
||||
self.repository.put(H(x), fchunk(b"SOMEDATA"))
|
||||
self.repository.commit(compact=False)
|
||||
all = self.repository.scan()
|
||||
assert len(all) == 100
|
||||
|
@ -168,14 +203,14 @@ class RepositoryTestCase(RepositoryTestCaseBase):
|
|||
assert all[x] == H(x)
|
||||
|
||||
def test_max_data_size(self):
|
||||
max_data = b"x" * MAX_DATA_SIZE
|
||||
self.repository.put(H(0), max_data)
|
||||
self.assert_equal(self.repository.get(H(0)), max_data)
|
||||
self.assert_raises(IntegrityError, lambda: self.repository.put(H(1), max_data + b"x"))
|
||||
max_data = b"x" * (MAX_DATA_SIZE - RepoObj.meta_len_hdr.size)
|
||||
self.repository.put(H(0), fchunk(max_data))
|
||||
self.assert_equal(pdchunk(self.repository.get(H(0))), max_data)
|
||||
self.assert_raises(IntegrityError, lambda: self.repository.put(H(1), fchunk(max_data + b"x")))
|
||||
|
||||
def test_set_flags(self):
|
||||
id = H(0)
|
||||
self.repository.put(id, b"")
|
||||
self.repository.put(id, fchunk(b""))
|
||||
self.assert_equal(self.repository.flags(id), 0x00000000) # init == all zero
|
||||
self.repository.flags(id, mask=0x00000001, value=0x00000001)
|
||||
self.assert_equal(self.repository.flags(id), 0x00000001)
|
||||
|
@ -188,7 +223,7 @@ class RepositoryTestCase(RepositoryTestCaseBase):
|
|||
|
||||
def test_get_flags(self):
|
||||
id = H(0)
|
||||
self.repository.put(id, b"")
|
||||
self.repository.put(id, fchunk(b""))
|
||||
self.assert_equal(self.repository.flags(id), 0x00000000) # init == all zero
|
||||
self.repository.flags(id, mask=0xC0000003, value=0x80000001)
|
||||
self.assert_equal(self.repository.flags(id, mask=0x00000001), 0x00000001)
|
||||
|
@ -199,7 +234,7 @@ class RepositoryTestCase(RepositoryTestCaseBase):
|
|||
def test_flags_many(self):
|
||||
ids_flagged = [H(0), H(1)]
|
||||
ids_default_flags = [H(2), H(3)]
|
||||
[self.repository.put(id, b"") for id in ids_flagged + ids_default_flags]
|
||||
[self.repository.put(id, fchunk(b"")) for id in ids_flagged + ids_default_flags]
|
||||
self.repository.flags_many(ids_flagged, mask=0xFFFFFFFF, value=0xDEADBEEF)
|
||||
self.assert_equal(list(self.repository.flags_many(ids_default_flags)), [0x00000000, 0x00000000])
|
||||
self.assert_equal(list(self.repository.flags_many(ids_flagged)), [0xDEADBEEF, 0xDEADBEEF])
|
||||
|
@ -207,8 +242,8 @@ class RepositoryTestCase(RepositoryTestCaseBase):
|
|||
self.assert_equal(list(self.repository.flags_many(ids_flagged, mask=0x0000FFFF)), [0x0000BEEF, 0x0000BEEF])
|
||||
|
||||
def test_flags_persistence(self):
|
||||
self.repository.put(H(0), b"default")
|
||||
self.repository.put(H(1), b"one one zero")
|
||||
self.repository.put(H(0), fchunk(b"default"))
|
||||
self.repository.put(H(1), fchunk(b"one one zero"))
|
||||
# we do not set flags for H(0), so we can later check their default state.
|
||||
self.repository.flags(H(1), mask=0x00000007, value=0x00000006)
|
||||
self.repository.commit(compact=False)
|
||||
|
@@ -227,38 +262,39 @@ class LocalRepositoryTestCase(RepositoryTestCaseBase):

     def _assert_sparse(self):
         # The superseded 123456... PUT
-        assert self.repository.compact[0] == 41 + 8 + 9
+        assert self.repository.compact[0] == 41 + 8 + len(fchunk(b"123456789"))
         # a COMMIT
         assert self.repository.compact[1] == 9
         # The DELETE issued by the superseding PUT (or issued directly)
         assert self.repository.compact[2] == 41
         self.repository._rebuild_sparse(0)
-        assert self.repository.compact[0] == 41 + 8 + 9
+        assert self.repository.compact[0] == 41 + 8 + len(fchunk(b"123456789"))  # 9 is chunk or commit?

     def test_sparse1(self):
-        self.repository.put(H(0), b"foo")
-        self.repository.put(H(1), b"123456789")
+        self.repository.put(H(0), fchunk(b"foo"))
+        self.repository.put(H(1), fchunk(b"123456789"))
         self.repository.commit(compact=False)
-        self.repository.put(H(1), b"bar")
+        self.repository.put(H(1), fchunk(b"bar"))
         self._assert_sparse()

     def test_sparse2(self):
-        self.repository.put(H(0), b"foo")
-        self.repository.put(H(1), b"123456789")
+        self.repository.put(H(0), fchunk(b"foo"))
+        self.repository.put(H(1), fchunk(b"123456789"))
         self.repository.commit(compact=False)
         self.repository.delete(H(1))
         self._assert_sparse()

     def test_sparse_delete(self):
-        self.repository.put(H(0), b"1245")
+        ch0 = fchunk(b"1245")
+        self.repository.put(H(0), ch0)
         self.repository.delete(H(0))
         self.repository.io._write_fd.sync()

         # The on-line tracking works on a per-object basis...
-        assert self.repository.compact[0] == 41 + 8 + 41 + 4
+        assert self.repository.compact[0] == 41 + 8 + 41 + len(ch0)
         self.repository._rebuild_sparse(0)
         # ...while _rebuild_sparse can mark whole segments as completely sparse (which then includes the segment magic)
-        assert self.repository.compact[0] == 41 + 8 + 41 + 4 + len(MAGIC)
+        assert self.repository.compact[0] == 41 + 8 + 41 + len(ch0) + len(MAGIC)

         self.repository.commit(compact=True)
         assert 0 not in [segment for segment, _ in self.repository.io.segment_iterator()]
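The magic numbers in these compaction assertions come straight from the log entry layout: a 9-byte entry header (crc32 + size + tag), a 32-byte key for PUT2/DELETE entries, and an extra 8-byte xxh64 digest for PUT2. A sketch of the arithmetic, under those assumptions::

    HEADER = 4 + 4 + 1  # crc32 + size + tag = 9 bytes
    KEY = 32            # object id, PUT2/DELETE entries
    DIGEST = 8          # xxh64 digest, PUT2 entries only

    def put2_entry_size(payload: bytes) -> int:
        return HEADER + KEY + DIGEST + len(payload)  # i.e. 41 + 8 + len(payload)

    DELETE_ENTRY_SIZE = HEADER + KEY  # 41, as asserted for compact[2]
    COMMIT_ENTRY_SIZE = HEADER        # 9, as asserted for compact[1]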
@@ -266,7 +302,7 @@ class LocalRepositoryTestCase(RepositoryTestCaseBase):
     def test_uncommitted_garbage(self):
         # uncommitted garbage should be no problem, it is cleaned up automatically.
         # we just have to be careful with invalidation of cached FDs in LoggedIO.
-        self.repository.put(H(0), b"foo")
+        self.repository.put(H(0), fchunk(b"foo"))
         self.repository.commit(compact=False)
         # write some crap to a uncommitted segment file
         last_segment = self.repository.io.get_latest_segment()
@@ -276,7 +312,7 @@ class LocalRepositoryTestCase(RepositoryTestCaseBase):
         # usually, opening the repo and starting a transaction should trigger a cleanup.
         self.repository = self.open()
         with self.repository:
-            self.repository.put(H(0), b"bar")  # this may trigger compact_segments()
+            self.repository.put(H(0), fchunk(b"bar"))  # this may trigger compact_segments()
             self.repository.commit(compact=True)
         # the point here is that nothing blows up with an exception.

@@ -363,8 +399,8 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         assert not io.is_committed_segment(io.get_latest_segment())

     def test_moved_deletes_are_tracked(self):
-        self.repository.put(H(1), b"1")
-        self.repository.put(H(2), b"2")
+        self.repository.put(H(1), fchunk(b"1"))
+        self.repository.put(H(2), fchunk(b"2"))
         self.repository.commit(compact=False)
         self.repo_dump("p1 p2 c")
         self.repository.delete(H(1))
@@ -372,19 +408,19 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         self.repo_dump("d1 cc")
         last_segment = self.repository.io.get_latest_segment() - 1
         num_deletes = 0
-        for tag, key, offset, size in self.repository.io.iter_objects(last_segment):
+        for tag, key, offset, size, _ in self.repository.io.iter_objects(last_segment):
             if tag == TAG_DELETE:
                 assert key == H(1)
                 num_deletes += 1
         assert num_deletes == 1
         assert last_segment in self.repository.compact
-        self.repository.put(H(3), b"3")
+        self.repository.put(H(3), fchunk(b"3"))
         self.repository.commit(compact=True)
         self.repo_dump("p3 cc")
         assert last_segment not in self.repository.compact
         assert not self.repository.io.segment_exists(last_segment)
         for segment, _ in self.repository.io.segment_iterator():
-            for tag, key, offset, size in self.repository.io.iter_objects(segment):
+            for tag, key, offset, size, _ in self.repository.io.iter_objects(segment):
                 assert tag != TAG_DELETE
                 assert key != H(1)
         # after compaction, there should be no empty shadowed_segments lists left over.
@@ -393,7 +429,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):

     def test_shadowed_entries_are_preserved(self):
         get_latest_segment = self.repository.io.get_latest_segment
-        self.repository.put(H(1), b"1")
+        self.repository.put(H(1), fchunk(b"1"))
         # This is the segment with our original PUT of interest
         put_segment = get_latest_segment()
         self.repository.commit(compact=False)
@@ -401,7 +437,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         # We now delete H(1), and force this segment to not be compacted, which can happen
         # if it's not sparse enough (symbolized by H(2) here).
         self.repository.delete(H(1))
-        self.repository.put(H(2), b"1")
+        self.repository.put(H(2), fchunk(b"1"))
         delete_segment = get_latest_segment()

         # We pretend these are mostly dense (not sparse) and won't be compacted
@@ -426,7 +462,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         assert H(1) not in self.repository

     def test_shadow_index_rollback(self):
-        self.repository.put(H(1), b"1")
+        self.repository.put(H(1), fchunk(b"1"))
         self.repository.delete(H(1))
         assert self.repository.shadow_index[H(1)] == [0]
         self.repository.commit(compact=True)
@@ -440,7 +476,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         assert self.repository.shadow_index[H(1)] == [4]
         self.repository.rollback()
         self.repo_dump("r")
-        self.repository.put(H(2), b"1")
+        self.repository.put(H(2), fchunk(b"1"))
         # After the rollback segment 4 shouldn't be considered anymore
         assert self.repository.shadow_index[H(1)] == []  # because the delete is considered unstable

@@ -459,19 +495,19 @@ class RepositoryAppendOnlyTestCase(RepositoryTestCaseBase):
         def segments_in_repository():
             return len(list(self.repository.io.segment_iterator()))

-        self.repository.put(H(0), b"foo")
+        self.repository.put(H(0), fchunk(b"foo"))
         self.repository.commit(compact=False)

         self.repository.append_only = False
         assert segments_in_repository() == 2
-        self.repository.put(H(0), b"foo")
+        self.repository.put(H(0), fchunk(b"foo"))
         self.repository.commit(compact=True)
         # normal: compact squashes the data together, only one segment
         assert segments_in_repository() == 2

         self.repository.append_only = True
         assert segments_in_repository() == 2
-        self.repository.put(H(0), b"foo")
+        self.repository.put(H(0), fchunk(b"foo"))
         self.repository.commit(compact=False)
         # append only: does not compact, only new segments written
         assert segments_in_repository() == 4
@@ -485,7 +521,7 @@ class RepositoryFreeSpaceTestCase(RepositoryTestCaseBase):
         self.reopen()

         with self.repository:
-            self.repository.put(H(0), b"foobar")
+            self.repository.put(H(0), fchunk(b"foobar"))
             with pytest.raises(Repository.InsufficientFreeSpaceError):
                 self.repository.commit(compact=False)
         assert os.path.exists(self.repository.path)
@@ -500,45 +536,52 @@ class QuotaTestCase(RepositoryTestCaseBase):
 class QuotaTestCase(RepositoryTestCaseBase):
     def test_tracking(self):
         assert self.repository.storage_quota_use == 0
-        self.repository.put(H(1), bytes(1234))
-        assert self.repository.storage_quota_use == 1234 + 41 + 8
-        self.repository.put(H(2), bytes(5678))
-        assert self.repository.storage_quota_use == 1234 + 5678 + 2 * (41 + 8)
+        ch1 = fchunk(bytes(1234))
+        self.repository.put(H(1), ch1)
+        assert self.repository.storage_quota_use == len(ch1) + 41 + 8
+        ch2 = fchunk(bytes(5678))
+        self.repository.put(H(2), ch2)
+        assert self.repository.storage_quota_use == len(ch1) + len(ch2) + 2 * (41 + 8)
         self.repository.delete(H(1))
-        assert self.repository.storage_quota_use == 1234 + 5678 + 2 * (41 + 8)  # we have not compacted yet
+        assert self.repository.storage_quota_use == len(ch1) + len(ch2) + 2 * (41 + 8)  # we have not compacted yet
         self.repository.commit(compact=False)
-        assert self.repository.storage_quota_use == 1234 + 5678 + 2 * (41 + 8)  # we have not compacted yet
+        assert self.repository.storage_quota_use == len(ch1) + len(ch2) + 2 * (41 + 8)  # we have not compacted yet
         self.reopen()
         with self.repository:
             # Open new transaction; hints and thus quota data is not loaded unless needed.
-            self.repository.put(H(3), b"")
+            ch3 = fchunk(b"")
+            self.repository.put(H(3), ch3)
             self.repository.delete(H(3))
-            assert self.repository.storage_quota_use == 1234 + 5678 + 3 * (41 + 8)  # we have not compacted yet
+            assert self.repository.storage_quota_use == len(ch1) + len(ch2) + len(ch3) + 3 * (
+                41 + 8
+            )  # we have not compacted yet
             self.repository.commit(compact=True)
-            assert self.repository.storage_quota_use == 5678 + 41 + 8
+            assert self.repository.storage_quota_use == len(ch2) + 41 + 8

     def test_exceed_quota(self):
         assert self.repository.storage_quota_use == 0
         self.repository.storage_quota = 80
-        self.repository.put(H(1), b"")
-        assert self.repository.storage_quota_use == 41 + 8
+        ch1 = fchunk(b"x" * 7)
+        self.repository.put(H(1), ch1)
+        assert self.repository.storage_quota_use == len(ch1) + 41 + 8
         self.repository.commit(compact=False)
         with pytest.raises(Repository.StorageQuotaExceeded):
-            self.repository.put(H(2), b"")
-        assert self.repository.storage_quota_use == (41 + 8) * 2
+            ch2 = fchunk(b"y" * 13)
+            self.repository.put(H(2), ch2)
+        assert self.repository.storage_quota_use == len(ch1) + len(ch2) + (41 + 8) * 2  # check ch2!?
         with pytest.raises(Repository.StorageQuotaExceeded):
             self.repository.commit(compact=False)
-        assert self.repository.storage_quota_use == (41 + 8) * 2
+        assert self.repository.storage_quota_use == len(ch1) + len(ch2) + (41 + 8) * 2  # check ch2!?
         self.reopen()
         with self.repository:
             self.repository.storage_quota = 150
             # Open new transaction; hints and thus quota data is not loaded unless needed.
-            self.repository.put(H(1), b"")
+            self.repository.put(H(1), ch1)
             assert (
-                self.repository.storage_quota_use == (41 + 8) * 2
+                self.repository.storage_quota_use == len(ch1) * 2 + (41 + 8) * 2
             )  # we have 2 puts for H(1) here and not yet compacted.
             self.repository.commit(compact=True)
-            assert self.repository.storage_quota_use == 41 + 8  # now we have compacted.
+            assert self.repository.storage_quota_use == len(ch1) + 41 + 8  # now we have compacted.


 class NonceReservation(RepositoryTestCaseBase):
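As the hunk above shows, ``storage_quota_use`` counts the full on-disk cost of every PUT written so far — the stored envelope plus the 41 + 8 bytes of PUT2 entry overhead — and deletes only lower it once compaction actually removes the superseded entries. A sketch of that accounting rule, under the same overhead assumption::

    PUT2_OVERHEAD = 41 + 8  # entry header + key, plus xxh64 digest

    def expected_quota_use(stored_chunks) -> int:
        # stored_chunks: all chunk envelopes written and not yet compacted away
        return sum(len(ch) + PUT2_OVERHEAD for ch in stored_chunks)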
@@ -586,13 +629,13 @@ class NonceReservation(RepositoryTestCaseBase):
 class RepositoryAuxiliaryCorruptionTestCase(RepositoryTestCaseBase):
     def setUp(self):
         super().setUp()
-        self.repository.put(H(0), b"foo")
+        self.repository.put(H(0), fchunk(b"foo"))
         self.repository.commit(compact=False)
         self.repository.close()

     def do_commit(self):
         with self.repository:
-            self.repository.put(H(0), b"fox")
+            self.repository.put(H(0), fchunk(b"fox"))
             self.repository.commit(compact=False)

     def test_corrupted_hints(self):
@@ -648,7 +691,7 @@ class RepositoryAuxiliaryCorruptionTestCase(RepositoryTestCaseBase):
         # Data corruption is detected due to mismatching checksums
         # and fixed by rebuilding the index.
         assert len(self.repository) == 1
-        assert self.repository.get(H(0)) == b"foo"
+        assert pdchunk(self.repository.get(H(0))) == b"foo"

     def test_index_corrupted_without_integrity(self):
         self._corrupt_index()
@@ -684,17 +727,17 @@ class RepositoryAuxiliaryCorruptionTestCase(RepositoryTestCaseBase):
         with self.repository:
             # No issues accessing the repository
             assert len(self.repository) == 1
-            assert self.repository.get(H(0)) == b"foo"
+            assert pdchunk(self.repository.get(H(0))) == b"foo"

     def _subtly_corrupted_hints_setup(self):
         with self.repository:
             self.repository.append_only = True
             assert len(self.repository) == 1
-            assert self.repository.get(H(0)) == b"foo"
-            self.repository.put(H(1), b"bar")
-            self.repository.put(H(2), b"baz")
+            assert pdchunk(self.repository.get(H(0))) == b"foo"
+            self.repository.put(H(1), fchunk(b"bar"))
+            self.repository.put(H(2), fchunk(b"baz"))
             self.repository.commit(compact=False)
-            self.repository.put(H(2), b"bazz")
+            self.repository.put(H(2), fchunk(b"bazz"))
             self.repository.commit(compact=False)

         hints_path = os.path.join(self.repository.path, "hints.5")
@@ -711,14 +754,14 @@ class RepositoryAuxiliaryCorruptionTestCase(RepositoryTestCaseBase):
         self._subtly_corrupted_hints_setup()
         with self.repository:
             self.repository.append_only = False
-            self.repository.put(H(3), b"1234")
+            self.repository.put(H(3), fchunk(b"1234"))
             # Do a compaction run. Succeeds, since the failed checksum prompted a rebuild of the index+hints.
             self.repository.commit(compact=True)

             assert len(self.repository) == 4
-            assert self.repository.get(H(0)) == b"foo"
-            assert self.repository.get(H(1)) == b"bar"
-            assert self.repository.get(H(2)) == b"bazz"
+            assert pdchunk(self.repository.get(H(0))) == b"foo"
+            assert pdchunk(self.repository.get(H(1))) == b"bar"
+            assert pdchunk(self.repository.get(H(2))) == b"bazz"

     def test_subtly_corrupted_hints_without_integrity(self):
         self._subtly_corrupted_hints_setup()
@@ -726,7 +769,7 @@ class RepositoryAuxiliaryCorruptionTestCase(RepositoryTestCaseBase):
         os.unlink(integrity_path)
         with self.repository:
             self.repository.append_only = False
-            self.repository.put(H(3), b"1234")
+            self.repository.put(H(3), fchunk(b"1234"))
             # Do a compaction run. Fails, since the corrupted refcount was not detected and leads to an assertion failure.
             with pytest.raises(AssertionError) as exc_info:
                 self.repository.commit(compact=True)
|
@ -748,12 +791,12 @@ class RepositoryCheckTestCase(RepositoryTestCaseBase):
|
|||
|
||||
def get_objects(self, *ids):
|
||||
for id_ in ids:
|
||||
self.repository.get(H(id_))
|
||||
pdchunk(self.repository.get(H(id_)))
|
||||
|
||||
def add_objects(self, segments):
|
||||
for ids in segments:
|
||||
for id_ in ids:
|
||||
self.repository.put(H(id_), b"data")
|
||||
self.repository.put(H(id_), fchunk(b"data"))
|
||||
self.repository.commit(compact=False)
|
||||
|
||||
def get_head(self):
|
||||
|
@@ -859,8 +902,8 @@ class RepositoryCheckTestCase(RepositoryTestCaseBase):
         self.assert_equal({1, 2, 3, 4, 5, 6}, self.list_objects())

     def test_crash_before_compact(self):
-        self.repository.put(H(0), b"data")
-        self.repository.put(H(0), b"data2")
+        self.repository.put(H(0), fchunk(b"data"))
+        self.repository.put(H(0), fchunk(b"data2"))
         # Simulate a crash before compact
         with patch.object(Repository, "compact_segments") as compact:
             self.repository.commit(compact=True)
@@ -868,12 +911,12 @@ class RepositoryCheckTestCase(RepositoryTestCaseBase):
         self.reopen()
         with self.repository:
             self.check(repair=True)
-            self.assert_equal(self.repository.get(H(0)), b"data2")
+            self.assert_equal(pdchunk(self.repository.get(H(0))), b"data2")


 class RepositoryHintsTestCase(RepositoryTestCaseBase):
     def test_hints_persistence(self):
-        self.repository.put(H(0), b"data")
+        self.repository.put(H(0), fchunk(b"data"))
         self.repository.delete(H(0))
         self.repository.commit(compact=False)
         shadow_index_expected = self.repository.shadow_index
@@ -884,7 +927,7 @@ class RepositoryHintsTestCase(RepositoryTestCaseBase):
         self.reopen()
         with self.repository:
             # see also do_compact()
-            self.repository.put(H(42), b"foobar")  # this will call prepare_txn() and load the hints data
+            self.repository.put(H(42), fchunk(b"foobar"))  # this will call prepare_txn() and load the hints data
             # check if hints persistence worked:
             self.assert_equal(shadow_index_expected, self.repository.shadow_index)
             self.assert_equal(compact_expected, self.repository.compact)
@@ -892,7 +935,7 @@ class RepositoryHintsTestCase(RepositoryTestCaseBase):
         self.assert_equal(segments_expected, self.repository.segments)

     def test_hints_behaviour(self):
-        self.repository.put(H(0), b"data")
+        self.repository.put(H(0), fchunk(b"data"))
         self.assert_equal(self.repository.shadow_index, {})
         assert len(self.repository.compact) == 0
         self.repository.delete(H(0))
@@ -901,7 +944,7 @@ class RepositoryHintsTestCase(RepositoryTestCaseBase):
         self.assert_in(H(0), self.repository.shadow_index)
         self.assert_equal(len(self.repository.shadow_index[H(0)]), 1)
         self.assert_in(0, self.repository.compact)  # segment 0 can be compacted
-        self.repository.put(H(42), b"foobar")  # see also do_compact()
+        self.repository.put(H(42), fchunk(b"foobar"))  # see also do_compact()
         self.repository.commit(compact=True, threshold=0.0)  # compact completely!
         # nothing to compact any more! no info left about stuff that does not exist any more:
         self.assert_not_in(H(0), self.repository.shadow_index)
@@ -1041,13 +1084,13 @@ class RemoteLegacyFree(RepositoryTestCaseBase):

     def test_legacy_free(self):
         # put
-        self.repository.put(H(0), b"foo")
+        self.repository.put(H(0), fchunk(b"foo"))
         self.repository.commit(compact=False)
         self.repository.close()
         # replace
         self.repository = self.open()
         with self.repository:
-            self.repository.put(H(0), b"bar")
+            self.repository.put(H(0), fchunk(b"bar"))
             self.repository.commit(compact=False)
         # delete
         self.repository = self.open()
@@ -19,8 +19,8 @@ class UpgraderNoOp:
     def upgrade_item(self, *, item):
         return item

-    def upgrade_compressed_chunk(self, *, chunk):
-        return chunk
+    def upgrade_compressed_chunk(self, meta, data):
+        return meta, data

     def upgrade_archive_metadata(self, *, metadata):
         new_metadata = {}
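The upgrader interface changes from chunk-in/chunk-out to passing the compression metadata separately from the data. For orientation, a hypothetical call with a meta dict in the shape described for repo objects — all values here are invented for illustration::

    meta = {
        "ctype": 2,     # compression type id (assumed value)
        "clevel": 6,    # compression level (assumed value)
        "csize": 123,   # compressed size of data (assumed value)
    }
    meta, data = upgrader.upgrade_compressed_chunk(meta, data)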
@@ -98,33 +98,36 @@ class UpgraderFrom12To20:
         assert all(key in new_item for key in REQUIRED_ITEM_KEYS)
         return new_item

-    def upgrade_compressed_chunk(self, *, chunk):
-        def upgrade_zlib_and_level(chunk):
-            if ZLIB_legacy.detect(chunk):
+    def upgrade_compressed_chunk(self, meta, data):
+        # meta/data was parsed via RepoObj1.parse, which returns data **including** the ctype/clevel bytes prefixed
+        def upgrade_zlib_and_level(meta, data):
+            if ZLIB_legacy.detect(data):
                 ctype = ZLIB.ID
-                chunk = ctype + level + bytes(chunk)  # get rid of the legacy: prepend separate type/level bytes
+                data = bytes(data)  # ZLIB_legacy has no ctype/clevel prefix
             else:
-                ctype = bytes(chunk[0:1])
-                chunk = ctype + level + bytes(chunk[2:])  # keep type same, but set level
-            return chunk
+                ctype = data[0]
+                data = bytes(data[2:])  # strip ctype/clevel bytes
+            meta["ctype"] = ctype
+            meta["clevel"] = level
+            meta["csize"] = len(data)  # we may have stripped some prefixed ctype/clevel bytes
+            return meta, data

-        ctype = chunk[0:1]
-        level = b"\xFF"  # FF means unknown compression level
+        ctype = data[0]
+        level = 0xFF  # means unknown compression level

         if ctype == ObfuscateSize.ID:
             # in older borg, we used unusual byte order
-            old_header_fmt = Struct(">I")
-            new_header_fmt = ObfuscateSize.header_fmt
-            length = ObfuscateSize.header_len
-            size_bytes = chunk[2 : 2 + length]
-            size = old_header_fmt.unpack(size_bytes)
-            size_bytes = new_header_fmt.pack(size)
-            compressed = chunk[2 + length :]
-            compressed = upgrade_zlib_and_level(compressed)
-            chunk = ctype + level + size_bytes + compressed
+            borg1_header_fmt = Struct(">I")
+            hlen = borg1_header_fmt.size
+            csize_bytes = data[2 : 2 + hlen]
+            csize = borg1_header_fmt.unpack(csize_bytes)
+            compressed = data[2 + hlen : 2 + hlen + csize]
+            meta, compressed = upgrade_zlib_and_level(meta, compressed)
+            osize = len(data) - 2 - hlen - csize  # amount of 0x00 bytes appended for obfuscation
+            data = compressed + bytes(osize)
         else:
-            chunk = upgrade_zlib_and_level(chunk)
-        return chunk
+            meta, data = upgrade_zlib_and_level(meta, data)
+        return meta, data

     def upgrade_archive_metadata(self, *, metadata):
         new_metadata = {}
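The slicing in the new code implies this layout for a borg 1.x ObfuscateSize chunk; instead of rewriting a binary header, the upgrader now records ctype/clevel/csize in ``meta`` and keeps only payload bytes in ``data``. A sketch of the parsing, with offsets inferred from the hunk above (not an authoritative spec)::

    from struct import Struct

    # assumed borg 1.x obfuscated layout:
    #   data[0]        ctype (ObfuscateSize.ID)
    #   data[1]        clevel
    #   data[2:6]      big-endian uint32: size of the compressed payload
    #   data[6:6+n]    compressed payload (itself ctype/clevel-prefixed, borg1-style)
    #   data[6+n:]     0x00 bytes appended for size obfuscation
    borg1_header_fmt = Struct(">I")

    def split_borg1_obfuscated(data: bytes):
        hlen = borg1_header_fmt.size  # 4
        csize = borg1_header_fmt.unpack(data[2 : 2 + hlen])[0]
        compressed = data[2 + hlen : 2 + hlen + csize]
        padding = len(data) - 2 - hlen - csize  # obfuscation trailer length
        return compressed, padding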