this fixes a strange test failure that did not happen until now:
the test could not read the MAGIC bytes from a (quite new) segment file,
but just got an empty string back.
maybe its appearance is related to the removed I/O calls.
This saves some segment file random IO that was previously necessary
just to determine the size of the to-be-deleted data.
Keep the old one as NSIndex1 for compatibility with old borg.
Choose NSIndex or NSIndex1 based on the repo index layout read from the HashHeader.
For an old repo index, repo.get(key) returns (segment, offset, None, None).
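To illustrate the selection (a toy sketch; the layout values and class
internals are assumptions, not borg's actual code):

    class NSIndex1:
        # old layout: values store only (segment, offset);
        # repo.get() pads the missing fields with None, None
        layout = 1

    class NSIndex:
        # new layout: values store (segment, offset, ...)
        layout = 2

    def index_class_for(header_layout: int):
        # header_layout as read from the HashHeader of the repo index file
        return NSIndex1 if header_layout == NSIndex1.layout else NSIndex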
if a hardlink copy of a repo was made and a new repo config
is to be saved, do NOT fill in random garbage before deleting
the previous repo config, because that would damage the hardlink
copy.
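A hedged sketch of the idea, using st_nlink to detect hardlink copies
(secure_erase here is a simplified stand-in, not borg's helper):

    import os

    def secure_erase(path):
        # overwrite with random data, sync, then unlink (simplified)
        with open(path, 'r+b') as fd:
            fd.write(os.urandom(os.stat(path).st_size))
            fd.flush()
            os.fsync(fd.fileno())
        os.unlink(path)

    def save_config(path, new_data):
        if os.path.exists(path):
            if os.stat(path).st_nlink > 1:
                # hardlink copies exist: shredding the blocks would damage
                # them too, so just remove the directory entry
                os.unlink(path)
            else:
                secure_erase(path)
        with open(path, 'wb') as fd:
            fd.write(new_data)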
see the ticket and the borg.helpers.msgpack docstring.
this changeset implements the full migration to the
msgpack 2.0 spec (use_bin_type=True, raw=False).
compatibility with the past, where still needed, is done via the want_bytes decoder in borg.item.
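For reference, the spec difference in plain msgpack-python (independent
of borg's wrapper):

    import msgpack

    packed = msgpack.packb({'path': b'raw', 'comment': 'text'}, use_bin_type=True)
    # use_bin_type=True: bytes -> msgpack bin type, str -> msgpack (utf-8) str type
    assert msgpack.unpackb(packed, raw=False) == {'path': b'raw', 'comment': 'text'}
    # raw=False: str type decodes back to str, bin type stays bytes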
when migrating from repokey to keyfile, we just store an empty key into the repo config,
because we do not have a "delete key" RPC API. thus, an empty key means "there is no key".
here we fix load_key so that it does not behave differently for "no key" and "empty key":
in both cases, it just returns an empty value.
additionally, we strip the value we get from the config, so whitespace does not matter.
all callers now check that the repokey is not empty, otherwise RepoKeyNotFoundError
is raised.
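A minimal sketch of the fixed load_key behaviour (the config access is
illustrative, not borg's exact code):

    def load_key(config) -> str:
        # missing key and empty key are treated the same: both yield ''
        keydata = config.get('repository', 'key', fallback='')
        # strip, so surrounding whitespace in the config does not matter
        return keydata.strip()

    # callers then do:
    # if not load_key(config):
    #     raise RepoKeyNotFoundError(...)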
for now, this code shall only work on v2 repos (created by this code).
the code to read v1 repos is still present though, so for experiments,
it is possible to manually change the repo version in the repo config
from 1 to 2.
having version 2 in the repo config also prevents borg < 1.3 from being
used on such a repo, which would cause damage:
old borg would not recognize the PUT2-tagged segment entries, and
old borg check --repair would likely kill them all because of that.
also: keep the repo version in Repository.version.
note: this required a slight increase of MAX_OBJECT_SIZE so that MAX_DATA_SIZE
could stay the same as before.
For PUT2, compute the hash over the whole entry (header and content,
excluding the hash and crc32 fields, because the crc32 computation
includes the hash); see the sketch below.
Also: refactor the crc32 checks into a function, use f-strings, and
structure _read in a more logical, sequential order.
write_put: avoid creating a large temporary bytes object
why use xxh64?
- fast even without hw acceleration
- borg depends on it already anyway
- stronger than crc32 and strong enough for this purpose
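A hedged sketch of a complete PUT2 entry write, combining the points
above (field order and widths are illustrative; the xxhash package
stands in for borg's bundled xxh64):

    import struct, zlib
    import xxhash  # assumption: stand-in for borg's own xxh64 binding

    TAG_PUT2 = 3  # illustrative tag value

    def write_put2(fd, key: bytes, data: bytes) -> None:
        size = 4 + 4 + 1 + 8 + len(key) + len(data)  # crc32+size+tag+hash+key+data
        header = struct.pack('<IB', size, TAG_PUT2)
        h = xxhash.xxh64()                  # hash covers header + key + data,
        for piece in (header, key, data):   # excluding the crc32 and hash fields
            h.update(piece)
        digest = h.digest()
        crc = 0                             # crc32 covers header + hash + key + data
        for piece in (header, digest, key, data):
            crc = zlib.crc32(piece, crc)
        # write the pieces separately, avoiding one large temporary bytes object
        for piece in (struct.pack('<I', crc), header, digest, key, data):
            fd.write(piece)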
attic is borg's parent project, but it stalled in 2015 and has not been updated since.
we can assume that most attic users have meanwhile noticed this and already
converted their repos to borg.
if some did not yet, they are advised to use borg < 1.3 to do that ASAP.
note: borg can still DETECT an attic repo by recognizing its ATTIC_MAGIC value
and then gives exactly that advice.
Code gets simpler if we always use only the (shorter) header_fmt.
That format ALWAYS applies, to all tags borg writes.
If the tag unpacked from there indicates that there is also a chunkid
to read (as for PUT and DEL), we can decide that inside _read and
then read the chunkid from the fd.
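Sketched (format string and tag values are illustrative):

    import struct

    HEADER_FMT = '<IIB'    # crc32, size, tag - applies to ALL entries
    HEADER_SIZE = struct.calcsize(HEADER_FMT)
    TAG_PUT, TAG_DELETE, TAG_COMMIT = 0, 1, 2   # illustrative values

    def read_entry_header(fd):
        crc, size, tag = struct.unpack(HEADER_FMT, fd.read(HEADER_SIZE))
        # only now, based on the tag, decide whether a 32-byte chunkid follows
        key = fd.read(32) if tag in (TAG_PUT, TAG_DELETE) else None
        return crc, size, tag, key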
compact_segments produced separate 17-byte files for intermediate commits, although they were intended to be end-of-segment-file commits.
this is because, when the intermediate commit is triggered, we are already at an offset beyond the limit.
thus, we needed to add a no_new flag to indicate that we do not want a new segment file just for the commit if it is an intermediate commit.
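The gist of the flag, as a standalone sketch (function name illustrative;
no_new is the flag added here):

    def need_new_segment(offset: int, limit: int, no_new: bool) -> bool:
        # an intermediate commit must not trigger a segment rollover, even
        # though we are already past the size limit when it is written
        return offset > limit and not no_new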
storage_quota_use should reflect the current disk space usage (not considering some overhead, like for the index etc.).
if a chunk is deleted, but the segment file containing the chunk is not yet compacted, the chunk's disk space is still in use!
when compact_segments drops the unused chunks, that is the right time to reduce storage_quota_use.
storage_quota_use includes the PUT header overhead.
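A toy model of the accounting rule (the 41-byte overhead value is
illustrative: crc32 + size + tag + 32-byte chunkid):

    PUT_OVERHEAD = 41

    class Quota:
        def __init__(self):
            self.storage_quota_use = 0

        def put(self, data: bytes):
            self.storage_quota_use += len(data) + PUT_OVERHEAD

        def delete(self, data_len: int):
            pass  # NOT reduced here: the bytes still sit in the segment file

        def compact_drop(self, data_len: int):
            # the chunk's bytes actually leave the disk only at compaction
            self.storage_quota_use -= data_len + PUT_OVERHEAD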
This too should make the scan faster: assuming the data is
random, we can skip the CRC check for almost 94% of the incorrect
header locations based on the tag alone (only 16 of the 256 possible
tag byte values are reserved, so 240/256 = 93.75% of random candidates
are rejected without any CRC work); see the scan sketch below.
As a drawback, this limits the number of tags that can be
added without breaking backwards compatibility to 16, with
13 currently unused.
When an object is corrupted, the start position of the next object
will not be known, as the size field belonging to the corrupted
object may be corrupted as well. In order to find the next object
within the segment, the remainder is scanned byte-by-byte for the
next valid object. An object is considered valid if the CRC
checksum matches the content. However, the scan accepted
any object size that fit within the remainder of the segment. As a
result, in particular when the corruption occurred near the start
of a segment, CRC checksums were calculated for large objects,
often hundreds of megabytes in size, despite the size being limited
to 20 MiB. This change makes it so that the CRC calculation is skipped
when the object header indicates an impossible size, thereby
greatly reducing the number of CPU cycles spent on CRC calculations.
In my case, this brought the time for repair down from hours to mere
minutes.
This also has the additional benefit that there is some verification
in addition to the CRC checksum. A 4-byte checksum is rather
short considering the amount of data that might be in an archive.
Likely also fixes the hanging --repair in #5995.
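A sketch of the recovery scan with both cheap pre-checks, the tag test
and the size test described above (constants and header format are
illustrative):

    import struct, zlib

    HEADER_FMT = '<IIB'                   # crc32, size, tag
    HEADER_SIZE = struct.calcsize(HEADER_FMT)
    MAX_OBJECT_SIZE = 20 * 1024 * 1024    # the 20 MiB limit mentioned above
    MAX_TAG = 15                          # only 16 tag values are reserved

    def find_next_object(buf: bytes, start: int) -> int:
        """Scan byte-by-byte for the next valid object; return its offset or -1."""
        for pos in range(start, len(buf) - HEADER_SIZE + 1):
            crc, size, tag = struct.unpack_from(HEADER_FMT, buf, pos)
            # cheap rejects first: ~94% of random positions fail the tag test,
            # and impossible sizes are skipped without any CRC work
            if tag > MAX_TAG or not (HEADER_SIZE <= size <= MAX_OBJECT_SIZE):
                continue
            if pos + size > len(buf):
                continue
            if zlib.crc32(buf[pos + 4:pos + size]) == crc:
                return pos
        return -1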
A) the compaction code needs the shadow index only for this case:
segment A: PUT x, segment B: DEL x, with A < B (the DEL shadows the PUT).
B) for the following case, we have no shadowing DEL (or rather: it does not matter,
because there is a PUT right after the DEL) and x is in the repo index,
thus the shadow_index is not needed for the special case in the compaction code:
segment A: PUT x, segment B: DEL x, PUT x.
see also PR #5636.
reverts f079a83fed
and clarifies the code with more comments.
we keep the code deduplication of 5f32b5666a
and just add an update_shadow_index param so it does not look like
something was accidentally forgotten, which was the whole reason for the
reverted "fix".
The shadow_index should be in the same state after both of these sequences
(let's assume for simplicity that A is not in the repo yet, but it does not matter):
a) explicit delete: put(A), delete(A), put(A), resulting in repo contents: PUT A, DEL A, PUT A
b) implicit delete: put(A), put(A), resulting in repo contents: PUT A, DEL A, PUT A
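A toy model of the invariant, using plain dicts instead of borg's data
structures:

    shadow_index = {}   # chunk id -> segments holding shadowed (dead) PUTs
    index = {}          # chunk id -> segment of the current live PUT

    def put(key, segment):
        if key in index:
            # implicit delete: the old PUT becomes shadowed, exactly as if
            # an explicit delete had happened first
            shadow_index.setdefault(key, []).append(index[key])
        index[key] = segment

    def delete(key):
        shadow_index.setdefault(key, []).append(index.pop(key))

    # both sequences leave shadow_index in the same state:
    # a) put('A', 1); delete('A'); put('A', 2)
    # b) put('A', 1); put('A', 2)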
cleaner teardown of contexts:
close the mmap, close src_fd (reading), close dst_fd (and rename).
maybe it was not a real problem to rename a file that is still open for reading / mmapped,
but in any case it is cleaner this way.
We have long used the SaveFile context manager in other places.
By using it, the original segment file stays in place until its recovery
is completed (writing/syncing into *.tmp).
On successful completion, the .tmp file is renamed over the original, plus dir syncing.
If aborted by some exception, including Ctrl-C, the original file is unmodified.
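The usage pattern, roughly (SaveFile is in borg's platform module; exact
import path and signature may differ):

    from borg.platform import SaveFile  # assumption: import location

    segment_filename = 'data/0/5'       # illustrative path
    recovered_data = b'...'             # the recovered segment contents

    with SaveFile(segment_filename, binary=True) as fd:
        fd.write(recovered_data)        # written into segment_filename + '.tmp'
    # success: .tmp is synced, renamed over the original, and the dir synced
    # exception (incl. Ctrl-C): the original segment file stays unmodified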
in borg 1.1, compact_segments() was always run directly after some repo-writing
operation (in the same borg process). but now, only "borg compact" is used to
compact segments, and it is a separate borg invocation (a new process), so we
need to persist the shadow_index so we do not lose that information.
if the rebuilt index size matched the on-disk index size AND there
was a difference in e.g. one key, the old code only output the key/value
for one index, but not what is present in the other index.
we already had better code in the branch for differing index sizes,
so just use that for both cases.
additionally, we now also report when the index sizes match, since we
report when there is a mismatch.
at least it does not crash now when committing.
the question why the compact map points to a missing segment file
is not answered yet; there might be another problem...
if an old hints file got converted to the new format and it
had entries referring to non-existent segment files, a crash
occurred.
with this code, the crash is avoided and the erroneous hints
entry is removed.
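The fix, sketched (the hints structure and existence check are
illustrative):

    import os

    def prune_stale_hints(hints: dict, segment_path) -> None:
        # drop entries referring to segment files that no longer exist,
        # instead of crashing on them later
        for segment in list(hints):
            if not os.path.exists(segment_path(segment)):
                del hints[segment]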
support platforms with no os.link, fixes #4901
if we don't have os.link, we just extract another copy instead of making a hardlink.
for that to work, we need to have (and keep) the chunks list in hardlink_masters.
we create the hardlink to be able to securely erase the old config file.
if we can't do that because there is just a problem with hardlinks not
working, the old config will just be overwritten normally (not securely
erased). the user will get a warning in that case, but other than that,
the overall borg operation will succeed.
if there is a bigger problem (like a general lack of permissions or a
general issue with the underlying fs), subsequent operations will fail.
- Created a batch file to build borg on windows
- Adjusted setup.py to be runnable on windows and build the windows
extension
- Extracted the free space check to a function in the platform module
- Created the minimal needed (dummy) functions for the windows platform
module
if the repo config is not there, we definitely have an invalid repo.
for other problems (like permission issues), we'll just let it blow
up with a traceback, so the user can see what the precise problem is.
drop BORG_HOSTNAME_IS_UNIQUE (please use BORG_HOST_ID if needed).
borg now always assumes it has a unique host id - either derived
automatically from the fqdn plus uuid.getnode(), or overridden via BORG_HOST_ID.
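Per the description, the automatic host id amounts to something like
(the separator is illustrative):

    import os, socket, uuid

    host_id = os.environ.get('BORG_HOST_ID') or f'{socket.getfqdn()}@{uuid.getnode()}'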
before this, it over-eagerly compacted "small" segments ("small"
being < 100MB by default) even if there were only a few bytes to be freed.
also:
- improve debug logging
- as compaction is a separate borg command now, use the module logger
intended as a last-resort measure to export all segment file contents
in a relatively easy-to-use format.
for when you want to dig into a damaged repo (e.g. missing segment files,
missing commits) and you know what you are doing.
note: dump-repo-objs --ghost must not use repo.list(),
because that would need the repo index and call the get_transaction_id and
check_transaction methods, which can easily fail on a damaged repo.
thus we use the same low-level scan method that we use anyway to get
some encrypted piece of data to set up the decryption "key".
(cherry picked from commit 8738e85967)
wrap msgpack to avoid trouble from future upstream API changes,
and so that we do not have to globally spoil our code with extra params.
make sure packing always uses use_bin_type=False,
thus generating the "old" msgpack format (as borg always did) from
bytes objects.
make sure unpacking always uses raw=True,
thus generating bytes objects.
note:
safe unicode encoding/decoding for some kinds of data types is done in the
Item class (see item.pyx), so it is enough if we take care of bytes objects
at the msgpack level.
also wrap the exception handling, so borg code can catch msgpack-specific
exceptions even where the upstream msgpack code raises overly generic
exceptions typed Exception, TypeError or ValueError.
we use our own exception classes for this; the upstream classes are deprecated.
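Condensed, the wrapper idea looks roughly like this (the real code is in
borg.helpers.msgpack; the exception class name here is illustrative):

    import msgpack as mp

    class UnpackException(Exception):
        """borg's own exception type, so callers need not know upstream's."""

    def packb(obj, **kw):
        kw['use_bin_type'] = False      # always produce the "old" msgpack format
        return mp.packb(obj, **kw)

    def unpackb(data, **kw):
        kw['raw'] = True                # always produce bytes objects
        try:
            return mp.unpackb(data, **kw)
        except Exception as exc:        # upstream raises overly generic types
            raise UnpackException(str(exc)) from exc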
special-case deleting / writing the manifest: it goes into a separate, new
segment file, so that when we supersede and compact it later, less
segment data has to be shuffled around - compaction can then just
delete this segment file and that's all.
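Sketched (MANIFEST_ID is borg's all-zero manifest id; the io object and
method names are simplified assumptions):

    MANIFEST_ID = b'\0' * 32

    def put(io, id: bytes, data: bytes) -> None:
        if id == MANIFEST_ID:
            # close the current segment so the manifest gets a fresh one;
            # compaction can later simply delete that small file
            io.close_segment()
        io.write_put(id, data)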
C code and the repo index use the uint32 type for segment file offsets,
so when opening a repo and the configured max_segment_size is too big,
fail early.
Also disallow setting a too-big value via "borg config".
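The early check amounts to (the limit follows from the uint32 offsets;
the error type is illustrative):

    MAX_SEGMENT_SIZE_LIMIT = 2**32 - 1   # segment file offsets are uint32

    def check_max_segment_size(max_segment_size: int) -> None:
        if max_segment_size > MAX_SEGMENT_SIZE_LIMIT:
            raise ValueError(f'max_segment_size too big: {max_segment_size} '
                             f'> {MAX_SEGMENT_SIZE_LIMIT}')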
When opening a repository, always try to read the magic number of the
latest segment and compare it to the Attic segment magic (unless the
repository is opened for upgrading). If an Attic segment is detected,
raise a dedicated exception, telling the user to upgrade the repository
first.
Fixes #1933.
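The detection, sketched (ATTIC_MAGIC per attic's segment format; the
exception name is illustrative):

    ATTIC_MAGIC = b'ATTICSEG'   # attic segment file magic (8 bytes)

    class AtticRepository(Exception):
        """tells the user to run the repo upgrade first"""

    def check_not_attic(latest_segment_path: str) -> None:
        with open(latest_segment_path, 'rb') as fd:
            if fd.read(len(ATTIC_MAGIC)) == ATTIC_MAGIC:
                raise AtticRepository('attic repository detected, please '
                                      'upgrade the repository first')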