403 Commits

Author SHA1 Message Date
Thomas Waldmann 8e87f1111b cleanup msgpack related str/bytes mess, fixes #968
see ticket and borg.helpers.msgpack docstring.

this changeset implements the full migration to
msgpack 2.0 spec (use_bin_type=True, raw=False).

still needed compat to the past is done via want_bytes decoder in borg.item.
2022-06-09 17:57:28 +02:00
Thomas Waldmann f8dbe5b542 cleanup msgpack related str/bytes mess, see #968
see ticket and borg.helpers.msgpack docstring.
2022-06-09 17:57:28 +02:00
Thomas Waldmann 32a3601e4a compute hlid from inode / device 2022-06-09 17:49:16 +02:00
Thomas Waldmann d3dfa3be30 use version 2 for new archives
but still be able to read v1 archives
for borg transfer.
2022-06-09 17:49:16 +02:00
Thomas Waldmann e5f1a4fb4d recreate: cachedir_masters not needed any more
now all hardlinked regular file items have chunks.
2022-05-18 14:20:01 +02:00
Thomas Waldmann 6bfdb3f630 refactor hardlink_master processing globally
borg now has the chunks list in every item with content.
due to the symmetric way how borg now deals with hardlinks using
item.hlid, processing gets much simpler.

but some places where borg deals with other "sources" of hardlinks
still need to do some hardlink management:
borg uses the HardLinkManager there now (which is not much more
than a dict, but keeps documentation at one place and avoids some
code duplication we had before).

item.hlid is computed via hardlink_id function.

support hardlinked symlinks, fixes #2379
as we use item.hlid now to group hardlinks together,
there is no conflict with the item.source usage for
symlink targets any more.

2nd+ hardlinks now add to the files count as did the 1st one.
for borg, now all hardlinks are created equal.
so any hardlink item with chunks now adds to the "file" count.

ItemFormatter: support {hlid} instead of {source} for hardlinks
2022-05-18 14:20:01 +02:00
Thomas Waldmann 98b7dc0bf5 transfer: clean item of attic 0.13 'acl' bug remnants
also: remove attic bug support code from borg check.

borg transfer removes the acl key. we do not run borg check on old repos.
2022-05-18 14:20:00 +02:00
Thomas Waldmann 1c0937958d show_progress: add finished=true/false to archive_progress json, fixes #6570
also:
- remove empty values from final json
- add test
2022-05-08 18:32:07 +02:00
Thomas Waldmann cc0e33da65 fix key.decrypt calls
the id must now always be given correctly because
the AEAD crypto modes authenticate the chunk id.

the special case when id == MANIFEST_ID is now handled
inside assert_id, so we never need to give a None id.
2022-05-02 20:56:50 +02:00
Thomas Waldmann e199f5bc6c metadata stream can produce all-zero chunks, fixes #6587
all-zero chunks are propagated as:
CH_ALLOC, data=None, size=len(zeros)

other chunks are:
CH_DATA, data=data, size=len(data)

also: remove the comment with the wrong assumption
2022-04-14 00:22:05 +02:00
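The two chunk shapes described in this message can be sketched as below. CH_ALLOC and CH_DATA are the names used in the commit message; their numeric values, the Chunk tuple and the classify() helper are illustrative assumptions, not borg's actual code.

```python
from collections import namedtuple

# CH_DATA / CH_ALLOC are named in the commit message; the values are assumptions.
CH_DATA, CH_ALLOC = 1, 2

# Hypothetical container for one chunk as handed on by a chunker.
Chunk = namedtuple("Chunk", "kind data size")


def classify(block: bytes) -> Chunk:
    """Return a CH_ALLOC chunk for all-zero input, otherwise a CH_DATA chunk."""
    if block == bytes(len(block)):
        # all-zero block: carry only its size, not the zero bytes themselves
        return Chunk(CH_ALLOC, None, len(block))
    return Chunk(CH_DATA, block, len(block))
```

A consumer that needs the bytes of a CH_ALLOC chunk can regenerate them from size alone.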
Thomas Waldmann b5f7f2376c check archives: improve error handling for corrupt archive metadata block
this is similar to #4777.

borg check must not crash if an archive metadata block does not decrypt.

Instead, report the archive_id, remove the archive from the manifest and skip to the next archive.
2022-04-12 17:47:43 +02:00
Thomas Waldmann ced3d8b9d5 check archive: make robust_iterator more robust, fixes #4777
borg check must not crash if an archive metadata chunk does not decrypt.

Instead, report the chunk and skip to the next one.
2022-04-12 17:47:32 +02:00
TW 28fa9e0f0b
Merge pull request #6523 from ThomasWaldmann/pax-borg-item-master
import/export-tar: --tar-format=BORG: roundtrip ALL item metadata
2022-04-09 20:22:36 +02:00
TW 1e213e93a3
Merge pull request #6544 from ThomasWaldmann/fix-progress-archivename-master
escape % chars in archive name, fixes #6500
2022-04-07 20:22:20 +02:00
Thomas Waldmann 911da7a1cf escape % chars in archive name, fixes #6500
also: fix percentage format for float value.
2022-04-07 18:07:50 +02:00
Björn Ketelaars e86fde5364 Fix OpenBSD symlink mode test failure (#2055)
OpenBSD does not have `lchmod()` causing `os.lchmod` to be unavailable
on this platform. As a result ArchiverTestCase::test_basic_functionality
fails when run manually (#2055).

OpenBSD does have `fchmodat()`, which has a flag that makes it behave
like `lchmod()`. In Python this can be used via `os.chmod(path, mode,
follow_symlinks=False)`.

As of Python 3.3 `os.lchmod(path, mode)` is equivalent to
`os.chmod(path, mode, follow_symlinks=False)`. As such, switching to the
latter is preferred as it enables more platforms to do the right thing.
2022-04-04 21:55:48 +02:00
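A minimal sketch of the portable call named in this message. The helper name chmod_no_follow() and its True/False return convention are assumptions made for illustration. On platforms that cannot change the mode of a symlink itself, Python raises NotImplementedError (or the OS reports ENOTSUP), which the sketch treats as "not applied".

```python
import errno
import os


def chmod_no_follow(path, mode):
    """Try to set the mode of path itself, never of a symlink target.

    Returns True if the mode was applied, False if the platform cannot do it.
    """
    try:
        os.chmod(path, mode, follow_symlinks=False)
    except NotImplementedError:
        return False
    except OSError as exc:
        if exc.errno in (errno.ENOTSUP, errno.EOPNOTSUPP):
            return False
        raise
    return True
```

Either way, the file the symlink points to keeps its mode.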
Thomas Waldmann e8069a8f80 import/export-tar: --tar-format=BORG: roundtrip ALL item metadata, fixes #5830
export-tar: just msgpack and b64encode all item metadata and
            put that into a BORG specific PAX header.
            this is *additional* to the standard tar metadata.

import-tar: when detecting the BORG specific PAX header, just get
            all metadata from there (and ignore the standard tar
            metadata).
2022-04-02 22:25:44 +02:00
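The roundtrip idea can be sketched with the standard library alone. This is not borg's implementation: json stands in for msgpack so that no third-party package is needed, and the header key BORG.item.meta as well as the function names are assumptions (the message only says "a BORG specific PAX header").

```python
import base64
import io
import json
import tarfile

# Hypothetical header key, chosen for illustration.
HEADER_KEY = "BORG.item.meta"


def export_member(name, data, metadata):
    """Return tar bytes holding one file plus its metadata in a PAX header."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        # json replaces msgpack here; the packed bytes are base64-encoded
        packed = json.dumps(metadata).encode("utf-8")
        info.pax_headers = {HEADER_KEY: base64.b64encode(packed).decode("ascii")}
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def import_member(tar_bytes):
    """Return (name, metadata) of the first member; metadata is None without header."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        info = tar.next()
        encoded = info.pax_headers.get(HEADER_KEY)
        if encoded is None:
            # a real importer would fall back to the standard tar metadata here
            return info.name, None
        return info.name, json.loads(base64.b64decode(encoded))
```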
Thomas Waldmann 78e92fa9e1 import/export-tar: --tar-format, support ctime/atime
--tar-format=GNU|PAX (default: GNU)

changed the tests which use GNU tar cli tool to use --tar-format=GNU
explicitly, so they don't break in case we change the default.

atime timestamp is only present in output if the archive item has it
(which is not the case by default, needs "borg create --atime ...").
2022-04-02 18:30:55 +02:00
Thomas Waldmann d3b78a6cf5 minor key.encrypt api change/cleanup
we already have .decrypt(id, data, ...).
i changed .encrypt(chunk) to .encrypt(id, data).

the old borg crypto won't really need or use the id,
but the new AEAD crypto will authenticate the id in future.
2022-03-26 17:05:57 +01:00
Thomas Waldmann 2bcee08b88 import-tar: fix mtime type bug
looks like with a .tar file created by the tar tool,
tarinfo.mtime is a float [s]. So, after converting to
nanoseconds, we need to cast to int because that's what
Item.mtime wants.

also added a safe_ns() there to clip values to the safe range.
2022-03-05 16:24:59 -05:00
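The conversion described here amounts to the following sketch. The clipping range is an illustrative assumption (signed 64-bit); the bounds used by borg's safe_ns() may differ, and mtime_to_ns() is a hypothetical name.

```python
# Illustrative clipping range, assumed for this sketch only.
MIN_NS = -(2**63)
MAX_NS = 2**63 - 1


def mtime_to_ns(mtime_seconds):
    """Convert a float timestamp in seconds to clipped integer nanoseconds."""
    ns = int(mtime_seconds * 1_000_000_000)  # cast: the result must be an int
    return max(MIN_NS, min(ns, MAX_NS))
```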
Thomas Waldmann cbeef56454 pyupgrade --py38-plus ./**/*.py 2022-02-27 20:11:56 +01:00
TW 4896fe1560
Merge pull request #6296 from ThomasWaldmann/cache-pre12-archive-meta
info: use a pre12-meta cache to accelerate stats for borg < 1.2 archives
2022-02-14 18:29:17 +01:00
Thomas Waldmann a2fb9cde4e calc_stats progress display: add archive name 2022-02-14 18:00:02 +01:00
Thomas Waldmann 25e27a1539 info: use a pre12-meta cache to accelerate stats for borg < 1.2 archives
first time borg info is invoked on a borg 1.1 repo, it can take
a rather long time computing and caching some stats values for
1.1 archives, which borg 1.2 archives have in their archive
metadata structure. be patient, esp. if you have lots of old
archives.

following invocations are much faster.
2022-02-14 18:00:02 +01:00
Tomás Andrighetti a2ae36bb54 Exclude directories in is_hardlink_master 2022-02-13 19:23:40 -03:00
Thomas Waldmann 5064ec3c9a fix hardlinkable file type check, fixes #6037 2021-11-16 14:36:43 +01:00
Jim Paris 7a0ffed7f0 create: fix passing device nodes and symlinks to --paths-from-stdin
Paths that come from --paths-from-stdin or --paths-from-command don't
have a parent_fd or name, so we need to use the os_stat helper that
falls back on the full path if those are missing.

Fixes borgbackup/borg#6009
2021-10-14 11:46:10 -04:00
Thomas Waldmann 506c01dc8f import-tar: fix empty user/group name in TarInfo, fixes #5853
if the tar has no information about user/group name (empty string),
we must assign None to Item.user/group (not the empty string).
2021-06-17 15:59:41 +02:00
Thomas Waldmann 4572974218 fix missing parameter in "did not consistently fail" msg, see #5822 2021-06-15 23:10:37 -05:00
Thomas Waldmann b0af91837d minor fixes 2021-06-14 16:03:49 +02:00
Thomas Waldmann fb2efd88fe implement TarfileObjectProcessors similar to FilesystemObjectProcessors 2021-06-14 15:37:58 +02:00
Elmar Hoffmann 938e7f295c add progress indicator for archive check
Depending on the number of archives in a repository, the archive check part
of the check operation can take some time, so it should have a progress
indicator as well.
2021-05-15 23:15:31 +02:00
Thomas Waldmann 76dfd64aba create/recreate: print preliminary file status early, fixes #5417
if we back up stdin / pipes / regular files (or devices with --read-special),
that may take longer, depending on the amount of content data (could be many GiBs).

usually borg announces file status AFTER backing up the file,
when the final status of the file is known.

with this change, borg announces a preliminary file status before
the content data is processed. if the file status changes afterwards,
e.g. due to an error, it will also announce that as final file status.
2021-04-30 20:34:13 +02:00
Romain Vimont 9ddcfaf4f7 info / create --stats: add --iec option
If --iec is passed, then sizes are expressed in powers of 1024
instead of 1000.
2021-04-28 15:17:40 +02:00
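The difference between the two unit systems can be shown with a small formatter. format_size() and its exact output format are assumptions for illustration, not borg's formatting code.

```python
def format_size(num_bytes, iec=False):
    """Format a byte count using powers of 1000 (default) or 1024 (iec=True)."""
    base = 1024 if iec else 1000
    units = ["B", "KiB", "MiB", "GiB", "TiB"] if iec else ["B", "kB", "MB", "GB", "TB"]
    value = float(num_bytes)
    for unit in units[:-1]:
        if abs(value) < base:
            return f"{value:.2f} {unit}"
        value /= base
    return f"{value:.2f} {units[-1]}"
```

For 1500 bytes this gives "1.50 kB" by default and "1.46 KiB" with iec=True.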
Thomas Waldmann dec1664a7e missing / healed chunks: always tell chunk ID, fixes #5704 2021-04-19 23:46:21 +02:00
Thomas Waldmann 6f9b9e5a53 s/numeric_owner/numeric_ids/g 2021-04-16 15:02:16 +02:00
Thomas Waldmann bbccdbd81c mount: implement --numeric-owner (default: False!), fixes #2377
this is different default behaviour than in borg < 1.2:

default (numeric_owner=False) is to use the user/group name from the archive,
look up the local uid / gid and then use that for the FUSE fs.

when --numeric-owner is given (numeric_owner=True), then the uid/gid
from the archive is directly used (as it was the default behaviour in
borg < 1.2).

this was implemented like this (changing the default behaviour) to make
borg mount and borg extract behave more similarly considering usage of
user/group numeric archived ids or archived names mapped to corresponding
numeric local system ids.

also, both now use the same function to get the uid/gid from the item.

fuse:
- add user and group name entries to default_dir
- also: set internal_dict(!) of new Item with data from Item.as_dict()
2021-03-07 18:16:23 +01:00
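The uid selection described above might look like this sketch (Unix only, because it uses the pwd module). The function names, the injectable lookup parameter and the fallback to the archived uid for unknown names are assumptions for illustration.

```python
import pwd


def _lookup_uid(name):
    """Return the local uid for a user name, or None if the name is unknown."""
    try:
        return pwd.getpwnam(name).pw_uid
    except KeyError:
        return None


def effective_uid(archived_user, archived_uid, numeric_owner=False, lookup=_lookup_uid):
    """Pick the uid to present: archived number, or local uid of the archived name."""
    if numeric_owner or not archived_user:
        return archived_uid
    local_uid = lookup(archived_user)
    # assumption: fall back to the archived uid if the name does not exist locally
    return archived_uid if local_uid is None else local_uid
```

The same selection would apply to group names and gids.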
Thomas Waldmann 2211b840a3 verbose files cache logging via --debug-topic=files_cache, fixes #5659 2021-02-28 22:39:44 +01:00
Thomas Waldmann d4971e2819 some micro-opts in stat_ext_attrs 2021-02-16 23:24:05 +01:00
Thomas Waldmann 1b65db990d create/extract: add --noxattrs option, #3955
when given with borg create, borg will not get xattrs from input files (and thus, it will not archive xattrs).

when given with borg extract, borg will not read xattrs from archive and it will not set xattrs on extracted files.
2021-02-16 23:20:28 +01:00
Thomas Waldmann 9412a8430e create/extract: add --noacls option, #3955
when given with borg create, borg will not get ACLs from input files (and thus, it will not archive ACLs).

when given with borg extract, borg will not read ACLs from archive and it will not set ACLs on extracted files.
2021-02-16 22:43:08 +01:00
Manu a84ead8e7c Pass args.log_json to FilesystemObjectProcessors/Statistics instance 2021-02-07 10:42:46 +08:00
Thomas Waldmann 6dc334422e fixup: improve comment about assumptions in the item metadata stream chunker 2021-01-15 21:51:15 +01:00
Thomas Waldmann 8162e2e67b cached_hash is only used in archive, move it there 2021-01-14 20:50:12 +01:00
Thomas Waldmann be257728ca move zeros to constants module 2021-01-14 20:02:18 +01:00
Thomas Waldmann 3b9798cffc remove max_chunk_size (unused) 2021-01-14 19:56:39 +01:00
Thomas Waldmann ef19d937ed use cached_hash also to generate all-zero replacement chunks
at least for major amounts of fixed-size replacement hashes,
this will be much faster. also less memory management overhead.
2021-01-08 23:39:53 +01:00
Thomas Waldmann f3088a9893 rename chunk_to_id_data to cached_hash 2021-01-08 23:39:53 +01:00
Thomas Waldmann 92f221075a refactor recreate to use chunk_to_id_data 2021-01-08 23:39:53 +01:00
Thomas Waldmann b3659e0b8c reuse chunker.zeros for sparse extraction 2021-01-08 23:39:53 +01:00
Thomas Waldmann 9fd284ce1a refactor new zero chunk handling to be reusable 2021-01-08 23:39:53 +01:00
Thomas Waldmann 6d0f9a52eb detect all-zero chunks, avoid hashing them
comparing zeros is quicker than hashing them.
the comparison should fail quickly inside non-zero data.
2021-01-08 17:40:06 +01:00
Thomas Waldmann 52bd55b29a integrate Chunk type, avoid hashing holes 2021-01-08 17:39:51 +01:00
Thomas Waldmann b8bb0494f6 create --sparse, file map support for the "fixed" chunker, see #14
a file map can be:

- created internally inside chunkify by calling sparsemap, which uses
  SEEK_DATA / SEEK_HOLE to determine data and hole ranges inside a
  seekable sparse file.
  Usage: borg create --sparse --chunker-params=fixed,BLOCKSIZE ...
  BLOCKSIZE is the chunker blocksize here, not the filesystem blocksize!

- made by some other means and given to the chunkify function.
  this is not used yet, but in future this could be used to only read
  the changed parts and seek over the (known) unchanged parts of a file.

sparsemap: the generated range sizes are multiples of the fs block size.
           the tests assume 4kiB fs block size.
2020-12-27 22:06:08 +01:00
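How SEEK_DATA / SEEK_HOLE can be used to build such a map is sketched below. This is an independent illustration, not borg's sparsemap: the name sparse_ranges() and the (start, length, is_data) tuple layout are assumptions, and the sketch needs a platform that provides os.SEEK_DATA (for example Linux).

```python
import errno
import os


def sparse_ranges(fd):
    """Yield (start, length, is_data) ranges covering the whole file behind fd.

    Note: this moves the file position of fd.
    """
    size = os.fstat(fd).st_size
    pos = 0
    while pos < size:
        try:
            data_start = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno != errno.ENXIO:
                raise
            # ENXIO: no data after pos, so the rest of the file is one hole
            yield (pos, size - pos, False)
            return
        if data_start > pos:
            yield (pos, data_start - pos, False)  # hole before the next data
        data_end = os.lseek(fd, data_start, os.SEEK_HOLE)
        yield (data_start, data_end - data_start, True)
        pos = data_end
```

On filesystems without hole support, the whole file is reported as a single data range.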
Thomas Waldmann 24d3400dd4 borg export-tar: fix memory leak with ssh: remote repository, fixes #5568
also: added a comment how to avoid this kind of memory leak.
2020-12-17 22:55:13 +01:00
Guinness 9052c1cc54
Add repo location to the stats in borg create 2020-12-16 13:46:29 +01:00
Lapinot 34f6cfcd81
Outsource recursive directory walking (#5492)
Split recursive directory walking/processing into walking and item processing.
2020-11-15 15:31:01 +01:00
Phil Kulin c0504c0669 create: implement --stdin-mode, --stdin-user and --stdin-group, #5333 2020-11-01 20:45:56 +03:00
Thomas Waldmann 0839ac3034 prettier error message when archive gets too big, fixes #5307 2020-09-08 21:00:27 +02:00
Thomas Waldmann d2536de4ee fix hardlinked CACHEDIR.TAG processing, fixes #4911 2020-06-14 22:00:02 +02:00
Thomas Waldmann dee402652f --read-special: .part files also should be regular files, fixes #5217 2020-06-14 15:36:22 +02:00
Peter Gerber 00b09370c0
Allow creating archives using stdout of given command (#5174)
allow creating archives using stdout of given command

In addition to allowing:

some-command --param value | borg create REPO::ARCH -

also allow:

borg create --content-from-command REPO::ARCH -- some-command --param value

The difference is that the latter approach deals with errors properly.
In the former example, an archive is created no matter what. Even if
`some-command` aborts and the output is truncated, Borg won't realize.
In the latter example, the status code is checked and archive creation
is aborted properly when appropriate.
2020-06-02 22:24:14 +02:00
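The point about status codes can be sketched with subprocess. read_command_output() is a hypothetical helper, not borg code; it reads all output into memory for brevity, whereas a backup tool would stream it.

```python
import subprocess


def read_command_output(argv):
    """Run argv, return its stdout, and fail if the command exits non-zero."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE) as proc:
        data = proc.stdout.read()
    # leaving the with-block waits for the process, so returncode is set here
    if proc.returncode != 0:
        raise RuntimeError(f"command failed with status {proc.returncode}")
    return data
```

With a shell pipe, the reader only sees the end of the stream and cannot tell a truncated output from a complete one.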
Elmar Hoffmann dad3aa9dae rename local preload() function to not overwrite keyword argument of same name
The locally defined preload() function overwrites the preload boolean keyword
argument, always evaluating to true, so preloading is done, even when not
requested by the caller, causing a memory leak.
Also move its definition outside of the loop.

This issue was found by Antonio Larrosa in borg issue #5202.
2020-06-01 17:12:51 +02:00
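The shadowing problem described in this message can be reproduced in a few lines. Both functions are illustrative stand-ins written for this sketch, not the borg code that was fixed.

```python
def fetch_buggy(ids, preload=False):
    """The nested def rebinds 'preload', so the flag is lost."""
    preloaded = []
    for id_ in ids:
        def preload(x):  # overwrites the boolean keyword argument
            preloaded.append(x)
        if preload:  # a function object is always truthy
            preload(id_)
    return preloaded


def fetch_fixed(ids, preload=False):
    """Renamed helper, defined once outside the loop; the flag stays intact."""
    preloaded = []

    def _preload(x):
        preloaded.append(x)

    for id_ in ids:
        if preload:
            _preload(id_)
    return preloaded
```

fetch_buggy([1, 2], preload=False) still preloads both ids, while fetch_fixed honours the flag.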
Thalian 08a7661e67 [FEATURE] #4489 – Deprecate --nobsdflags option
Replaced by --noflags. In internal data structure the key 'bsdflags' is kept for backwards compatibility.
2020-03-25 06:35:15 +01:00
Thomas Waldmann 046dea8643 check: do not stumble over invalid item key, fixes #4845
The code used for error reporting crashes due to an invalid utf-8
sequence. Use errors='replace' to never crash there. Errors
are expected in input data when borg check is run.
2020-03-09 00:12:36 +01:00
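The effect of errors='replace' in error reporting, as a minimal sketch (describe_key() is a hypothetical helper name):

```python
def describe_key(raw_key: bytes) -> str:
    """Decode a possibly corrupt key for a log message without ever raising."""
    return raw_key.decode("utf-8", errors="replace")
```

Invalid bytes come out as U+FFFD replacement characters instead of raising UnicodeDecodeError.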
TW 597b09a993 support platforms with no os.link (#4903)
support platforms with no os.link, fixes #4901

if we don't have os.link, we just extract another copy instead of making a hardlink.

for that to work, we need to have (and keep) the chunks list in hardlink_masters.
2020-03-03 23:34:54 -05:00
Thomas Waldmann a8831f4978 fix ProgressIndicator msgids, fixes #4935
add some to code, fix docs.
2020-03-03 23:57:36 +01:00
Rémi Oudin a029d686b5 Borg recreate timestamp is a no op (#4815)
recreate: support --timestamp option, fixes #4745
2019-11-16 11:03:34 +01:00
TW aa7df50a2d
Merge pull request #4635 from ThomasWaldmann/ctrlc-checkpoint
first ctrl-c: checkpoint and abort, fixes #4606
2019-09-06 21:44:07 +02:00
Thomas Waldmann cb2d31ed98 fix partial extract for hardlinked contentless file types, fixes #4725
if the file is not a regular file, but a hardlink slave with a not
extracted hardlink master, chunks will be None and we must not call
preload(chunks).

(cherry picked from commit 291d58efa1)
2019-08-27 19:20:20 +05:30
Thomas Waldmann 9732fe4965 special behaviour on first ctrl-c, fixes #4606
like:
 - try saving a checkpoint if borg create is ctrl-c-ed
2019-08-25 22:49:09 +02:00
Jürg Rast bff97a99e1 Windows specific directory handling
On windows os.open does not work for directories.
If borg tries to open a directory on windows, None is returned
as file descriptor. The archive and archiver were adjusted to
2019-08-24 10:17:18 +02:00
Thomas Waldmann 71c7efd17c extract: fix KeyError for "partial" extraction, fixes #4607
note that "partial" even applied to giving an always matching condition.

"full" is only assumed if no conditions are given.
2019-06-10 20:18:44 +02:00
Thomas Waldmann f33f318d81 preload chunks for hardlink slaves w/o preloaded master, fixes #4350
also split the hardlink extraction test into 2 tests.
2019-05-06 02:06:58 +02:00
Thomas Waldmann 502ebe63be delete archive: consider part files correctly for stats, see #4507 2019-04-19 19:29:30 +02:00
Thomas Waldmann cd4f6b41ca create: only run stat_simple_attrs() once
the second call was done in stat_attrs().

this improves backup performance by ~5% when there are lots of unchanged files.
2019-04-08 21:34:09 +02:00
Thomas Waldmann b3751b107d determine whether a file has changed while being backed up, fixes #1750 2019-03-11 22:55:27 +01:00
Thomas Waldmann 6809f6f7fa calc_stats: use archive stats metadata, if available
by default, we still have to compute unique_csize the slow way,
but the code offers want_unique=False param to not compute it.
2019-02-23 15:05:07 +01:00
Thomas Waldmann e569595974 include size/csize/nfiles[_parts] stats into archive, fixes #3241 2019-02-23 15:05:07 +01:00
Thomas Waldmann 23eeded7c5 fix --read-special behaviour: follow symlinks pointing to special files
also: added a test for this.
2019-02-20 10:13:09 +01:00
Thomas Waldmann ec17f0a607 check for stat race conditions, see #908
we must avoid a handler processing a fs item of wrong file type,
so check if it has changed.
2019-02-20 09:16:57 +01:00
Thomas Waldmann 39922e88e5 micro-opt: get xattrs directly before acls
on linux, acls are based on xattrs, so do these closeby:

1. listxattr -> keys (without acl related keys)
2. for all keys: getxattr
3. acl-related getxattr by acl library
2019-02-17 02:46:03 +01:00
Thomas Waldmann 85b711fc88 opening device files is troublesome, don't do it
for fd-based operations, we would have to open the file, but for
char / block devices this has unwanted effects, even if we do not
read from the device.

thus, we use path (or dir_fd + name) based ops here.
2019-02-14 09:20:04 +01:00
Thomas Waldmann 833c49f834 use *at style functions (e.g. openat, statat) to avoid races
races via changing path components can be avoided by opening the
parent directory and using parent_fd + file_name combination with
*at style functions to access the directories' contents.
2019-02-14 09:20:04 +01:00
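In Python, the parent_fd + file_name combination maps to the dir_fd parameter of the os functions. The sketch below assumes a Unix platform where os.stat supports dir_fd and os.listdir accepts a file descriptor; stat_entries() is a hypothetical helper.

```python
import os


def stat_entries(dir_path):
    """Map each entry name in dir_path to its stat result, via the parent fd."""
    parent_fd = os.open(dir_path, os.O_RDONLY)
    try:
        result = {}
        for name in os.listdir(parent_fd):
            # resolved relative to parent_fd, so a later rename of a path
            # component above the directory cannot redirect the lookup
            result[name] = os.stat(name, dir_fd=parent_fd, follow_symlinks=False)
        return result
    finally:
        os.close(parent_fd)
```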
Thomas Waldmann ad5b9a1dfd _process / process_*: change to kwargs only
we'll add/remove some args soon, so many pos args would be just bad.
2019-02-14 09:20:03 +01:00
Thomas Waldmann 8220c6eac8 move/refactor Archive._open_rb function to helpers.os_open
also:
- add and use OsOpen context manager
- add O_NONBLOCK, O_NOFOLLOW, O_NOCTTY (inspired by gnu tar)
2019-02-14 09:20:03 +01:00
Thomas Waldmann 677102f292 process_file: avoid race condition: stat data vs. content
always open the file and then do all operations with the fd:
- fstat
- read
- get xattrs, acls, bsdflags
2019-02-14 09:20:03 +01:00
Thomas Waldmann ac0803fe0b chunker algorithms: use constants to avoid typos 2019-02-13 04:36:09 +01:00
Thomas Waldmann c4ffbd2a17 prepare to support multiple chunkers 2019-02-13 04:24:14 +01:00
TW b204201fb5
Merge pull request #4302 from ThomasWaldmann/repair-output
add archive name to check --repair output, fixes #3447
2019-02-04 03:29:58 +01:00
TW c3f40de606
cache_sync: compute size/count stats, borg info: consider part files (#4286)
cache_sync: compute size/count stats, borg info: consider part files

fixes #3522
2019-02-04 03:26:45 +01:00
Thomas Waldmann 18b62f63a6 add archive name to check --repair output, fixes #3447
so it does not look like duplicated and also informs the user about
affected archives.
2019-02-01 23:30:45 +01:00
Emmo Emminghaus 733a2bfa30 Introduce borg.platformflags.is_<os> 2018-11-10 23:34:43 +01:00
Emmo Emminghaus 558ca61d20 remove posix issues and fixup for unsupported methods 2018-11-10 21:48:46 +01:00
Emmo Emminghaus b997d5ba5b move code from borg.helpers.usergroup to borg.platform.posix 2018-11-10 21:43:45 +01:00
Thomas Waldmann 10cdadb2f8 flake8: fix F841 2018-10-29 12:36:03 +01:00
Thomas Waldmann 3c173cc03b wrap msgpack, fixes #3632, fixes #2738
wrap msgpack to avoid future upstream api changes making troubles
or that we would have to globally spoil our code with extra params.

make sure the packing is always with use_bin_type=False,
thus generating "old" msgpack format (as borg always did) from
bytes objects.

make sure the unpacking is always with raw=True,
thus generating bytes objects.

note:

safe unicode encoding/decoding for some kinds of data types is done in Item
class (see item.pyx), so it is enough if we care for bytes objects on the
msgpack level.

also wrap exception handling, so borg code can catch msgpack specific
exceptions even if the upstream msgpack code raises way too generic
exceptions typed Exception, TypeError or ValueError.
We use own Exception classes for this, upstream classes are deprecated
2018-08-06 17:32:55 +02:00
Thomas Waldmann d2e2f1b89d call socket.gethostname only once 2018-08-04 17:40:40 +02:00
Thomas Waldmann de4afa097c separate borg compact command, fixes #2195 2018-07-14 14:29:28 +02:00
Thomas Waldmann 13e6970437 create: do not give chunker a py file object, it is not needed
the os level file handle is enough, the chunker will prefer it if
valid and won't use the file obj, so we can give None there.

this saves these unneeded syscalls:

fstat(5, {st_mode=S_IFREG|0664, st_size=227063, ...}) = 0
ioctl(5, TCGETS, 0x7ffd635635f0)  = -1 ENOTTY (Inappropriate ioctl for device)
lseek(5, 0, SEEK_CUR)             = 0
2018-07-07 18:06:57 +02:00
Thomas Waldmann 018b62c845 bsdflags: use fd instead of path
this optimization is only needed for linux, the bsd-like platforms
do not need an open file to run a ioctl against, but have bsdflags
in the stat result already.

on linux, this optimization saves 1 file open/close per input file.
2018-07-07 17:30:17 +02:00
Thomas Waldmann 7e47e68e29 acls: use fd instead of path 2018-07-07 17:02:37 +02:00
Thomas Waldmann 113b0eabec xattr: use fd for get_all
when processing regular files, use a fd to query xattrs.

when the file was modified and we chunked it, we have it open anyways.

if not, we open the file once and then query xattrs, in the hope that
this is more efficient than the path based calls.

guess it is less prone to race conditions in any case.
2018-07-07 15:47:56 +02:00
Thomas Waldmann 394d59e6d8 xattr: implement set_all to complement get_all
also: follow_symlinks param defaults to False (we never use True)

fix tests, xattrs are set via FD now.
2018-07-07 15:47:56 +02:00
Thomas Waldmann c29c3063b0 xattr: use bytes typed path for listxattr, getxattr, setxattr 2018-07-07 15:47:56 +02:00
Thomas Waldmann 9deb90db71 xattr: use bytes typed names for listxattr, getxattr, setxattr 2018-07-07 15:47:56 +02:00
Thomas Waldmann b5a9ac5682 xattr: use bytes typed values for listxattr, getxattr, setxattr
- getxattr should only return bytes, not None
- setxattr should not get a None value, just bytes
- remove unneeded tmp vars
2018-07-07 15:47:56 +02:00
Thomas Waldmann de113bab23 move capacity calculation to IndexBase, fixes #2646
we just give how many "usable" hashtable entries we want and it computes
the hashtable capacity internally via int(usable / MAX_LOAD_FACTOR).
2018-06-12 22:25:27 +02:00
Thomas Waldmann e064fcd99b borg check: show progress while rebuilding missing manifest, fixes #3787
(cherry picked from commit 85bc590c75)
2018-05-19 01:28:55 +02:00
Thomas Waldmann 7792cec03a borg check: fixup for "deleting orphaned objs" msgs, fixes #3795
only output msgs if there is actually something to delete.
be more precise, show count of orphaned / superseded objects.

(cherry picked from commit d671e9acf2)
2018-05-18 22:05:38 +02:00
Thomas Waldmann be4fdee3ae more borg check --repair output
(cherry picked from commit e6e1d18f9a)
2018-05-18 22:03:03 +02:00
Thomas Waldmann 1ee4397c1c xattrs: fix borg exception handling on ENOSPC error, fixes #3808
(cherry picked from commit 959beb867b)
2018-05-18 17:27:51 +02:00
TW b80dfc727e
Merge pull request #3725 from ThomasWaldmann/issue-3448
set rc=1 when extracting damaged files, fixes #3448
2018-03-25 20:47:37 +02:00
Thomas Waldmann 232f051c10
cleanup: move "processing files" message to expected place
(now possible as we do not lazy load the files cache any more)
2018-03-24 17:04:20 -07:00
Thomas Waldmann e2f71b5dc3
cleanup: get rid of ignore_inode, replace with cache_mode
ignore_inode == ('i' not in cache_mode)  # i)node
2018-03-24 17:04:20 -07:00
Thomas Waldmann b1e7e7f90a
cleanup: get rid of Cache.do_files, replace with cache_mode
not do_files == (cache_mode == 'd')  # d)isabled
2018-03-24 17:04:20 -07:00
Thomas Waldmann 91e5e231f1
read files cache early, init checkpoint timer after that, see #3394
reading the files cache can take considerable amount of time (a user
reported 1h 42min for a 700MB files cache for a repo with 8M files and
15TB total), so we must init the checkpoint timer after that or borg
will create the checkpoint too early.

creating a checkpoint means (among other stuff) saving the files cache,
which will also take a lot of time in such a case, one time too much.

doing this in a clean way required some refactoring:
- cache_mode is now given to Cache initializer and stored in instance
- the files cache is loaded early in _do_open (if needed)
2018-03-24 17:04:13 -07:00
Thomas Waldmann 1c97efd81e set rc=1 when extracting damaged files, fixes #3448
- size inconsistencies
- file has all-zero replacement chunks

introduced new BackupError exception. when raised while extracting
files, gets handled via emitting a warning, setting rc=1 and
proceeding to next file.
2018-03-25 00:21:06 +01:00
Thomas Waldmann dc48377dc6
fix Archive's checkpoint_interval arg default (300 -> 1800s)
the commandline arg default was already at 1800, so likely this is
only a cosmetic fix.
2018-03-24 16:05:05 -07:00
Thomas Waldmann f979349f07 fix borg recreate --progress (broken by previous commit)
fixup for cb7887836a
2018-03-10 15:41:01 +01:00
Rémi Oudin cb7887836a Fix --progress option. (#3557)
Fix --progress option, fixes #3431
2018-03-10 15:11:08 +01:00
Thomas Waldmann 4e0f369d0a fix borg create never showing M status
the problem was that the upper layer code did not have enough information
about the file, whether it is known or not - and thus, could not decide
correctly whether status should be M)odified or A)dded.

now, file_known_and_unchanged method returns an additional "known"
boolean to fix this.

also: add comment about files cache loading in cache_mode='r'
2018-02-26 11:07:20 +01:00
Alexander 'Leo' Bergolth 74c10e4643 add chunker_params to archive info (at least to json output) 2018-01-25 21:02:39 +01:00
Thomas Waldmann 57a2d920cb check --repair: fix malfunctioning validator, fixes #3444
the major problem was the ('path' in item) expression.
the dict has bytes-typed keys there, so it never succeeded as it
looked for a str key. this is a 1.1 regression, 1.0 was fine.

the dict -> StableDict change is just for being more specific,
the check triggered correctly as StableDict subclasses dict,
it was just a bit too general.

(cherry picked from commit e09892caec)
2017-12-16 21:44:35 +01:00
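Why the ('path' in item) expression never succeeded can be seen in a tiny sketch; both validator functions are illustrative, not the borg code:

```python
def has_path_buggy(item):
    """Looks for a str key, which a dict with bytes keys never contains."""
    return "path" in item


def has_path_fixed(item):
    """Looks for the bytes key that is actually present."""
    return b"path" in item
```

In Python 3, "path" and b"path" never compare equal, so the str lookup on a bytes-keyed dict is always False.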
Sam H b0141c1dc9 include item birthtime in archive (where available) (#3313)
include item birthtime in archive, fixes #3272

* use `safe_ns` when reading birthtime into attributes
* proper order for `birthtime` in `ITEM_KEYS` list
* use `bigint` wrapper for consistency
* Add tests to verify that birthtime is normally preserved, but not preserved when `--nobirthtime` is passed to `borg create`.
2017-11-13 14:55:10 +01:00
Thomas Waldmann 66cd1cd240 stats: do not count data volume twice when checkpointing, fixes #3224 2017-11-05 00:48:17 +01:00
TW 41ccd3d7d1
Merge pull request #3266 from ThomasWaldmann/set-bsdflags-last
set bsdflags last (include immutable flag), fixes #3263
2017-11-04 20:10:34 +01:00
Thomas Waldmann 7aafcc517a recreate: move chunks_healthy when excluding hardlink master, fixes #3228 2017-11-04 18:39:00 +01:00
Thomas Waldmann 90186ad12b get rid of already existing invalid chunks_healthy metadata, see #3218 2017-11-04 18:39:00 +01:00
Thomas Waldmann 7211bb2211 get rid of chunks_healthy when rechunking, fixes #3218 2017-11-04 18:39:00 +01:00
Thomas Waldmann 2c6f9634bc set bsdflags last (include immutable flag), fixes #3263 2017-11-04 15:18:55 +01:00
Thomas Waldmann 427e2ca5fb borg create: fix stats
master branch only (not present in 1.1-maint):

stats were computed at 2 different places, but the summing up was missing.
2017-11-02 18:06:39 +01:00
TW 38dd1f11ac Merge pull request #3181 from ThomasWaldmann/hardlinked-symlink-warning
remove hardlinked symlink warning, update docs
2017-10-17 21:30:53 +02:00
Thomas Waldmann 10adadf685 implement --nobsdflags and --exclude-nodump, fixes #3160
do not read/archive bsdflags: borg create --nobsdflags ...
do not extract/set bsdflags: borg extract --nobsdflags ...

use cases:

- fs shows wrong / random bsdflags (bug in filesystem)
- fs does not support bsdflags anyway
- already archived bsdflags are wrong / unwanted
- borg shows any sort of unwanted effect due to get_flags, esp. on Linux

the nodump flag ("do not backup this file") is not honoured any more by
default because this functionality (esp. if it happened by error or
unexpected) was rather confusing and unexplainable at first to users.

if you want that "do not backup NODUMP-flagged files" behaviour, use:
borg create --exclude-nodump ...
2017-10-17 18:45:32 +02:00
Thomas Waldmann e674822888 remove hardlinked symlinks warning, update docs, fixes #3175
the warning was annoying for people with a lot of such items and
they can not do anything about it anyway.

thus, just document this as a limitation.
2017-10-17 18:34:32 +02:00
Thomas Waldmann 9d6b125e98 borg recreate: correctly compute part file sizes, fixes #3157
when doing in-file checkpointing, borg creates *.borg_part_N files.
complete_file = part_1 + part_2 + ... + part_N

the source item for recreate already has a precomputed (total) size
member, thus we must force recomputation from the (partial) chunks
list to correct the size to be the part's size only.

borg create avoided this problem by computing the size member after
writing all the parts. this is now not required any more.

the bug is mostly cosmetic, borg check will complain, borg extract on
a part file would also complain. but all the complaints only refer to
the wrong metadata of the part files, the part files' contents are
correct.

usually you will never extract or look at part files, but only deal
with the full file, which will be completely valid, all metadata and
content.

you can get rid of the archives with these cosmetic errors by running
borg recreate on them with a fixed borg version. the old part files
will get dropped (because they are usually ignored) and any new part
file created due to checkpointing will be correct.
2017-10-14 04:24:26 +02:00
TW 13a4439bb8 Merge pull request #3120 from ThomasWaldmann/fix-nonlocal-path-detection
fix detection of non-local path, fixes #3108
2017-10-11 01:01:17 +02:00
Thomas Waldmann 60e9249100 fix detection of non-local path, fixes #3108
filenames like ..foobar are valid, so, to detect stuff in upper dirs,
we need to include the path separator and check if it starts with '../'.
2017-10-10 01:36:44 +02:00
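The check described here, as a sketch. points_above() is a hypothetical name, and normalising the path first is an assumption of this sketch rather than something the commit message states.

```python
import os


def points_above(path):
    """Return True if a relative path leads into an upper directory."""
    norm = os.path.normpath(path)
    # '..foobar' is an ordinary file name, so test for '..' plus the separator
    return norm == os.pardir or norm.startswith(os.pardir + os.sep)
```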
Thomas Waldmann 9d3daebd5f recreate: don't crash on attic archives w/o time_end, fixes #3109 2017-10-10 01:17:56 +02:00
Thomas Waldmann 5e2de8ba67 implement files cache mode control, fixes #911
You can now control the files cache mode using this option:

--files-cache={ctime,mtime,size,inode,rechunk,disabled}*

(only some combinations are supported)

Previously, only these modes were supported:
- mtime,size,inode (default of borg < 1.1.0rc4)
- mtime,size (by using --ignore-inode)
- disabled (by using --no-files-cache)

Now, you additionally get:
- ctime alternatively to mtime (more safe), e.g.:
  ctime,size,inode (this is the new default of borg >= 1.1.0rc4)
- rechunk (consider all files as changed, rechunk them)

Deprecated:
- --ignore-inode (use modes without "inode")
- --no-files-cache (use "disabled" mode)

The tests needed some changes:
- previously, we used os.utime() to set a file's mtime (atime) to specific
  values, but that does not work for ctime.
- now use time.sleep() to create the "latest file" that usually does
  not end up in the files cache (see FAQ)
2017-10-01 00:52:32 +02:00
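How such a mode might drive the "is this file unchanged?" decision is sketched below. The mode names come from the option above; FileEntry, parse_mode() and file_unchanged() are hypothetical and do not reflect borg's internal representation.

```python
from collections import namedtuple

# Hypothetical record of what a files cache remembers about one file.
FileEntry = namedtuple("FileEntry", "inode size ctime_ns mtime_ns")


def parse_mode(value):
    """Turn 'ctime,size,inode' into a set of mode names."""
    return frozenset(part.strip() for part in value.split(",") if part.strip())


def file_unchanged(cached, current, mode):
    """Compare only the attributes selected by mode."""
    if "disabled" in mode or "rechunk" in mode:
        return False  # every file is treated as changed
    if "inode" in mode and cached.inode != current.inode:
        return False
    if "size" in mode and cached.size != current.size:
        return False
    if "ctime" in mode and cached.ctime_ns != current.ctime_ns:
        return False
    if "mtime" in mode and cached.mtime_ns != current.mtime_ns:
        return False
    return True
```

Leaving "inode" out of the mode makes the comparison tolerant of filesystems whose inode numbers are not stable.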
Thomas Waldmann 928bde8676 get rid of datetime.isoformat to avoid bugs like #2994 2017-09-07 14:11:07 +02:00
TW 95d267493e Merge pull request #2959 from ThomasWaldmann/fix-timestamp-option
borg create --timestamp: set start time, fixes #2957
2017-08-25 04:36:44 +02:00
Thomas Waldmann 8a299ae24c borg create --timestamp: set start time, fixes #2957 2017-08-24 04:07:37 +02:00
enkore 1ac49380b1 Merge pull request #2925 from enkore/issue/2376
Datetime formatting
2017-08-22 17:33:17 +02:00
Marian Beermann a836f451ab one datetime formatter to rule them all
# Conflicts:
#	src/borg/helpers.py
2017-08-22 17:32:21 +02:00
Simon Frei 9dc22d230f Refactor the diff functionality
This factors out a lot of the logic in do_diff in archiver.py to Archive in
archive.py and a new class ItemDiff in item.pyx. The idea is to move methods
to the classes that are affected and to make it reusable, primarily for a new
option to fuse (#2475).
2017-08-13 21:23:04 +02:00
Simon Frei 9f6df7d999 Only move and change indentation of code - NOT functional 2017-08-06 01:42:32 +02:00
Marian Beermann a88519d540 archive: delete unused Archive.list_archives 2017-07-29 19:37:37 +02:00
Marian Beermann c93dba0195 archive: create FilesystemObjectProcessors class 2017-07-29 19:37:37 +02:00
Thomas Waldmann 8752039bec integrate new crypto code 2017-07-27 23:33:15 +02:00