allow creating archives using stdout of given command
In addition to allowing:
some-command --param value | borg create REPO::ARCH -
also allow:
borg create --content-from-command create REPO::ARCH -- some-command --param value
The difference is that the latter approach deals with errors properly.
In the former example, an archive is created no matter what. Even, if
`some-command` aborts and the output is truncated, Borg won't realize.
In the latter example, the status code is checked and archive creation
is aborted properly when appropriate.
The locally defined preload() function overwrites the preload boolean keyword
argument, always evaluating to true, so preloading is done, even when not
requested by the caller, causing a memory leak.
Also move its definition outside of the loop.
This issue was found by Antonio Larrosa in borg issue #5202.
The code used for error reporting crashes due to an invalid utf-8
sequence. Use errors='replace' to never crash there. Errors
are expected in input data when borg check is run.
support platforms with no os.link, fixes#4901
if we don't have os.link, we just extract another copy instead of making a hardlink.
for that to work, we need to have (and keep) the chunks list in hardlink_masters.
if the file is not a regular file, but a hardlink slave with a not
extracted hardlink master, chunks will be None and we must not call
preload(chunks).
(cherry picked from commit 291d58efa1)
On windows os.open does not work for directories.
If borg tries to open an directory on windows, None is returned
as file descriptor. The archive and archiver where adjusted to
handle the case if a file descriptor is None.
on linux, acls are based on xattrs, so do these closeby:
1. listxattr -> keys (without acl related keys)
2. for all keys: getxattr
3. acl-related getxattr by acl library
for fd-based operations, we would have to open the file, but for
char / block devices this has unwanted effects, even if we do not
read from the device.
thus, we use path (or dir_fd + name) based ops here.
races via changing path components can be avoided by opening the
parent directory and using parent_fd + file_name combination with
*at style functions to access the directories' contents.
wrap msgpack to avoid future upstream api changes making troubles
or that we would have to globally spoil our code with extra params.
make sure the packing is always with use_bin_type=False,
thus generating "old" msgpack format (as borg always did) from
bytes objects.
make sure the unpacking is always with raw=True,
thus generating bytes objects.
note:
safe unicode encoding/decoding for some kinds of data types is done in Item
class (see item.pyx), so it is enough if we care for bytes objects on the
msgpack level.
also wrap exception handling, so borg code can catch msgpack specific
exceptions even if the upstream msgpack code raises way too generic
exceptions typed Exception, TypeError or ValueError.
We use own Exception classes for this, upstream classes are deprecated
the os level file handle is enough, the chunker will prefer it if
valid and won't use the file obj, so we can give None there.
this saves these unneeded syscalls:
fstat(5, {st_mode=S_IFREG|0664, st_size=227063, ...}) = 0
ioctl(5, TCGETS, 0x7ffd635635f0) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(5, 0, SEEK_CUR) = 0
this optimization is only needed for linux, the bsd-like platforms
do not need an open file to run a ioctl against, but have bsdflags
in the stat result already.
on linux, this optimization saves 1 file open/close per input file.
when processing regular files, use a fd to query xattrs.
when the file was modified and we chunked it, we have it open anyways.
if not, we open the file once and then query xattrs, in the hope that
this is more efficient than the path based calls.
guess it is less prone to race conditions in any case.