Use with caution: permanent data loss by specifying incorrect patterns
is easily possible. Make a dry run to make sure you got everything right.
borg recreate has many uses:
- Can selectively remove files/dirs from old archives, e.g. to free
space or purging picturarum biggus dickus from history
- Recompress data
- Rechunkify data, to have upgraded Attic / Borg 0.xx archives deduplicate
with Borg 1.x archives. (Or to experiment with chunker-params for
specific use cases
It is interrupt- and resumable.
Chunks are not freed on-the-fly.
Rationale:
Makes only sense when rechunkifying, but logic on which new chunks to
free what input chunks is complicated and *very* delicate.
Future TODOs:
- Refactor tests using py.test fixtures
-- would require porting ArchiverTestCase to py.test: many changes,
this changeset is already borderline too large.
- Possibly add a --target option to not replace the source archive
-- with the target possibly in another Repo
(better than "cp" due to full integrity checking, and deduplication
at the target)
- Detect and skip (unless --always-recompress) already recompressed chunks
Fixes#787#686#630#70 (and probably some I overlooked)
Also see #757 and #770
- Group options
- Nicer list of options in Sphinx
- Deduplicate 'Common options'
(including --help)
The latter is done by explicitly declaring --help in the common_parser,
which is then inherited by the sub-parsers; no change in observable
behaviour.
ubuntu was showing up twice in the list of supported OSes... it seems it was because the line was getting too long, so I removed the "names" and kept only the numbers to keep the line short.
- Archives now have a new metadata field 'comment'
- 'info' command shows a comment if it's present
- 'create' command now has option '--comment' for adding comments to archives.
- A new command 'comment' is added for modifying the comments on existing
archives.
Resolves#842.
it's not recommended to suppress warnings or errors,
but the user may decide this on his own.
note: --warning is not given to borg serve so a <= 1.0.0 borg
will still work as server. it is not needed as it is the default.
Previously, on 'borg diff', the output always had first the modifications, then
additions, and finally removals. Output may be easier to follow if the various
kinds of changes are interleaved. This commit is a simple solution that first
collects the output lines and sorts them by file path before printing. This new
behavior is optional and disabled by default. It can be enabled with '--sort'
command line option.
This option will be especially useful after the planned multi-threading changes
arrive. Multi-threading may shuffle the archive order of files making diff
output hard to follow without sorting.
Resolves#797.
Makes it easy to see paths excluded by --exclude* options for testing of
regexes, and for ongoing monitoring that files desired for backup aren't
getting excluded accidentally.
The main design goals of the new format:
- One file takes exactly one line of output
- The format is easy to read with typical, long list of changes
- Metadata changes are visible and easy to examine
- Unuseful information is not shown
Resolves#757.
- add archiver.main_mount()
- provide borgfs behaviour when the monolithic binary is called via a
symlink called borgfs
- docs: update usage of mount subcommand, provide examples for borgfs and
add symlink creation to standalone binary installation
- run build_usage
- add entry point in setup.py
- patch helpers.py:get_keys_dir() to allow mounting fstab entries with
"user" option set
Without this, setuid() called at some point by mount changes the HOME
environment variable to '/root' and os.expanduser('~') would return
'/root' as well, thus the mount would fail with
PermissionError: [Errno 13] Permission denied: '/root/.config'
After setuid(), the HOME variable stays intact, so we still can
explicitly query USER's home.
Also, os.path.expanduser() behaves differently for '~' and '~someuser'
as parameters: when called with an explicit username, the possibly set
environment variable HOME is no longer respected. So we have to check if
it is set and only expand the user's home directory if HOME is unset.
- add myself to AUTHORS
it is important to first do all the changes that modify the release contents,
then tag/sign it, then build the binaries via vagrant.
Only then the binaries will have the correct version number.
refactorings:
- introduced concept of default answer:
if the answer string is in the defaultish sequence, the return value of yes() will be the default.
e.g. if just pressing <enter> when asked on the console or if an empty string or "default" is
in the environment variable for overriding.
if an environment var has an invalid value and no retries are enabled: return default
if retries are enabled, next retry won't use the env var again, but either ask via input().
- simplify:
only one default - this should be a SAFE default as it is used in some special conditions
like EOF or invalid input with retries disallowed.
no isatty() magic, the "yes" shell command exists, so we could receive input even if it is not from a tty.
- clean:
separate retry flag from retry_msg
one of the biggest issues with borg < 1.0 was that it had a default target chunk
size of 64kiB, thus it created a lot of chunks, a huge chunk management overhead
(high RAM and disk usage).
The fnmatch module in Python's standard library implements a pattern
format for paths which is similar to shell patterns. However, “*”
matches any character including path separators. This newly introduced
pattern syntax with the selector “sh” no longer matches the path
separator with “*”. Instead “**/” can be used to match zero or more
directory levels.
the items metadata stream is usually not that big (compared to the file content data) -
it is just file and dir names and other metadata.
if we use too rough granularity there (and big minimum chunk size), we usually will get no deduplication.
The existing option to exclude files and directories, “--exclude”, is
implemented using fnmatch[1]. fnmatch matches the slash (“/”) with “*”
and thus makes it impossible to write patterns where a directory with
a given name should be excluded at a specific depth in the directory
hierarchy, but not anywhere else. Consider this structure:
home/
home/aaa
home/aaa/.thumbnails
home/user
home/user/img
home/user/img/.thumbnails
fnmatch incorrectly excludes “home/user/img/.thumbnails” with a pattern
of “home/*/.thumbnails” when the intention is to exclude “.thumbnails”
in all home directories while retaining directories with the same name
in all other locations.
With this change regular expressions are introduced as an additional
pattern syntax. The syntax is selected using a prefix on “--exclude”'s
value. “re:” is for regular expression and “fm:”, the default, selects
fnmatch. Selecting the syntax is necessary when regular expressions are
desired or when the desired fnmatch pattern starts with two alphanumeric
characters followed by a colon (i.e. “aa:something/*”). The exclusion
described above can be implemented as follows:
--exclude 're:^home/[^/]+/\.thumbnails$'
The “--exclude-from” option permits loading exclusions from a text file
where the same prefixes can now be used, e.g. “re:\.tmp$”.
The documentation has been extended and now not only describes the two
pattern styles, but also the file format supported by “--exclude-from”.
This change has been discussed in issue #43 and in change request #497.
[1] https://docs.python.org/3/library/fnmatch.html
Signed-off-by: Michael Hanselmann <public@hansmi.ch>
removed --log-level due to overlap with how --verbose works now.
for consistency, added --info as alias to --verbose (as the effect is
setting INFO log level).
also added --debug which sets DEBUG log level.
note: there are no messages emitted at DEBUG level yet.
WARNING is the default (because we want mostly silent behaviour,
except if something serious happens), so we don't need --warning
as an option.
the problem here was that we do not just have changed and unchanged items,
but also a lot of items besides regular files which we just back up "as is" without
determining whether they are changed or not. thus, we can't support changed/unchanged
in a way users would expect them to work.
the A/M/U status only applies to the data content of regular files (compared to the index).
for all items, we ALWAYS save the metadata, there is no changed / not changed detection there.
thus, I replaced this with a --filter option where you can just specify which
status chars you want to see listed in the output.
E.g. --filter AM will only show regular files with A(dded) or M(odified) state, but nothing else.
Not giving --filter defaults to showing all items no matter what status they have.
Output is emitted via logger at info level, so it won't show up except if the logger is at that level.
also: remove mailing list and irc channel address from development docs,
it is enough to have this information on the main page and on the support page.
the generation of those files was causing us way too much pain to
justify automatically generating them all the time.
those will have to be re-generated with `build_api` or `build_usage`
as appropriate, for example when function signatures or commandline
flags change.
see #384
this was making us require mock, which is really a test component and
shouldn't be part of the runtime dependencies. furthermore, it was
making the imports and the code more brittle: it may have been
possible that, through an environment variable, backups could be
corrupted because mock libraries would be configured instead of real
once, which is a risk we shouldn't be taking.
finally, this was used only to build docs, which we will build and
commit to git by hand with a fully working borg when relevant.
see #384.
Moved the list of dependencies to the corresponding subsection.
Collected all preparation steps under one heading.
Added link to the Arch Linux AUR package.
Install docs for OS X.
This is what attic used by default, but borgbackup defaults to "no compression".
I just adjusted the command invocation, so we can keep the example output
(which shows that stuff was compressed).
Also: add FAQ item about compression.
In one case removed the |project_name| and |git_url| variables to fix
the display of the code block. Shouldn't be problematic, as they are not
used consistently in this document anyway.
Put two notes in their own nice '.. note::' blocks.
this still doesn't quite work: our sidebar is gone, so no more useful
links and related projects. we also loose the link to github and the
RTD popup, although the latter still needs to be confirmed on RTD
infra
instead of applying this only to usage generation, use it as a generic
mechanism to disable loading of Cython code.
it may be incomplete: there may be other places where Cython code is
loaded that is not checked, but that is sufficient to build the usage
docs. the environment variable used is documented as such in the
docs/usage.rst.
we also move the check to a helper function and document it
better. this has the unfortunate side effect of moving includes
around, but I can't think of a better way.
this is an unfortunate rewrite of the manpage creation code mentionned
in #208. ideally, this would be rewritten into a class that can
generate both man pages and .rst files.
instead of a boring table of contents, try to show our more exciting README file
it's still a wall of text, but at least all the buzzwords and highlights are there
ideally, the table of contents would be in the sidebar, but i don't know how to do that
link to the locations of different tools when I know them. i marked
the ones I don't know about specially so we can document those as
well.
point to the Github releases for the standalone binaries upload
we stop supporting them, because there are better alternatives:
- use a distribution package (from your linux distribution), if available
- use a pyinstaller binary provided by us (they include all you need in 1 file and
thus have better compatibility properties and are easier to install than a wheel)
- install from source (pypi or git) if everything else fails
while SSH options can be specified through `~/.ssh/config`, some users
may want to use a completely different SSH command for their backups,
without overriding their $PATH variable. it may also be easier to do
ad-hoc configuration and tests that way.
plus, the POLA tells us that users expects something like this to be
supported by commands that talk to ssh. it is supported by rsync, git
and so on.
this is a crude hack for now, and could use a better table of contents
but at least we have some way of linking and showing the different
internal functions
the next phase here is obviously to document that API through the
addition of docstrings. a static api.rst could also be easier to read,
but maybe that could go through some docstrings as well, to be tested
right now, the update_usage script regenerates the usage files at
every call
by moving this into the makefile, we make those files be generated
only when the source file change, which makes testing docs much faster
- reduce redundancy (platforms are documented in README.rst)
- reformat to 80 chars width
- clarify checkpoints
- remove workarounds for stuff that was fixed
sets the default repository to use, e.g. like:
export BORG_REPO=/mnt/backup/repo
borg init
borg create ::archive
borg list
borg mount :: /mnt
fusermount -u /mnt
borg delete ::archive
outdated - it just showed different levels of zlib compression,
but not we additionally have "lzma", "lz4" and "none" compression.
the "usage" and "internals" docs give some hints about them, too.
found out why it could not install llfuse into virtual env: it always complained about
not being able to find fuse.pc - which is part of libfuse-dev / fuse-devel and was missing.
once one adds the fuse dev stuff, llfuse installs to virtual env without problems.
README.rst (shown on github and also at the start of the html docs) shall
be like an elevator speech - convince readers in a very short time.
this is most important, everything else can come after we got the reader's interest.
include README into docs to avoid duplication.
also include CHANGES into docs.
add developer docs, move examples from tox.ini there
add separate support docs
remove glossary, most of what was there can be understood by an admin from context
move attic and compatibility note to the end
faq about redundancy / integrity
compression is optional
having borg installed on backup server is optional (but faster)
cygwin installation tipps
do not document passphrase encryption mode example, use keyfile mode
it seems like there is currently no bureaucracy required, freenode web site says group registration is suspended.
i also asked on the freenode channel, they said just make sure you are right here and use it. so we do that now.
- use power-of-2 sizes / n bit hash mask so one can give them more easily
- chunker api: give seed first, so we can give *chunker_params after it
- fix some tests that aren't possible with 2^N
- make sparse file extraction zero detection flexible for variable chunk max size
UNIX domain sockets: explain why not, see attic issue #259
Symlinks: say that they are backed up as is and not followed, replacement for attic PR #294
Sparse files: explain what the "simple" in simple sparse file support means.
Plus some other explanations / mentions that were missing.
Thanks to @lfam (attic PR #277 )!
Note: As I already had refactored a lot of these pathes you changed, it was easier
to just cherry pick the hunks with the other changes and apply them manually.
it's pretty useless to have .borg as a directory extension, especially
since there is a README in there stating that this is a borg repo.
conistency:
"backup" is always used as relative backup repository path
"/mnt/backup" is always used as absolute repository path
use borg instead attic except at the places where it was used:
- as toplevel package name, directory name, file name
- to refer to original attic
remove sphinx upload make command, will be replaced by github.io site later
remove references to binary downloads and linux packages for now
remove some software name references, fix grammar
use borgbackup rather than borg-backup (or borg) in URLs,
less name collision issues, better search results, no validity issues with "-"
this should address #27, #28 and #29 at least at a basic level
it is mostly based on the mailing list discussion mentionned in #27,
with some reformatting and merging of different posts.
this is still incomplete as it only describes key files, but doesn't
clearly say how chunks are encrypted or decrypted.
this address parts of #29 but eventually that document should also
cover #27, #28 and maybe #45
this is written with recent Ubuntu and Debian in mind, but should be
working everywhere. the idea here is to make sure anyone can install
this without knowning too much about ACLs or anything similar.
closes#135
it seems that hardlinks are supported, but were not explicitely documented in the documentation. the FAQ seems like the right place to do this. closes#133.