merged master

Thomas Waldmann 2015-08-09 23:51:46 +02:00
commit 8af3aa3397
32 changed files with 600 additions and 349 deletions

.coveragerc Normal file

@@ -0,0 +1,17 @@
[run]
branch = True
source = borg
omit =
borg/__init__.py
borg/__main__.py
borg/_version.py
[report]
exclude_lines =
pragma: no cover
def __repr__
raise AssertionError
raise NotImplementedError
if 0:
if __name__ == .__main__.:
ignore_errors = True

.gitignore vendored

@@ -21,3 +21,4 @@ docs/usage/*.inc
borg.build/
borg.dist/
borg.exe
.coverage


@@ -1,15 +1,47 @@
sudo: required
language: python
python:
- "3.2"
- "3.3"
- "3.4"
# command to install dependencies
cache:
directories:
- $HOME/.cache/pip
matrix:
include:
- python: 3.2
os: linux
env: TOXENV=py32
- python: 3.3
os: linux
env: TOXENV=py33
- python: 3.4
os: linux
env: TOXENV=py34
- language: generic
os: osx
osx_image: xcode6.4
env: TOXENV=py32
- language: generic
os: osx
osx_image: xcode6.4
env: TOXENV=py33
- language: generic
os: osx
osx_image: xcode6.4
env: TOXENV=py34
install:
- "sudo add-apt-repository -y ppa:gezakovacs/lz4"
- "sudo apt-get update"
- "sudo apt-get install -y liblz4-dev"
- "sudo apt-get install -y libacl1-dev"
- "pip install --use-mirrors Cython"
- "pip install -e ."
# command to run tests
script: fakeroot -u py.test
- ./.travis/install.sh
script:
- ./.travis/run.sh
after_success:
- ./.travis/upload_coverage.sh
notifications:
irc:
channels:
- "irc.freenode.org#borgbackup"
use_notice: true
skip_join: true

.travis/install.sh Executable file

@@ -0,0 +1,46 @@
#!/bin/bash
set -e
set -x
if [[ "$(uname -s)" == 'Darwin' ]]; then
brew update || brew update
if [[ "${OPENSSL}" != "0.9.8" ]]; then
brew outdated openssl || brew upgrade openssl
fi
if which pyenv > /dev/null; then
eval "$(pyenv init -)"
fi
brew outdated pyenv || brew upgrade pyenv
case "${TOXENV}" in
py32)
pyenv install 3.2.6
pyenv global 3.2.6
;;
py33)
pyenv install 3.3.6
pyenv global 3.3.6
;;
py34)
pyenv install 3.4.3
pyenv global 3.4.3
;;
esac
pyenv rehash
python -m pip install --user virtualenv
else
pip install virtualenv
sudo add-apt-repository -y ppa:gezakovacs/lz4
sudo apt-get update
sudo apt-get install -y liblz4-dev
sudo apt-get install -y libacl1-dev
fi
python -m virtualenv ~/.venv
source ~/.venv/bin/activate
pip install tox pytest pytest-cov codecov Cython
pip install -e .

.travis/run.sh Executable file

@@ -0,0 +1,23 @@
#!/bin/bash
set -e
set -x
if [[ "$(uname -s)" == "Darwin" ]]; then
eval "$(pyenv init -)"
if [[ "${OPENSSL}" != "0.9.8" ]]; then
# set our flags to use homebrew openssl
export ARCHFLAGS="-arch x86_64"
export LDFLAGS="-L/usr/local/opt/openssl/lib"
export CFLAGS="-I/usr/local/opt/openssl/include"
fi
fi
source ~/.venv/bin/activate
if [[ "$(uname -s)" == "Darwin" ]]; then
# no fakeroot on OS X
sudo tox -e $TOXENV
else
fakeroot -u tox
fi

.travis/upload_coverage.sh Executable file

@@ -0,0 +1,13 @@
#!/bin/bash
set -e
set -x
NO_COVERAGE_TOXENVS=(pep8)
if ! [[ "${NO_COVERAGE_TOXENVS[*]}" =~ "${TOXENV}" ]]; then
source ~/.venv/bin/activate
ln .tox/.coverage .coverage
# on osx, tests run as root, need access to .coverage
sudo chmod 666 .coverage
codecov -e TRAVIS_OS_NAME TOXENV
fi


@@ -5,6 +5,22 @@ Borg Changelog
Version 0.24.0
--------------
Incompatible changes (compared to 0.23):
- borg now always passes the --umask NNN option when invoking another borg via
ssh on the repository server. This makes sure it uses the same umask for
remote repos as for local ones. Because of this, you must upgrade both
server and client(s) to 0.24.
- the default umask is 077 now (if you do not specify one via --umask), which
might differ from the one you used previously. The default umask prevents
you from accidentally giving group and/or others access permissions to files
created by borg (e.g. the repository).
Deprecations:
- "--encryption passphrase" mode is deprecated, see #85 and #97.
See the new "--encryption repokey" mode for a replacement.
New features:
- borg create --chunker-params ... to configure the chunker, fixes #16
@@ -17,12 +33,21 @@ New features:
- borg create --compression 0..9 to select zlib compression level, fixes #66
(attic #295).
- borg init --encryption repokey (to store the encryption key into the repo),
deprecate --encryption passphrase, fixes #85
fixes #85
- improve at-end error logging, always log exceptions and set exit_code=1
- LoggedIO: better error checks / exceptions / exception handling
- implement --remote-path to allow non-default-path borg locations, #125
- implement --umask M and use 077 as default umask for better security, #117
- borg check: give a named single archive to it, fixes #139
- cache sync: show progress indication
- cache sync: reimplement the chunk index merging in C
Bug fixes:
- fix segfault that happened for unreadable files (chunker: n needs to be a
signed size_t), #116
- fix the repair mode, #144
- repo delete: add destroy to allowed rpc methods, fixes issue #114
- more compatible repository locking code (based on mkdir), maybe fixes #92
(attic #317, attic #201).
- better Exception msg if no Borg is installed on the remote repo server, #56
@@ -30,10 +55,12 @@ Bug fixes:
fixes attic #326.
- fix Traceback when running check --repair, attic #232
- clarify help text, fixes #73.
- add help string for --no-files-cache, fixes #140
Other changes:
- improved docs:
- added docs/misc directory for misc. writeups that won't be included
"as is" into the html docs.
- document environment variables and return codes (attic #324, attic #52)
@@ -44,14 +71,25 @@ Other changes:
- add FAQ entries about redundancy / integrity
- clarify that borg extract uses the cwd as extraction target
- update internals doc about chunker params, memory usage and compression
- added docs about development
- add some words about resource usage in general
- document how to backup a raw disk
- add note about how to run borg from virtual env
- add solutions for (ll)fuse installation problems
- document what borg check does, fixes #138
- reorganize borgbackup.github.io sidebar, prev/next at top
- deduplicate and refactor the docs / README.rst
- use borg-tmp as prefix for temporary files / directories
- short prune options without "keep-" are deprecated, do not suggest them
- improved tox configuration, documented there how to invoke it
- improved tox configuration
- remove usage of unittest.mock, always use mock from pypi
- use entrypoints instead of scripts, for better use of the wheel format and
modern installs
- add requirements.d/development.txt and modify tox.ini
- use travis-ci for testing based on Linux and (new) OS X
- use coverage.py, pytest-cov and codecov.io for test coverage support
I forgot to list some stuff already implemented in 0.23.0, here they are:
New features:


@@ -1,4 +1,4 @@
include README.rst AUTHORS LICENSE CHANGES MANIFEST.in versioneer.py
include README.rst AUTHORS LICENSE CHANGES.rst MANIFEST.in versioneer.py
recursive-include borg *.pyx
recursive-include docs *
recursive-exclude docs *.pyc


@@ -1,13 +1,111 @@
|build|
What is BorgBackup?
-------------------
BorgBackup (short: Borg) is a deduplicating backup program.
Optionally, it supports compression and authenticated encryption.
What is Borg?
-------------
Borg is a deduplicating backup program. The main goal of Borg is to provide
an efficient and secure way to backup data. The data deduplication
technique used makes Borg suitable for daily backups since only changes
are stored.
The main goal of Borg is to provide an efficient and secure way to back up data.
The data deduplication technique used makes Borg suitable for daily backups
since only changes are stored.
The authenticated encryption technique makes it suitable for backups to not
fully trusted targets.
Borg is a fork of `Attic <https://github.com/jborg/attic>`_ and maintained by "`The Borg Collective <https://github.com/borgbackup/borg/blob/master/AUTHORS>`_".
`Borg Installation docs <http://borgbackup.github.io/borgbackup/installation.html>`_
Main features
~~~~~~~~~~~~~
**Space efficient storage**
Deduplication based on content-defined chunking is used to reduce the number
of bytes stored: each file is split into a number of variable length chunks
and only chunks that have never been seen before are added to the repository.
To deduplicate, all the chunks in the same repository are considered, no
matter whether they come from different machines, from previous backups,
from the same backup or even from the same single file.
Compared to other deduplication approaches, this method does NOT depend on:
* file/directory names staying the same
So you can move your stuff around without killing the deduplication,
even between machines sharing a repo.
* complete files or time stamps staying the same
If a big file changes a little, only a few new chunks will be stored -
this is great for VMs or raw disks.
* the absolute position of a data chunk inside a file
Stuff may get shifted and will still be found by the deduplication
algorithm.
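The chunking and deduplication behaviour described above can be sketched with a toy content-defined chunker (hypothetical helper names for illustration only; Borg's real chunker is a buzhash-based C implementation):

```python
import hashlib

def chunk_boundaries(data, window=4, mask=0x0F):
    # Toy content-defined chunker: cut wherever a hash of the last
    # `window` bytes matches a bit pattern, so cut points depend on
    # content, not on absolute file offsets.
    chunks, start = [], 0
    for i in range(window, len(data)):
        h = int.from_bytes(hashlib.sha256(data[i - window:i]).digest()[:2], 'big')
        if h & mask == 0:
            chunks.append(data[start:i])
            start = i
    chunks.append(data[start:])
    return chunks

def dedup_store(chunks, store):
    # Store chunks keyed by their hash; only never-seen chunks are added.
    new = 0
    for chunk in chunks:
        chunk_id = hashlib.sha256(chunk).hexdigest()
        if chunk_id not in store:
            store[chunk_id] = chunk
            new += 1
    return new

store = {}
data = bytes(range(256)) * 8
dedup_store(chunk_boundaries(data), store)
# Prepending a byte shifts every absolute offset, but the content-defined
# cut points re-align, so almost all chunks still deduplicate.
added = dedup_store(chunk_boundaries(b'X' + data), store)
```

Because boundaries are derived from a sliding window over the content itself, shifting the data only changes the chunk(s) before the first re-aligned boundary.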
**Speed**
* performance critical code (chunking, compression, encryption) is
implemented in C/Cython
* local caching of files/chunks index data
* quick detection of unmodified files
**Data encryption**
All data can be protected using 256-bit AES encryption, data integrity and
authenticity is verified using HMAC-SHA256.
**Compression**
All data can be compressed by zlib, level 0-9.
**Off-site backups**
Borg can store data on any remote host accessible over SSH. If Borg is
installed on the remote host, big performance gains can be achieved
compared to using a network filesystem (sshfs, nfs, ...).
**Backups mountable as filesystems**
Backup archives are mountable as userspace filesystems for easy interactive
backup examination and restores (e.g. by using a regular file manager).
**Platforms Borg works on**
* Linux
* FreeBSD
* Mac OS X
* Cygwin (unsupported)
**Free and Open Source Software**
* security and functionality can be audited independently
* licensed under the BSD (3-clause) license
Easy to use
~~~~~~~~~~~
Initialize a new backup repository and create a backup archive::
$ borg init /mnt/backup
$ borg create /mnt/backup::Monday ~/Documents
Now doing another backup, just to show off the great deduplication::
$ borg create --stats /mnt/backup::Tuesday ~/Documents
Archive name: Tuesday
Archive fingerprint: 387a5e3f9b0e792e91c...
Start time: Tue Mar 25 12:00:10 2014
End time: Tue Mar 25 12:00:10 2014
Duration: 0.08 seconds
Number of files: 358
                       Original size      Compressed size    Deduplicated size
This archive:              57.16 MB             46.78 MB            151.67 kB  <--- !
All archives:             114.02 MB             93.46 MB             44.81 MB
For a graphical frontend refer to our complementary project
`BorgWeb <https://github.com/borgbackup/borgweb>`_.
Notes
-----
Borg is a fork of `Attic <https://github.com/jborg/attic>`_ and maintained by
"`The Borg Collective <https://github.com/borgbackup/borg/blob/master/AUTHORS>`_".
Read `issue #1 <https://github.com/borgbackup/borg/issues/1>`_ about the initial
considerations regarding project goals and policy of the Borg project.
BORG IS NOT COMPATIBLE WITH ORIGINAL ATTIC.
EXPECT THAT WE WILL BREAK COMPATIBILITY REPEATEDLY WHEN MAJOR RELEASE NUMBER
@@ -17,80 +115,15 @@ NOT RELEASED DEVELOPMENT VERSIONS HAVE UNKNOWN COMPATIBILITY PROPERTIES.
THIS IS SOFTWARE IN DEVELOPMENT, DECIDE YOURSELF WHETHER IT FITS YOUR NEEDS.
Read `issue #1 <https://github.com/borgbackup/borg/issues/1>`_ on the issue tracker, goals are being defined there.
For more information, please also see the
`LICENSE <https://github.com/borgbackup/borg/blob/master/LICENSE>`_.
Please also see the `LICENSE <https://github.com/borgbackup/borg/blob/master/LICENSE>`_ for more information.
Easy to use
~~~~~~~~~~~
Initialize backup repository and create a backup archive::
$ borg init /mnt/backup
$ borg create -v /mnt/backup::documents ~/Documents
For a graphical frontend refer to our complementary project `BorgWeb <https://github.com/borgbackup/borgweb>`_.
Main features
~~~~~~~~~~~~~
Space efficient storage
Variable block size deduplication is used to reduce the number of bytes
stored by detecting redundant data. Each file is split into a number of
variable length chunks and only chunks that have never been seen before are
compressed and added to the repository.
The content-defined chunking based deduplication is applied to remove
duplicate chunks within:
* the current backup data set (even inside single files / streams)
* current and previous backups of same machine
* all the chunks in the same repository, even if coming from other machines
This advanced deduplication method does NOT depend on:
* file/directory names staying the same (so you can move your stuff around
without killing the deduplication, even between machines sharing a repo)
* complete files or time stamps staying the same (if a big file changes a
little, only a few new chunks will be stored - this is great for VMs or
raw disks)
* the absolute position of a data chunk inside a file (stuff may get shifted
and will still be found by the deduplication algorithm)
Optional data encryption
All data can be protected using 256-bit AES encryption and data integrity
and authenticity is verified using HMAC-SHA256.
Off-site backups
Borg can store data on any remote host accessible over SSH. This is
most efficient if Borg is also installed on the remote host.
Backups mountable as filesystems
Backup archives are mountable as userspace filesystems for easy backup
verification and restores.
What do I need?
---------------
Borg requires Python 3.2 or above to work.
Borg also requires a sufficiently recent OpenSSL (>= 1.0.0).
In order to mount archives as filesystems, llfuse is required.
How do I install it?
--------------------
::
$ pip3 install borgbackup
Where are the docs?
-------------------
Go to https://borgbackup.github.io/ for a prebuilt version of the documentation.
You can also build it yourself from the docs folder.
Where are the tests?
--------------------
The tests are in the borg/testsuite package. To run the test suite use the
following command::
$ fakeroot -u tox # you need to have tox and pytest installed
|build| |coverage|
.. |build| image:: https://travis-ci.org/borgbackup/borg.svg
:alt: Build Status
:target: https://travis-ci.org/borgbackup/borg
.. |coverage| image:: http://codecov.io/github/borgbackup/borg/coverage.svg?branch=master
:alt: Test Coverage
:target: http://codecov.io/github/borgbackup/borg?branch=master


@@ -385,3 +385,22 @@ hashindex_summarize(HashIndex *index, long long *total_size, long long *total_cs
*total_unique_chunks = unique_chunks;
*total_chunks = chunks;
}
static void
hashindex_merge(HashIndex *index, HashIndex *other)
{
    int32_t key_size = index->key_size;
    const int32_t *other_values;
    int32_t *my_values;
    void *key = NULL;
    while((key = hashindex_next_key(other, key))) {
        other_values = key + key_size;
        my_values = (int32_t *)hashindex_get(index, key);
        if(my_values == NULL) {
            hashindex_set(index, key, other_values);
        } else {
            *my_values += *other_values;
        }
    }
}
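The merge semantics of the new C function can be modeled in a few lines of Python (a sketch, not Borg's API): for every key in `other`, an unseen entry is copied over, while for an existing entry only the first value, the reference count, is added in place, matching `*my_values += *other_values` above.

```python
def merge_chunk_indexes(mine, other):
    # Model of hashindex_merge(): entries are (refcount, size, csize)
    # tuples keyed by chunk id. Only the refcount is summed; size and
    # csize of an already-present entry are kept as-is.
    for key, (count, size, csize) in other.items():
        if key in mine:
            old_count, old_size, old_csize = mine[key]
            mine[key] = (old_count + count, old_size, old_csize)
        else:
            mine[key] = (count, size, csize)

idx1 = {b'a': (1, 100, 100), b'b': (2, 200, 200)}
idx2 = {b'a': (4, 100, 100), b'c': (6, 400, 400)}
merge_chunk_indexes(idx1, idx2)
# idx1[b'a'] now carries refcount 5; b'c' was copied over unchanged
```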


@@ -609,8 +609,9 @@ class ArchiveChecker:
self.error_found = False
self.possibly_superseded = set()
def check(self, repository, repair=False, last=None):
def check(self, repository, repair=False, archive=None, last=None):
self.report_progress('Starting archive consistency check...')
self.check_all = archive is None and last is None
self.repair = repair
self.repository = repository
self.init_chunks()
@@ -619,11 +620,9 @@
self.manifest = self.rebuild_manifest()
else:
self.manifest, _ = Manifest.load(repository, key=self.key)
self.rebuild_refcounts(last=last)
if last is None:
self.verify_chunks()
else:
self.report_progress('Orphaned objects check skipped (needs all archives checked)')
self.rebuild_refcounts(archive=archive, last=last)
self.orphan_chunks_check()
self.finish()
if not self.error_found:
self.report_progress('Archive consistency check complete, no problems found.')
return self.repair or not self.error_found
@@ -631,7 +630,7 @@
def init_chunks(self):
"""Fetch a list of all object keys from repository
"""
# Explicity set the initial hash table capacity to avoid performance issues
# Explicitly set the initial hash table capacity to avoid performance issues
# due to hash table "resonance"
capacity = int(len(self.repository) * 1.2)
self.chunks = ChunkIndex(capacity)
@@ -680,7 +679,7 @@
self.report_progress('Manifest rebuild complete', error=True)
return manifest
def rebuild_refcounts(self, last=None):
def rebuild_refcounts(self, archive=None, last=None):
"""Rebuild object reference counts by walking the metadata
Missing and/or incorrect data is repaired when detected
@@ -762,10 +761,17 @@
yield item
repository = cache_if_remote(self.repository)
num_archives = len(self.manifest.archives)
archive_items = sorted(self.manifest.archives.items(), reverse=True,
key=lambda name_info: name_info[1][b'time'])
end = None if last is None else min(num_archives, last)
if archive is None:
# we need last N or all archives
archive_items = sorted(self.manifest.archives.items(), reverse=True,
key=lambda name_info: name_info[1][b'time'])
num_archives = len(self.manifest.archives)
end = None if last is None else min(num_archives, last)
else:
# we only want one specific archive
archive_items = [item for item in self.manifest.archives.items() if item[0] == archive]
num_archives = 1
end = 1
for i, (name, info) in enumerate(archive_items[:end]):
self.report_progress('Analyzing archive {} ({}/{})'.format(name, num_archives - i, num_archives))
archive_id = info[b'id']
@@ -796,16 +802,22 @@
add_reference(new_archive_id, len(data), len(cdata), cdata)
info[b'id'] = new_archive_id
def verify_chunks(self):
unused = set()
for id_, (count, size, csize) in self.chunks.iteritems():
if count == 0:
unused.add(id_)
orphaned = unused - self.possibly_superseded
if orphaned:
self.report_progress('{} orphaned objects found'.format(len(orphaned)), error=True)
def orphan_chunks_check(self):
if self.check_all:
unused = set()
for id_, (count, size, csize) in self.chunks.iteritems():
if count == 0:
unused.add(id_)
orphaned = unused - self.possibly_superseded
if orphaned:
self.report_progress('{} orphaned objects found'.format(len(orphaned)), error=True)
if self.repair:
for id_ in unused:
self.repository.delete(id_)
else:
self.report_progress('Orphaned objects check skipped (needs all archives checked)')
def finish(self):
if self.repair:
for id_ in unused:
self.repository.delete(id_)
self.manifest.write()
self.repository.commit()
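The orphan check moved into `orphan_chunks_check` above boils down to simple set arithmetic; as a standalone sketch (plain dicts and sets standing in for `ChunkIndex`):

```python
def find_orphans(chunks, possibly_superseded):
    # chunks maps chunk id -> (refcount, size, csize); ids that ended up
    # with refcount 0 and are not possibly superseded are orphaned.
    unused = {cid for cid, (count, size, csize) in chunks.items() if count == 0}
    return unused - possibly_superseded

chunks = {b'a': (2, 10, 10), b'b': (0, 20, 20), b'c': (0, 30, 30)}
orphaned = find_orphans(chunks, possibly_superseded={b'c'})
# only b'b' is orphaned: b'c' is unused but possibly superseded
```

This is also why the check is only meaningful when all archives were walked: otherwise refcounts of 0 may simply mean "referenced by an archive we did not analyze".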


@@ -86,8 +86,9 @@ Type "Yes I am sure" if you understand this and want to continue.\n""")
print('Repository check complete, no problems found.')
else:
return 1
if not args.repo_only and not ArchiveChecker().check(repository, repair=args.repair, last=args.last):
return 1
if not args.repo_only and not ArchiveChecker().check(
repository, repair=args.repair, archive=args.repository.archive, last=args.last):
return 1
return 0
def do_change_passphrase(self, args):
@@ -223,7 +224,6 @@ Type "Yes I am sure" if you understand this and want to continue.\n""")
# be restrictive when restoring files, restore permissions later
if sys.getfilesystemencoding() == 'ascii':
print('Warning: File system encoding is "ascii", extracting non-ascii filenames will not be supported.')
os.umask(0o077)
repository = self.open_repository(args.archive)
manifest, key = Manifest.load(repository)
archive = Archive(repository, key, manifest, args.archive.archive,
@@ -513,7 +513,12 @@ Type "Yes I am sure" if you understand this and want to continue.\n""")
common_parser.add_argument('-v', '--verbose', dest='verbose', action='store_true',
default=False,
help='verbose output')
common_parser.add_argument('--no-files-cache', dest='cache_files', action='store_false')
common_parser.add_argument('--no-files-cache', dest='cache_files', action='store_false',
help='do not load/update the file metadata cache used to detect unchanged files')
common_parser.add_argument('--umask', dest='umask', type=lambda s: int(s, 8), default=0o077, metavar='M',
help='set umask to M (local and remote, default: 0o077)')
common_parser.add_argument('--remote-path', dest='remote_path', default='borg', metavar='PATH',
help='set remote path to executable (default: "borg")')
# We can't use argparse for "serve" since we don't want it to show up in "Available commands"
if args:
@@ -535,6 +540,8 @@ Type "Yes I am sure" if you understand this and want to continue.\n""")
This command initializes an empty repository. A repository is a filesystem
directory containing the deduplicated data from zero or more archives.
Encryption can be enabled at repository init time.
Please note that the 'passphrase' encryption mode is DEPRECATED (consider
using 'repokey' instead).
""")
subparser = subparsers.add_parser('init', parents=[common_parser],
description=self.do_init.__doc__, epilog=init_epilog,
@@ -544,27 +551,51 @@ Type "Yes I am sure" if you understand this and want to continue.\n""")
type=location_validator(archive=False),
help='repository to create')
subparser.add_argument('-e', '--encryption', dest='encryption',
choices=('none', 'passphrase', 'keyfile', 'repokey'), default='none',
help='select encryption method')
choices=('none', 'keyfile', 'repokey', 'passphrase'), default='none',
help='select encryption key mode')
check_epilog = textwrap.dedent("""
The check command verifies the consistency of a repository and the corresponding
archives. The underlying repository data files are first checked to detect bit rot
and other types of damage. After that the consistency and correctness of the archive
metadata is verified.
The check command verifies the consistency of a repository and the corresponding archives.
The archive metadata checks can be time consuming and require access to the key
file and/or passphrase if encryption is enabled. These checks can be skipped using
the --repository-only option.
First, the underlying repository data files are checked:
- For all segments the segment magic (header) is checked
- For all objects stored in the segments, all metadata (e.g. crc and size) and
all data is read. The read data is checked by size and CRC. Bit rot and other
types of accidental damage can be detected this way.
- If we are in repair mode and an integrity error is detected for a segment,
we try to recover as many objects from the segment as possible.
- In repair mode, it makes sure that the index is consistent with the data
stored in the segments.
- If you use a remote repo server via ssh:, the repo check is executed on the
repo server without causing significant network traffic.
- The repository check can be skipped using the --archives-only option.
Second, the consistency and correctness of the archive metadata is verified:
- Is the repo manifest present? If not, it is rebuilt from archive metadata
chunks (this requires reading and decrypting of all metadata and data).
- Check if the archive metadata chunk is present; if not, remove the archive
from the manifest.
- For all files (items) in the archive, for all chunks referenced by these
files, check if chunk is present (if not and we are in repair mode, replace
it with a same-size chunk of zeros). This requires reading of archive and
file metadata, but not data.
- If we are in repair mode and we checked all the archives: delete orphaned
chunks from the repo.
- If you use a remote repo server via ssh:, the archive check is executed on
the client machine (because if encryption is enabled, the checks will require
decryption and this is always done client-side, because key access will be
required).
- The archive checks can be time consuming, they can be skipped using the
--repository-only option.
""")
subparser = subparsers.add_parser('check', parents=[common_parser],
description=self.do_check.__doc__,
epilog=check_epilog,
formatter_class=argparse.RawDescriptionHelpFormatter)
subparser.set_defaults(func=self.do_check)
subparser.add_argument('repository', metavar='REPOSITORY',
type=location_validator(archive=False),
help='repository to check consistency of')
subparser.add_argument('repository', metavar='REPOSITORY_OR_ARCHIVE',
type=location_validator(),
help='repository or archive to check consistency of')
subparser.add_argument('--repository-only', dest='repo_only', action='store_true',
default=False,
help='only perform repository checks')
@@ -833,6 +864,9 @@ Type "Yes I am sure" if you understand this and want to continue.\n""")
args = parser.parse_args(args or ['-h'])
self.verbose = args.verbose
os.umask(args.umask)
RemoteRepository.remote_path = args.remote_path
RemoteRepository.umask = args.umask
update_excludes(args)
return args.func(args)
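The `--umask` option added above parses its argument with `int(s, 8)`, i.e. as an octal string. A minimal sketch of what that means (assuming the same argparse shape as `common_parser`):

```python
import argparse

parser = argparse.ArgumentParser()
# same shape as the common_parser option: octal string -> int, default 0o077
parser.add_argument('--umask', type=lambda s: int(s, 8), default=0o077, metavar='M')

args = parser.parse_args(['--umask', '027'])
# '027' is parsed base-8: 0o027 == 23, not twenty-seven.
# The default 0o077 masks out all group/other permission bits, so a
# directory requested as mode 0o777 is actually created 0o700.
```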


@@ -306,11 +306,14 @@ class Cache:
chunk_idx.clear()
for tarinfo in tf_in:
archive_id_hex = tarinfo.name
archive_name = tarinfo.pax_headers['archive_name']
print("- extracting archive %s ..." % archive_name)
tf_in.extract(archive_id_hex, tmp_dir)
chunk_idx_path = os.path.join(tmp_dir, archive_id_hex).encode('utf-8')
print("- reading archive ...")
archive_chunk_idx = ChunkIndex.read(chunk_idx_path)
for chunk_id, (count, size, csize) in archive_chunk_idx.iteritems():
add(chunk_idx, chunk_id, size, csize, incr=count)
print("- merging archive ...")
chunk_idx.merge(archive_chunk_idx)
os.unlink(chunk_idx_path)
self.begin_txn()


@@ -14,6 +14,7 @@ cdef extern from "_hashindex.c":
void hashindex_summarize(HashIndex *index, long long *total_size, long long *total_csize,
long long *unique_size, long long *unique_csize,
long long *total_unique_chunks, long long *total_chunks)
void hashindex_merge(HashIndex *index, HashIndex *other)
int hashindex_get_size(HashIndex *index)
int hashindex_write(HashIndex *index, char *path)
void *hashindex_get(HashIndex *index, void *key)
@@ -24,15 +25,18 @@ cdef extern from "_hashindex.c":
int _le32toh(int v)
_NoDefault = object()
cdef _NoDefault = object()
cimport cython
@cython.internal
cdef class IndexBase:
cdef HashIndex *index
key_size = 32
def __cinit__(self, capacity=0, path=None):
if path:
self.index = hashindex_read(<bytes>os.fsencode(path))
self.index = hashindex_read(os.fsencode(path))
if not self.index:
raise Exception('hashindex_read failed')
else:
@@ -49,7 +53,7 @@ cdef class IndexBase:
return cls(path=path)
def write(self, path):
if not hashindex_write(self.index, <bytes>os.fsencode(path)):
if not hashindex_write(self.index, os.fsencode(path)):
raise Exception('hashindex_write failed')
def clear(self):
@@ -187,6 +191,9 @@ cdef class ChunkIndex(IndexBase):
&total_unique_chunks, &total_chunks)
return total_size, total_csize, unique_size, unique_csize, total_unique_chunks, total_chunks
def merge(self, ChunkIndex other):
hashindex_merge(self.index, other.index)
cdef class ChunkKeyIterator:
cdef ChunkIndex idx


@@ -108,9 +108,10 @@ class RepositoryServer:
class RemoteRepository:
extra_test_args = []
remote_path = None
umask = None
class RPCError(Exception):
def __init__(self, name):
self.name = name
@@ -124,8 +125,10 @@ class RemoteRepository:
self.responses = {}
self.unpacker = msgpack.Unpacker(use_list=False)
self.p = None
# use local umask also for the remote process
umask = ['--umask', '%03o' % self.umask]
if location.host == '__testsuite__':
args = [sys.executable, '-m', 'borg.archiver', 'serve'] + self.extra_test_args
args = [sys.executable, '-m', 'borg.archiver', 'serve'] + umask + self.extra_test_args
else:
args = ['ssh']
if location.port:
@@ -134,7 +137,7 @@
args.append('%s@%s' % (location.user, location.host))
else:
args.append('%s' % location.host)
args += ['borg', 'serve']
args += [self.remote_path, 'serve'] + umask
self.p = Popen(args, bufsize=0, stdin=PIPE, stdout=PIPE)
self.stdin_fd = self.p.stdin.fileno()
self.stdout_fd = self.p.stdout.fileno()
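The argv construction above can be sketched as a pure function (hypothetical helper name; the real logic lives inline in `RemoteRepository.__init__`): the client renders its own umask in octal and appends it so the remote `borg serve` process uses the same one.

```python
def build_serve_command(remote_path, umask, host, user=None, port=None):
    # '%03o' renders e.g. 0o077 as '077'; the remote process then gets
    # the same umask as the local one via '--umask'.
    umask_args = ['--umask', '%03o' % umask]
    args = ['ssh']
    if port is not None:
        args += ['-p', str(port)]
    args.append('%s@%s' % (user, host) if user else host)
    return args + [remote_path, 'serve'] + umask_args

cmd = build_serve_command('borg', 0o077, 'backup.example.org', user='bb')
# -> ['ssh', 'bb@backup.example.org', 'borg', 'serve', '--umask', '077']
```

Making `remote_path` a parameter mirrors the new `--remote-path` option: the hard-coded `'borg'` in the old code is exactly what it replaces.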


@@ -73,7 +73,7 @@ class BaseTestCase(unittest.TestCase):
d1 = [filename] + [getattr(s1, a) for a in attrs]
d2 = [filename] + [getattr(s2, a) for a in attrs]
if not os.path.islink(path1) or utime_supports_fd:
# Older versions of llfuse does not support ns precision properly
# Older versions of llfuse do not support ns precision properly
if fuse and not have_fuse_mtime_ns:
d1.append(round(st_mtime_ns(s1), -4))
d2.append(round(st_mtime_ns(s2), -4))
@@ -94,28 +94,3 @@
return
time.sleep(.1)
raise Exception('wait_for_mount(%s) timeout' % path)
def get_tests(suite):
"""Generates a sequence of tests from a test suite
"""
for item in suite:
try:
# TODO: This could be "yield from..." with Python 3.3+
for i in get_tests(item):
yield i
except TypeError:
yield item
class TestLoader(unittest.TestLoader):
"""A customized test loader that properly detects and filters our test cases
"""
def loadTestsFromName(self, pattern, module=None):
suite = self.discover('borg.testsuite', '*.py')
tests = unittest.TestSuite()
for test in get_tests(suite):
if pattern.lower() in test.id().lower():
tests.addTest(test)
return tests


@@ -1,12 +1,12 @@
from datetime import datetime, timezone
import msgpack
from mock import Mock
from ..archive import Archive, CacheChunkBuffer, RobustUnpacker
from ..key import PlaintextKey
from ..helpers import Manifest
from . import BaseTestCase
from .mock import Mock
class MockCache:


@@ -11,6 +11,8 @@ import time
import unittest
from hashlib import sha256
from mock import patch
from .. import xattr
from ..archive import Archive, ChunkBuffer, CHUNK_MAX_EXP
from ..archiver import Archiver
@@ -20,7 +22,6 @@ from ..helpers import Manifest
from ..remote import RemoteRepository, PathNotAllowed
from ..repository import Repository
from . import BaseTestCase
from .mock import patch
try:
import llfuse
@@ -243,6 +244,19 @@ class ArchiverTestCase(ArchiverTestCaseBase):
if sparse_support and hasattr(st, 'st_blocks'):
self.assert_true(st.st_blocks * 512 < total_len / 10) # is output sparse?
def test_unusual_filenames(self):
filenames = ['normal', 'with some blanks', '(with_parens)', ]
for filename in filenames:
filename = os.path.join(self.input_path, filename)
with open(filename, 'wb') as fd:
pass
self.cmd('init', self.repository_location)
self.cmd('create', self.repository_location + '::test', 'input')
for filename in filenames:
with changedir('output'):
self.cmd('extract', self.repository_location + '::test', os.path.join('input', filename))
assert os.path.exists(os.path.join('output', 'input', filename))
def test_repository_swap_detection(self):
self.create_test_files()
os.environ['BORG_PASSPHRASE'] = 'passphrase'
@@ -425,6 +439,13 @@ class ArchiverTestCase(ArchiverTestCaseBase):
# Restore permissions so shutil.rmtree is able to delete it
os.system('chmod -R u+w ' + self.repository_path)
def test_umask(self):
self.create_regular_file('file1', size=1024 * 80)
self.cmd('init', self.repository_location)
self.cmd('create', self.repository_location + '::test', 'input')
mode = os.stat(self.repository_path).st_mode
self.assertEqual(stat.S_IMODE(mode), 0o700)
def test_cmdline_compatibility(self):
self.create_regular_file('file1', size=1024 * 80)
self.cmd('init', self.repository_location)


@@ -6,6 +6,11 @@ from ..hashindex import NSIndex, ChunkIndex
from . import BaseTestCase
def H(x):
# make some 32-byte long thing that depends on x
return bytes('%-0.32d' % x, 'ascii')
class HashIndexTestCase(BaseTestCase):
def _generic_test(self, cls, make_value, sha):
@@ -78,3 +83,20 @@ class HashIndexTestCase(BaseTestCase):
second_half = list(idx.iteritems(marker=all[49][0]))
self.assert_equal(len(second_half), 50)
self.assert_equal(second_half, all[50:])
def test_chunkindex_merge(self):
idx1 = ChunkIndex()
idx1[H(1)] = 1, 100, 100
idx1[H(2)] = 2, 200, 200
idx1[H(3)] = 3, 300, 300
# no H(4) entry
idx2 = ChunkIndex()
idx2[H(1)] = 4, 100, 100
idx2[H(2)] = 5, 200, 200
# no H(3) entry
idx2[H(4)] = 6, 400, 400
idx1.merge(idx2)
assert idx1[H(1)] == (5, 100, 100)
assert idx1[H(2)] == (7, 200, 200)
assert idx1[H(3)] == (3, 300, 300)
assert idx1[H(4)] == (6, 400, 400)
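The merge semantics exercised by ``test_chunkindex_merge`` — reference counts are summed, size/csize are kept — can be sketched in plain Python. Here ``merge_chunk_index`` is a hypothetical dict-based stand-in for illustration, not borg's C-backed ``ChunkIndex.merge``:

```python
def merge_chunk_index(target, other):
    """Merge `other` into `target`: refcounts add up; size/csize are
    kept from the existing entry, since both describe the same chunk."""
    for key, (refcount, size, csize) in other.items():
        if key in target:
            old_refcount, old_size, old_csize = target[key]
            target[key] = (old_refcount + refcount, old_size, old_csize)
        else:
            target[key] = (refcount, size, csize)

# mirrors the test fixture above, with plain byte keys
idx1 = {b'1': (1, 100, 100), b'2': (2, 200, 200), b'3': (3, 300, 300)}
idx2 = {b'1': (4, 100, 100), b'2': (5, 200, 200), b'4': (6, 400, 400)}
merge_chunk_index(idx1, idx2)
assert idx1[b'1'] == (5, 100, 100)   # 1 + 4 refs
assert idx1[b'3'] == (3, 300, 300)   # only in idx1, unchanged
assert idx1[b'4'] == (6, 400, 400)   # only in idx2, copied over
```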


@@ -1,14 +0,0 @@
"""
Mocking
Note: unittest.mock is broken on at least Python 3.3.6 and 3.4.0:
it silently ignores mistyped method names starting with assert_...,
does nothing and just succeeds.
The issue was fixed in the separately distributed "mock" lib; there
you get an AttributeError. So, always use that one!
Details:
http://engineeringblog.yelp.com/2015/02/assert_called_once-threat-or-menace.html
"""
from mock import *
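The pitfall described in the docstring being removed here comes from ``Mock``'s attribute auto-creation: any attribute access returns a fresh child mock, so calling a mistyped verification method "succeeds" silently instead of failing the test. A minimal illustration of the mechanism (the misspelled ``asert_called_once_with`` is deliberate, and does not trip the ``assert``-prefix safeguard added in newer Pythons):

```python
from unittest.mock import Mock

m = Mock()
m.do_work(42)

# The typo'd "assertion" is auto-created as a child mock: calling it
# does nothing and raises nothing, so the broken check passes silently.
result = m.asert_called_once_with(42)   # note the typo: "asert"
assert isinstance(result, Mock)          # we got a mock back, not a verification
```

This is why the testsuite wrapped the separately distributed ``mock`` library, where such mistakes surface as errors instead of false passes.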


@@ -2,13 +2,14 @@ import os
import shutil
import tempfile
from mock import patch
from ..hashindex import NSIndex
from ..helpers import Location, IntegrityError
from ..locking import UpgradableLock
from ..remote import RemoteRepository, InvalidRPCMethod
from ..repository import Repository
from . import BaseTestCase
from .mock import patch
class RepositoryTestCaseBase(BaseTestCase):


@@ -1,11 +0,0 @@
import unittest
from . import TestLoader
def main():
unittest.main(testLoader=TestLoader(), defaultTest='')
if __name__ == '__main__':
main()


@@ -5,6 +5,8 @@
<ul>
<li><a href="https://borgbackup.github.io/borgbackup/">Main Web Site</a></li>
<li><a href="https://pypi.python.org/pypi/borgbackup">PyPI packages</a></li>
<li><a href="https://github.com/borgbackup/borg/issues/147">Binary Packages</a></li>
<li><a href="https://github.com/borgbackup/borg/blob/master/CHANGES.rst">Current ChangeLog</a></li>
<li><a href="https://github.com/borgbackup/borg">GitHub</a></li>
<li><a href="https://github.com/borgbackup/borg/issues">Issue Tracker</a></li>
<li><a href="https://www.bountysource.com/teams/borgbackup">Bounties &amp; Fundraisers</a></li>

4
docs/changes.rst Normal file

@@ -0,0 +1,4 @@
.. include:: global.rst.inc
.. _changelog:
.. include:: ../CHANGES.rst


@@ -11,13 +11,13 @@
# All configuration values have a default; values that are commented out
# serve to show the default.
from borg import __version__ as sw_version
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#import sys, os
#sys.path.insert(0, os.path.abspath('.'))
import sys, os
sys.path.insert(0, os.path.abspath('..'))
from borg import __version__ as sw_version
# -- General configuration -----------------------------------------------------
@@ -42,7 +42,7 @@ master_doc = 'index'
# General information about the project.
project = 'Borg - Deduplicating Archiver'
copyright = '2010-2014, Jonas Borgström'
copyright = '2010-2014, Jonas Borgström, 2015 The Borg Collective (see AUTHORS file)'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
@@ -134,7 +134,7 @@ html_static_path = []
# Custom sidebar templates, maps document names to template names.
html_sidebars = {
'index': ['sidebarlogo.html', 'sidebarusefullinks.html', 'searchbox.html'],
'**': ['sidebarlogo.html', 'localtoc.html', 'relations.html', 'sidebarusefullinks.html', 'searchbox.html']
'**': ['sidebarlogo.html', 'relations.html', 'searchbox.html', 'localtoc.html', 'sidebarusefullinks.html']
}
# Additional templates that should be rendered to pages, maps page names to
# template names.

67
docs/development.rst Normal file

@@ -0,0 +1,67 @@
.. include:: global.rst.inc
.. _development:
Development
===========
This chapter will get you started with |project_name| development.
|project_name| is written in Python (with a little bit of Cython and C for
the performance critical parts).
Building a development environment
----------------------------------
First, just install borg into a virtual env as described before.
To install some additional packages needed for running the tests, activate your
virtual env and run::
pip install -r requirements.d/development.txt
Running the tests
-----------------
The tests are in the borg/testsuite package.
To run them, you need to have fakeroot, tox and pytest installed.
To run the test suite use the following command::
fakeroot -u tox # run all tests
Some more advanced examples::
# verify a changed tox.ini (run this after any change to tox.ini):
fakeroot -u tox --recreate
fakeroot -u tox -e py32 # run all tests, but only on python 3.2
fakeroot -u tox borg.testsuite.locking # only run 1 test module
fakeroot -u tox borg.testsuite.locking -- -k '"not Timer"' # exclude some tests
fakeroot -u tox borg.testsuite -- -v # verbose py.test
Important notes:
- Without fakeroot -u some tests will fail.
- When using -- to give options to py.test, you MUST also give borg.testsuite[.module].
Building the docs with Sphinx
-----------------------------
The documentation (in reStructuredText format, .rst) is in docs/.
To build the html version of it, you need to have sphinx installed::
pip3 install sphinx
Now run::
cd docs/
make html
Then point a web browser at docs/_build/html/index.html.


@@ -1,65 +0,0 @@
.. include:: global.rst.inc
.. _foreword:
Foreword
========
|project_name| is a secure backup program for Linux, FreeBSD and Mac OS X.
|project_name| is designed for efficient data storage where only new or
modified data is stored.
Features
--------
Space efficient storage
Variable block size `deduplication`_ is used to reduce the number of bytes
stored by detecting redundant data. Each file is split into a number of
variable length chunks and only chunks that have never been seen before
are added to the repository (and optionally compressed).
Optional data encryption
All data can be protected using 256-bit AES_ encryption, and data integrity
and authenticity are verified using `HMAC-SHA256`_.
Off-site backups
|project_name| can store data on any remote host accessible over SSH as
long as |project_name| is installed. If you don't have |project_name|
installed there, you can use some network filesystem (sshfs, nfs, ...)
to mount a filesystem located on your remote host and use it as if it
were local (but that will be slower).
Backups mountable as filesystems
Backup archives are :ref:`mountable <borg_mount>` as
`userspace filesystems`_ for easy backup verification and restores.
Glossary
--------
.. _deduplication_def:
Deduplication
Deduplication is a technique for improving storage utilization by
eliminating redundant data.
.. _archive_def:
Archive
An archive is a collection of files along with metadata that include file
permissions, directory structure and various file attributes.
Since each archive in a repository must have a unique name, a good naming
convention is ``hostname-YYYY-MM-DD``.
.. _repository_def:
Repository
A repository is a filesystem directory storing data from zero or more
archives. The data in a repository is both deduplicated and
optionally encrypted making it both efficient and safe. Repositories are
created using :ref:`borg_init` and the contents can be listed using
:ref:`borg_list`.
Key file
When a repository is initialized, a key file containing a
password-protected encryption key is created. It is vital to keep this
file safe since the repository data is totally inaccessible without it.


@@ -1,81 +1,18 @@
.. include:: global.rst.inc
Welcome to Borg
================
|project_name| is a deduplicating backup program.
Optionally, it also supports compression and authenticated encryption.
The main goal of |project_name| is to provide an efficient and secure way
to backup data. The data deduplication technique used makes |project_name|
suitable for daily backups since only the changes are stored. The authenticated
encryption makes it suitable for backups to not fully trusted targets.
|project_name| is written in Python (with a little bit of Cython and C for
the performance critical parts).
Easy to use
-----------
Initialize a new backup :ref:`repository <repository_def>` and create your
first backup :ref:`archive <archive_def>` in two lines::
$ borg init /mnt/backup
$ borg create /mnt/backup::Monday ~/Documents
$ borg create --stats /mnt/backup::Tuesday ~/Documents
Archive name: Tuesday
Archive fingerprint: 387a5e3f9b0e792e91ce87134b0f4bfe17677d9248cb5337f3fbf3a8e157942a
Start time: Tue Mar 25 12:00:10 2014
End time: Tue Mar 25 12:00:10 2014
Duration: 0.08 seconds
Number of files: 358
Original size Compressed size Deduplicated size
This archive: 57.16 MB 46.78 MB 151.67 kB
All archives: 114.02 MB 93.46 MB 44.81 MB
See the :ref:`quickstart` chapter for a more detailed example.
Easy installation
-----------------
You can use pip to install |project_name| quickly and easily::
$ pip3 install borgbackup
Need more help with installing? See :ref:`installation`.
User's Guide
============
Borg Documentation
==================
.. toctree::
:maxdepth: 2
foreword
intro
installation
quickstart
usage
faq
support
changes
internals
Getting help
============
If you've found a bug or have a concrete feature request, please create a new
ticket on the project's `issue tracker`_ (after checking whether someone else
already has reported the same thing).
For more general questions or discussions, IRC or mailing list are preferred.
IRC
---
Join us on channel #borgbackup on chat.freenode.net. As usual on IRC, just
ask or tell directly and then patiently wait for replies. Stay connected.
Mailing list
------------
There is a mailing list for Borg on librelist_ that you can use for feature
requests and general discussions about Borg. A mailing list archive is
available `here <http://librelist.com/browser/borgbackup/>`_.
To subscribe to the list, send an email to borgbackup@librelist.com and reply
to the confirmation mail. Likewise, to unsubscribe, send an email to
borgbackup-unsubscribe@librelist.com and reply to the confirmation mail.
development

7
docs/intro.rst Normal file

@@ -0,0 +1,7 @@
.. include:: global.rst.inc
.. _foreword:
Introduction
============
.. include:: ../README.rst

34
docs/support.rst Normal file

@@ -0,0 +1,34 @@
.. include:: global.rst.inc
.. _support:
Support
=======
Issue Tracker
-------------
If you've found a bug or have a concrete feature request, please create a new
ticket on the project's `issue tracker`_ (after checking whether someone else
already has reported the same thing).
For more general questions or discussions, IRC or mailing list are preferred.
IRC
---
Join us on channel #borgbackup on chat.freenode.net.
As usual on IRC, just ask or tell directly and then patiently wait for replies.
Stay connected.
Mailing list
------------
There is a mailing list for Borg on librelist_ that you can use for feature
requests and general discussions about Borg. A mailing list archive is
available `here <http://librelist.com/browser/borgbackup/>`_.
To subscribe to the list, send an email to borgbackup@librelist.com and reply
to the confirmation mail.
To unsubscribe, send an email to borgbackup-unsubscribe@librelist.com and reply
to the confirmation mail.


@@ -1,4 +1,5 @@
tox
mock
pytest
pytest-cov<2.0.0
Cython

13
tox.ini

@@ -1,16 +1,5 @@
# tox configuration - if you change anything here, run this to verify:
# fakeroot -u tox --recreate
#
# Invocation examples:
# fakeroot -u tox # run all tests
# fakeroot -u tox -e py32 # run all tests, but only on python 3.2
# fakeroot -u tox borg.testsuite.locking # only run 1 test module
# fakeroot -u tox borg.testsuite.locking -- -k '"not Timer"' # exclude some tests
# fakeroot -u tox borg.testsuite -- -v # verbose py.test
#
# Important notes:
# Without fakeroot -u some tests will fail.
# When using -- to give options to py.test, you MUST also give borg.testsuite[.module].
[tox]
envlist = py32, py33, py34
@@ -20,6 +9,6 @@ envlist = py32, py33, py34
# not really matter, should be just different from the toplevel dir.
changedir = {toxworkdir}
deps = -rrequirements.d/development.txt
commands = py.test --pyargs {posargs:borg.testsuite}
commands = py.test --cov=borg --pyargs {posargs:borg.testsuite}
# fakeroot -u needs some env vars:
passenv = *