mirror of https://github.com/borgbackup/borg.git
implement BORG_FILES_CACHE_TTL, update FAQ
raise default ttl to 20 (previously: 10).
parent 2e969a7418
commit 5f7b466969
@@ -193,12 +193,13 @@ Chunk index: {0.total_unique_chunks:20d} {0.total_chunks:20d}"""
         if not self.txn_active:
             return
         if self.files is not None:
+            ttl = int(os.environ.get('BORG_FILES_CACHE_TTL', 20))
             with open(os.path.join(self.path, 'files'), 'wb') as fd:
                 for path_hash, item in self.files.items():
                     # Discard cached files with the newest mtime to avoid
                     # issues with filesystem snapshots and mtime precision
                     item = msgpack.unpackb(item)
-                    if item[0] < 10 and bigint_to_int(item[3]) < self._newest_mtime:
+                    if item[0] < ttl and bigint_to_int(item[3]) < self._newest_mtime:
                         msgpack.pack((path_hash, item), fd)
         self.config.set('cache', 'manifest', hexlify(self.manifest.id).decode('ascii'))
         self.config.set('cache', 'timestamp', self.manifest.timestamp)
 
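Review note: the changed lines read the TTL from the environment and use it in the keep/discard test. A minimal standalone sketch of that logic follows. The helper names `read_files_cache_ttl` and `keep_entry` are hypothetical and not part of borg, and reading `item[0]` as an entry's age and `item[3]` as its mtime is an assumption based on the FAQ text and the code comment in this commit.

```python
import os


def read_files_cache_ttl(environ=os.environ, default=20):
    """Return the files cache TTL, mirroring the int(os.environ.get(...)) call.

    Hypothetical helper for illustration only. As in the diff, a
    non-numeric value makes int() raise ValueError.
    """
    return int(environ.get('BORG_FILES_CACHE_TTL', default))


def keep_entry(age, mtime, ttl, newest_mtime):
    """Keep/discard test modelled on the changed `if` line.

    Assumption: `age` stands in for item[0] and `mtime` for item[3].
    """
    return age < ttl and mtime < newest_mtime


if __name__ == '__main__':
    ttl = read_files_cache_ttl({'BORG_FILES_CACHE_TTL': '26'})
    print(ttl, keep_entry(age=25, mtime=100, ttl=ttl, newest_mtime=200))
```

One consequence worth noting: because the value goes straight through `int()`, a setting such as `BORG_FILES_CACHE_TTL=abc` would raise `ValueError` rather than fall back to the default.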
docs/faq.rst (24 lines changed)
@@ -345,6 +345,30 @@ those files are reported as being added when, really, chunks are
 already used.
 
 
+It always chunks all my files, even unchanged ones!
+---------------------------------------------------
+
+|project_name| maintains a files cache where it remembers the mtime, size and
+inode of files. When |project_name| does a new backup and starts processing a
+file, it first looks whether the file has changed (compared to the values
+stored in the files cache). If the values are the same, the file is assumed
+unchanged and thus its contents won't get chunked (again).
+
+|project_name| can't keep an infinite history of files of course, thus entries
+in the files cache have a "maximum time to live" which is set via the
+environment variable BORG_FILES_CACHE_TTL (and defaults to 20).
+Every time you do a backup (on the same machine, using the same user), the
+cache entries' ttl values of files that were not "seen" are incremented by 1
+and if they reach BORG_FILES_CACHE_TTL, the entry is removed from the cache.
+
+So, for example, if you do daily backups of 26 different data sets A, B,
+C, ..., Z on one machine (using the default TTL), the files from A will be
+already forgotten when you repeat the same backups on the next day and it
+will be slow because it would chunk all the files each time. If you set
+BORG_FILES_CACHE_TTL to at least 26 (or maybe even a small multiple of that),
+it would be much faster.
+
+
 Is there a way to limit bandwidth with |project_name|?
 ------------------------------------------------------
 
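Review note: the 26-data-set example in the new FAQ entry can be checked with a small simulation. This is a simplified model under stated assumptions, not borg's implementation: the cache is a plain dict of path to age, every entry ages by one per backup run, a file seen in the run is reset to age 0, and entries are kept only while their age is below the TTL. The function names are hypothetical.

```python
def run_backup(cache, seen_paths, ttl):
    """Simulate one backup run against a files cache of {path: age}."""
    # Every known entry gets one run older.
    aged = {path: age + 1 for path, age in cache.items()}
    # Files seen in this run are (re)entered with age 0.
    for path in seen_paths:
        aged[path] = 0
    # On save, only entries whose age is still below the TTL survive.
    return {path: age for path, age in aged.items() if age < ttl}


def survives_full_cycle(num_datasets, ttl):
    """Return True if data set 0 is still cached after one run per data set."""
    cache = {}
    for dataset in range(num_datasets):
        cache = run_backup(cache, [f"dataset{dataset}/file"], ttl)
    return "dataset0/file" in cache


if __name__ == '__main__':
    for ttl in (20, 25, 26):
        print(ttl, survives_full_cycle(26, ttl))
```

In this model the entry for the first data set is dropped with the default of 20 and survives once the TTL is 26, which agrees with the "at least 26" advice in the FAQ text.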
@@ -86,6 +86,9 @@ General:
     BORG_REMOTE_PATH
         When set, use the given path/filename as remote path (default is "borg").
         Using ``--remote-path PATH`` commandline option overrides the environment variable.
+    BORG_FILES_CACHE_TTL
+        When set to a numeric value, this determines the maximum "time to live" for the files cache
+        entries (default: 20). The files cache is used to quickly determine whether a file is unchanged.
     TMPDIR
         where temporary files are stored (might need a lot of temporary space for some operations)
 