The items metadata stream is usually not that big (compared to the file content data):
it is just file and directory names and other metadata.
If the chunking granularity there is too coarse (with a big minimum chunk size), we will usually get no deduplication.
The existing option to exclude files and directories, “--exclude”, is
implemented using fnmatch[1]. fnmatch matches the slash (“/”) with “*”
and thus makes it impossible to write patterns where a directory with
a given name should be excluded at a specific depth in the directory
hierarchy, but not anywhere else. Consider this structure:
home/
home/aaa
home/aaa/.thumbnails
home/user
home/user/img
home/user/img/.thumbnails
fnmatch incorrectly excludes “home/user/img/.thumbnails” with a pattern
of “home/*/.thumbnails” when the intention is to exclude “.thumbnails”
in all home directories while retaining directories with the same name
in all other locations.
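The behaviour described above can be reproduced with Python's standard fnmatch module. This is only a minimal sketch: the helper name and the path list are made up for illustration, and fnmatchcase is used so that the result does not depend on the platform's case normalisation.

```python
from fnmatch import fnmatchcase

# Pattern and paths taken from the example structure above.
PATTERN = "home/*/.thumbnails"
PATHS = [
    "home/aaa",
    "home/aaa/.thumbnails",
    "home/user/img",
    "home/user/img/.thumbnails",
]


def excluded_by_fnmatch(paths, pattern):
    """Return the paths matched by an fnmatch-style pattern (illustrative helper)."""
    # In fnmatch, "*" also matches "/", so it can span several directory levels.
    return [p for p in paths if fnmatchcase(p, pattern)]


if __name__ == "__main__":
    for path in excluded_by_fnmatch(PATHS, PATTERN):
        print("excluded:", path)
```

Both ".thumbnails" directories are reported, including the one nested below "home/user/img", which is the unwanted match.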
With this change regular expressions are introduced as an additional
pattern syntax. The syntax is selected using a prefix on “--exclude”'s
value. “re:” is for regular expression and “fm:”, the default, selects
fnmatch. Selecting the syntax is necessary when regular expressions are
desired or when the desired fnmatch pattern starts with two alphanumeric
characters followed by a colon (e.g. “aa:something/*”). The exclusion
described above can be implemented as follows:
--exclude 're:^home/[^/]+/\.thumbnails$'
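The effect of that expression can be checked with Python's re module. This is a sketch only: it assumes the expression is applied with re.search to the path as written in the example structure, which may differ from how the tool normalises paths internally. The helper name is made up.

```python
import re

# The regular expression from the --exclude example, without the "re:" prefix.
EXCLUDE_RE = re.compile(r"^home/[^/]+/\.thumbnails$")

PATHS = [
    "home/aaa",
    "home/aaa/.thumbnails",
    "home/user/img",
    "home/user/img/.thumbnails",
]


def excluded_by_regex(paths):
    """Return the paths matched by EXCLUDE_RE (illustrative helper)."""
    # "[^/]+" cannot cross a "/", so exactly one directory level is allowed
    # between "home/" and "/.thumbnails".
    return [p for p in paths if EXCLUDE_RE.search(p)]


if __name__ == "__main__":
    for path in excluded_by_regex(PATHS):
        print("excluded:", path)
```

Only "home/aaa/.thumbnails" is reported; "home/user/img/.thumbnails" is retained because it sits one level deeper.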
The “--exclude-from” option permits loading exclusions from a text file
where the same prefixes can now be used, e.g. “re:\.tmp$”.
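The prefix handling could look roughly like the following. This is a hypothetical sketch, not the actual implementation: make_matcher, the prefix regular expression and the choice to raise ValueError for an unknown prefix are all assumptions made for illustration.

```python
import re
from fnmatch import fnmatchcase

# Two alphanumeric characters followed by a colon select the pattern style.
_PREFIX_RE = re.compile(r"^([A-Za-z0-9]{2}):(.*)$", re.DOTALL)


def make_matcher(spec, default_style="fm"):
    """Turn an exclude value such as 're:\\.tmp$' into a path predicate."""
    m = _PREFIX_RE.match(spec)
    if m:
        style, pattern = m.group(1), m.group(2)
    else:
        # No prefix given: fall back to the default style (fnmatch).
        style, pattern = default_style, spec

    if style == "re":
        regex = re.compile(pattern)
        return lambda path: regex.search(path) is not None
    if style == "fm":
        return lambda path: fnmatchcase(path, pattern)
    # Assumption: an unknown two-character prefix is treated as an error.
    raise ValueError("unknown pattern style: %r" % style)


if __name__ == "__main__":
    print(make_matcher(r"re:\.tmp$")("cache/a.tmp"))             # regular expression
    print(make_matcher("*.tmp")("cache/a.tmp"))                  # default: fnmatch
    print(make_matcher("fm:aa:something/*")("aa:something/x"))   # explicit prefix
```

Under these assumptions, the last call shows why an fnmatch pattern that itself starts with two alphanumeric characters and a colon needs the explicit "fm:" prefix: without it, "aa" would be read as a style name.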
The documentation has been extended and now not only describes the two
pattern styles, but also the file format supported by “--exclude-from”.
This change has been discussed in issue #43 and in change request #497.
[1] https://docs.python.org/3/library/fnmatch.html
Signed-off-by: Michael Hanselmann <public@hansmi.ch>
Removed --log-level because it overlaps with how --verbose works now.
For consistency, added --info as an alias for --verbose (both set the
INFO log level).
Also added --debug, which sets the DEBUG log level.
Note: no messages are emitted at DEBUG level yet.
WARNING is the default (we want mostly silent behaviour unless
something serious happens), so a --warning option is not needed.
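One way to wire up such options with argparse is sketched below. This is an assumption about the shape of the code, not the project's actual parser: the program name, the "log_level" destination and the helper functions are hypothetical.

```python
import argparse
import logging


def build_parser():
    """Build a parser with --verbose/--info and --debug (illustrative only)."""
    parser = argparse.ArgumentParser(prog="example")
    # --info is an alias of --verbose; both select the INFO level.
    parser.add_argument("--verbose", "--info", dest="log_level",
                        action="store_const", const="info", default="warning",
                        help="set log level to INFO")
    parser.add_argument("--debug", dest="log_level",
                        action="store_const", const="debug", default="warning",
                        help="set log level to DEBUG")
    return parser


def level_for(argv):
    """Map command line arguments to a numeric logging level."""
    args = build_parser().parse_args(argv)
    return getattr(logging, args.log_level.upper())


if __name__ == "__main__":
    logging.basicConfig(level=level_for(["--info"]))
    logging.getLogger(__name__).info("shown because INFO was selected")
```

With no option given, the level stays at WARNING, so no separate --warning flag is required.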
The problem here was that we do not just have changed and unchanged items,
but also a lot of items besides regular files, which we just back up "as is" without
determining whether they have changed or not. Thus, we can't support changed/unchanged
in a way users would expect it to work.
The A/M/U status only applies to the data content of regular files (compared to the index).
For all items, we ALWAYS save the metadata; there is no changed / not changed detection there.
Thus, I replaced this with a --filter option where you can just specify which
status chars you want to see listed in the output.
E.g. --filter AM will only show regular files with A(dded) or M(odified) state, but nothing else.
Not giving --filter defaults to showing all items, no matter what status they have.
Output is emitted via the logger at INFO level, so it only shows up if the logger is set to that level or a more verbose one.
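The filtering rule can be sketched in a few lines of Python. The function names, the logger name and the output format are hypothetical and only illustrate the behaviour described above.

```python
import logging

logger = logging.getLogger("example.status")


def should_list(status, output_filter=None):
    """Decide whether an item with the given status char is listed."""
    # No filter given: list every item, whatever its status.
    if output_filter is None:
        return True
    # Otherwise list only items whose status char appears in the filter string.
    return status in output_filter


def report(status, path, output_filter=None):
    """Emit one status line at INFO level if the filter allows it."""
    if should_list(status, output_filter):
        logger.info("%s %s", status, path)


if __name__ == "__main__":
    # Without INFO level enabled, nothing would be printed at all.
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    for status, path in [("A", "new.txt"), ("M", "edited.txt"), ("U", "same.txt")]:
        report(status, path, output_filter="AM")
```

With the filter "AM", the unchanged file is skipped; passing no filter lists all three.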