It was only used in two places:
- stats: apparently as a minor performance optimization, which is
unlikely to be important
- find: filtered directories would be ignored. However, this
optimization missed that it is possible that two directories have the
exact same content. Such directories would be incorrectly ignored too.
Example:
```
mkdir test test/a test/b
restic backup test
restic find latest test/b
-> incorrectly does not return anything
```
Thus, remove the functionality as it's apparently too complex to use
correctly.
This adds support for caching already rewritten trees, handling of load
errors and disabling the check that the serialization doesn't lead to
data loss.
The more generic RewriteNode callback replaces the SelectByName and
PrintExclude functions. The main part of this change is a preparation to
allow using the TreeRewriter for the `repair snapshots` command.
In principle, the JSON format of Tree objects is extensible without
requiring a format change. In order to not loose information just play
it safe and reject rewriting trees for which we could loose data.
Use runtime.GOMAXPROCS(0) as worker count for CPU-bound tasks,
repo.Connections() for IO-bound task and a combination if a task can be
both. Streaming packs is treated as IO-bound as adding more worker
cannot provide a speedup.
Typical IO-bound tasks are download / uploading / deleting files.
Decoding / Encoding / Verifying are usually CPU-bound. Several tasks are
a combination of both, e.g. for combined download and decode functions.
In the latter case add both limits together. As the backends have their
own concurrency limits restic still won't download more than
repo.Connections() files in parallel, but the additional workers can
decode already downloaded data in parallel.
Load tree blobs with more than 50MB only from a single goroutine. Very
large tree blobs with for example 400 MB size can otherwise require
roughly 1GB * streamTreeParallelism memory.