the previous check only verified that we got a dict, but did not validate the dict keys.
this caused problems with e.g. (invalid) integer keys.
the check now also validates the keys:
- some required keys must be present
- the set of keys is a subset of all valid keys
we need a list of valid item metadata keys. using a list stored in the repo manifest
is more future-proof than the hardcoded ITEM_KEYS in the source code.
keys that are in union(item_keys_from_repo, item_keys_from_source) are considered valid.
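
a minimal sketch of the key validation described above. this is not the actual borg code: REQUIRED_KEYS, KEYS_FROM_SOURCE and valid_item are hypothetical names, and the key sets are placeholders chosen for illustration.

```python
# hypothetical stand-ins; the real required keys and ITEM_KEYS live in the borg source
REQUIRED_KEYS = frozenset({"path", "mtime"})
KEYS_FROM_SOURCE = frozenset({"path", "mtime", "mode", "uid", "gid", "chunks"})


def valid_item(obj, keys_from_repo=frozenset()):
    """Return True if obj looks like a valid item metadata dict."""
    if not isinstance(obj, dict):
        return False
    keys = set(obj)
    # valid keys = union(item_keys_from_repo, item_keys_from_source)
    valid_keys = KEYS_FROM_SOURCE | frozenset(keys_from_repo)
    # all required keys present, and no key outside the valid set
    return REQUIRED_KEYS <= keys <= valid_keys
```

with this shape, a dict with integer keys fails the subset test, and a key that only a newer borg knows about is accepted as soon as it is listed in the repo manifest.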
when trying to resync and skip invalid data, borg tries to qualify a byte sequence as
valid-looking msgpacked item metadata dict (or not) before even invoking msgpack's unpack.
besides the code previously being hard to understand, there were 2 issues:
- a missing check for map16 - this type is what msgpack uses if the dict has more than
15 items (could happen in future, not for 1.0.x).
- missing checks for str8/16/32 - str16 is what msgpack uses if the bytestring has more than 31 bytes
(borg does not have key names that long, thus this wasn't causing any harm)
this misqualification (valid data considered invalid) could lead to a wrong resync, skipping valid items.
added more comments and tests.
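
a rough sketch of such a pre-check, working on the raw bytes only. the type codes are taken from the msgpack format spec (fixmap 0x80-0x8f, map16 0xde, fixstr 0xa0-0xbf, str8/16/32 0xd9/0xda/0xdb). the function name and the known_keys parameter are assumptions for illustration, not the real borg implementation.

```python
# number of header bytes (type byte + length bytes) for the long string types
STR_HEADER_SIZE = {0xD9: 2, 0xDA: 3, 0xDB: 5}  # str8, str16, str32


def looks_like_item_dict(data, known_keys=(b"path", b"mtime")):
    """Cheap check: does data start like a msgpacked dict with a known first key?"""
    if len(data) == 0:
        return False
    if data[0] & 0xF0 == 0x80:    # fixmap, up to 15 items
        offs = 1
    elif data[0] == 0xDE:         # map16, more than 15 items
        offs = 3
    else:                         # map32 or anything else: not considered valid here
        return False
    if len(data) <= offs:
        return False
    # the first dict key must be a string type
    if data[offs] & 0xE0 == 0xA0:         # fixstr, up to 31 bytes
        offs += 1
    elif data[offs] in STR_HEADER_SIZE:   # str8 / str16 / str32
        offs += STR_HEADER_SIZE[data[offs]]
    else:
        return False
    # the key payload itself must be one of the known keys
    return any(data[offs:].startswith(key) for key in known_keys)
```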
They are extracted correctly, for a little while at least, since chown()
*resets* all capabilities on the chowned file. Which I find curious,
since chown() is a privileged syscall. Probably a safeguard for
sysadmins who are unaware of capabilities.
The solution is to set the xattrs last, after chown()ing files.
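
A small sketch of that ordering, assuming Linux (os.setxattr is only available there). The function name and the injectable chown/setxattr parameters are hypothetical and exist only to make the order of operations explicit.

```python
import os


def restore_owner_and_xattrs(path, uid, gid, xattrs, chown=os.chown, setxattr=None):
    """Apply ownership first, then xattrs, so chown() cannot reset them."""
    if setxattr is None:
        setxattr = os.setxattr  # Linux-only
    # chown() first: it resets capabilities on the file
    chown(path, uid, gid)
    # xattrs (including security.capability) last, so they survive
    for name, value in xattrs.items():
        setxattr(path, name, value)
```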
Despite what the man page says, Linux does not discard the initial
partial page only. The ending page would be truncated no matter if
it is partial or not.
Page-align the fadvise size to take care of this.
Also, while we are at it, roll back the initial fadvise offset to the
previous page boundary to actually throw away that page: we no
longer need it, having read its first part in the previous call
and its second part now.
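
The alignment arithmetic might look like the sketch below. It is one reading of the description, not the patch itself: the function name is made up, and trimming the end down to a page boundary (so the trailing partial page stays cached for the next read) is an assumption.

```python
def aligned_dontneed_range(offset, length, pagesize):
    """Return a page-aligned (start, length) for a POSIX_FADV_DONTNEED request."""
    # roll the start back to the previous page boundary
    start = offset - offset % pagesize
    # trim the end down to a page boundary (assumption: keep the partial last page)
    end = offset + length
    end -= end % pagesize
    return start, max(end - start, 0)
```

The result could then be passed to os.posix_fadvise(fd, start, length, os.POSIX_FADV_DONTNEED). A caller should skip the call when the computed length is 0, since a length of 0 means "up to the end of the file".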
This patch has a noticeable impact in my Linux testing when the file
is on rotating media. The total test runtime decreased by a bit
over 10%, but since over half of that time was actually CPU time,
the actual iowait time decreased by around 20%.