the chunk accounting code tried to reflect repo space usage via the st_blocks of the files.
so, a specific chunk that was shared between multiple files [inodes] was accounted to only one of these files.
thus, the overall "du" of everything in the fuse mounted repo may have reflected the repo space usage correctly,
but the decision which file got the chunk (the space) was kind of arbitrary and not really useful.
otoh, a simple fuse getattr() was rather expensive due to this, as it needed to iterate over the chunks list
to compute the st_blocks value. it also needed quite some memory for the accounting.
thus, st_blocks is now just ceil(size / blocksize).
also: fixed bug that st_blocks was a floating point value previously.
also: preparing for further optimization of size computation (see next changeset)
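
a minimal sketch of the new st_blocks computation, not the actual code: the function name and the 512 byte block size are assumptions made for illustration. integer ceiling division gives ceil(size / blocksize) without ever producing a float.

```python
BLOCK_SIZE = 512  # assumption: st_blocks counted in 512 byte units


def st_blocks(size, blocksize=BLOCK_SIZE):
    """Return ceil(size / blocksize) as an int (hypothetical helper)."""
    # (size + blocksize - 1) // blocksize rounds up using only integer math,
    # so the result is never a floating point value.
    return (size + blocksize - 1) // blocksize
```

this needs only the item size, so getattr() does not have to walk the chunks list.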
if an item has a chunk list, pre-compute the total size and store it in the "size" metadata entry.
this speeds up access to the item size (e.g. for regular files) and could also be used to verify the validity of the chunks list.
note about hardlinks: size is only stored for hardlink masters (only they have their own chunk list)
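
the idea can be sketched roughly like this. the dict-based item, the (chunk_id, size) tuple layout and both function names are hypothetical and only serve the illustration, the real item and chunk list structures may differ.

```python
def precompute_size(item):
    """Store the summed chunk sizes under "size" if the item has a chunk list."""
    chunks = item.get("chunks")
    if chunks is not None:
        # assumption: each chunk list entry is a (chunk_id, size) tuple
        item["size"] = sum(size for _chunk_id, size in chunks)
    # items without a chunk list (e.g. hardlinks that are not the master)
    # get no "size" entry
    return item


def chunks_match_size(item):
    """Check a stored "size" against the chunk list (validity check)."""
    return item["size"] == sum(size for _chunk_id, size in item["chunks"])
```

reading item["size"] afterwards is a single lookup instead of a pass over all chunks.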
See #1452
This is 100 % accurate.
Also increases maximum data size by ~41 bytes. Not 100 % side-effect free;
if you manage to exactly land in that area then older Borg would not read
it. OTOH it gives us a nice round number there.
also: add some missing assertion messages
severity:
- no issue on little-endian platforms (== most, including x86/x64)
- harmless even on big-endian as long as refcount is below 0xfffbffff,
which is very likely always the case in practice anyway.
we do not trust the remote, so we are careful unpacking its responses.
the remote could return manipulated msgpack data that announces e.g.
a huge array or map or string. the local would then need to allocate huge
amounts of RAM in expectation of that data (no matter whether that much
data really arrives or not).
by using limits in the Unpacker, a ValueError will be raised if an unexpectedly
large amount of data is announced, so a memory DoS is avoided.
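
to illustrate the principle, here is a hand-rolled sketch that uses only the standard library. it is not the msgpack library's code and not the actual fix: the function name and the limit value are made up, and it only looks at the 32-bit length headers (array32, map32, str32) of the msgpack format. the real Unpacker does this kind of check internally when given limit options (e.g. max_array_len in msgpack-python).

```python
import struct

# msgpack type bytes that are followed by a 4 byte big-endian length
ARRAY32, MAP32, STR32 = 0xDD, 0xDF, 0xDB


def check_announced_length(data, max_len):
    """Return the announced length, or raise ValueError if it is above max_len.

    Returns None if data does not start with one of the 32-bit headers.
    """
    if len(data) >= 5 and data[0] in (ARRAY32, MAP32, STR32):
        (announced,) = struct.unpack(">I", data[1:5])
        if announced > max_len:
            # reject before allocating anything for the announced elements
            raise ValueError(
                "announced length %d exceeds limit %d" % (announced, max_len)
            )
        return announced
    return None
```

a 5 byte response like b"\xdd\xff\xff\xff\xff" announces an array of about 4 billion elements. with a limit in place it is rejected right at the header, before any memory is reserved for it.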
The former section is a bit older (Nov 2016) and has been the piece
responsible for finding CVE-2016-10099, since while writing it I
wondered how the manifest was authenticated to actually
*be* the manifest. Well. There it is ;)
It has been edited to final form only recently and should now be ready
for review.
The latter section is new.