diff --git a/docs/internals/data-structures.rst b/docs/internals/data-structures.rst
index bff045666..08b0b84d9 100644
--- a/docs/internals/data-structures.rst
+++ b/docs/internals/data-structures.rst
@@ -626,7 +626,11 @@ The idea of content-defined chunking is assigning every byte where a
 cut *could* be placed a hash. The hash is based on some number of bytes
 (the window size) before the byte in question. Chunks are cut
 where the hash satisfies some condition
-(usually "n numbers of trailing/leading zeroes").
+(usually "n trailing/leading zeroes"). Because the hash depends only on the
+bytes within the window, a cut stays at the same location relative to the file's
+contents even if bytes are inserted or removed before or after it, as long as the
+bytes within that window are unchanged. A single cluster of changes to a file is
+therefore likely to produce only one or two new chunks, which aids deduplication.
 
 Using normal hash functions this would be extremely slow,
 requiring hashing ``window size * file size`` bytes.
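
Note on the hunk above: the claim that a local edit only disturbs nearby chunks can be checked with a small sketch. This is not the project's actual chunker. The names `WINDOW`, `MASK` and `chunk`, the use of CRC32 as the window hash, and the sizes are all assumptions chosen for illustration. It deliberately uses the slow approach the surrounding text mentions, rehashing the full window at every position.

```python
import random
import zlib

WINDOW = 16  # bytes hashed before each candidate cut (assumed value)
MASK = 0x3F  # cut when the low 6 hash bits are zero (assumed condition)


def chunk(data: bytes) -> list:
    """Split data wherever the hash of the preceding WINDOW bytes allows a cut."""
    chunks, start = [], 0
    for i in range(WINDOW, len(data)):
        # Naive variant: rehash the whole window at every position,
        # which costs roughly WINDOW * len(data) hashed bytes in total.
        if zlib.crc32(data[i - WINDOW:i]) & MASK == 0:
            chunks.append(data[start:i])
            start = i
    chunks.append(data[start:])  # remainder after the last cut
    return chunks


# Deterministic pseudo-random input, then a 3-byte insertion in the middle.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(4096))
edited = original[:2000] + b"xyz" + original[2000:]

old, new = chunk(original), chunk(edited)
fresh = set(new) - set(old)  # chunks that would have to be stored anew
print(f"{len(new)} chunks after the edit, {len(fresh)} of them new")
```

In this sketch a cut decision depends only on the `WINDOW` bytes before it, so every cut at or before the insertion point, and every cut at least `WINDOW` bytes past the inserted bytes, is unchanged. With a 3-byte insertion, at most `WINDOW + 3` chunks can differ, and in practice usually far fewer.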