_hashindex: easier to understand code, dubious loops removed, asserts

hashindex_index returns the perfect hashtable index, but does not check what's in the bucket there, so we had these loops afterwards to search for an empty or deleted bucket. problem: if the HT were completely filled with no empty and no deleted buckets, that loop would never end. due to our HT resizing, it can never happen, but still not pretty. when using hashindex_lookup (as also used some lines above), the code is easier to understand, because (after we resized the HT), we freshly create the same situation as after the first call of that function: - return value < 0, because we (still) can not find the key - start_idx will point to an empty bucket Thus, we do not need the problematic loops we had there. Modified the checks to make sure we really have an empty or deleted bucket before overwriting it with data. Added some additional asserts to make sure the code behaves.
2023-02-08 04:40:27 +01:00 · 2023-02-08 04:40:27 +01:00 · c9573c04ac
parent 978cb163e6
commit c9573c04ac
1 changed files with 16 additions and 15 deletions
--- a/src/borg/_hashindex.c
+++ b/src/borg/_hashindex.c
@ -731,23 +731,23 @@ static int
 hashindex_set(HashIndex *index, const unsigned char *key, const void *value)
 {
    int start_idx;
-    int idx = hashindex_lookup(index, key, &start_idx);
+    int idx = hashindex_lookup(index, key, &start_idx);  /* if idx < 0: start_idx -> EMPTY or DELETED */
    uint8_t *ptr;
    if(idx < 0)
    {
        if(index->num_entries > index->upper_limit) {
+            /* hashtable too full, grow it! */
            if(!hashindex_resize(index, grow_size(index->num_buckets))) {
                return 0;
            }
-            start_idx = hashindex_index(index, key);
+            /* we have just built a fresh hashtable and removed all tombstones,
+             * so we only have EMPTY or USED buckets, but no DELETED ones any more.
+             */
+            idx = hashindex_lookup(index, key, &start_idx);
+            assert(idx < 0);
+            assert(BUCKET_IS_EMPTY(index, start_idx));
        }
        idx = start_idx;
-        while(!BUCKET_IS_EMPTY(index, idx) && !BUCKET_IS_DELETED(index, idx)) {
-            idx++;
-            if (idx >= index->num_buckets){
-                idx -= index->num_buckets;
-            }
-        }
        if(BUCKET_IS_EMPTY(index, idx)){
            index->num_empty--;
            if(index->num_empty < index->min_empty) {
@ -758,14 +758,15 @@ hashindex_set(HashIndex *index, const unsigned char *key, const void *value)
                /* we have just built a fresh hashtable and removed all tombstones,
                 * so we only have EMPTY or USED buckets, but no DELETED ones any more.
                 */
-                idx = start_idx = hashindex_index(index, key);
-                while(!BUCKET_IS_EMPTY(index, idx)) {
-                    idx++;
-                    if (idx >= index->num_buckets){
-                        idx -= index->num_buckets;
-                    }
-                }
+                idx = hashindex_lookup(index, key, &start_idx);
+                assert(idx < 0);
+                assert(BUCKET_IS_EMPTY(index, start_idx));
+                idx = start_idx;
            }
+        } else if(BUCKET_IS_DELETED(index, idx)) {
+            /* as expected, nothing to do */
+        } else {
+            assert(0);  /* bucket is full, must not happen! */
        }
        ptr = BUCKET_ADDR(index, idx);
        memcpy(ptr, key, index->key_size);