Doubling an array that points to a linked list in C

https://stackoverflow.com/questions/22419229

15-06-2023
|

Question

I am implementing a hashset in C, where my array points to a linked list

this is the linked list:

typedef struct hashnode hashnode;

struct hashnode {
    char *word;
    // will hold our word as a string

    hashnode *link;
    //will be used only if chaining
};

and this is the Hashset:

struct hashset {
    size_t size;
    //size of entire array

    size_t load;
    //number of words total

    hashnode **chains;
    //linked list (if words have same index);
};

Now I am having a problem with my double array code I believe there is a dangling pointer somewhere here is the code:

void dbl_array(hashset *this) {
    size_t newlen = this->size +1;
    newlen *= 2;
    //double siz
    hashnode **new_array = malloc(newlen * sizeof(hashnode*));
    //new array
    int array_end = (int)this->size;//load;
    //end of old array
    for(int i = 0; i < array_end; i++) {
        //loop through old
        int index = i;
        if(this->chains[index] == NULL) {
            continue;
        }
        else {
            hashnode *nod;
            int i=0;
            for(nod = this->chains[index]; nod != NULL; nod = nod->link) {
                if(nod == NULL) 
                    return;
                size_t tmp = strhash(nod->word) % newlen;
                //compute hash
                hashnode *newnod;
                newnod = malloc(sizeof(hashnode*));
                newnod->word = strdup(nod->word);
                newnod->link = NULL;
                if(new_array[tmp] == NULL) { 
                    //if new array does not already have a word at index
                    new_array[tmp] = newnod;
                }
                else {
                    //if word is here then link to old one
                    newnod->link = new_array[tmp];
                    new_array[tmp] = newnod;
                }
                printf("newarray has: %s @ {%d} \n", new_array[tmp]->word, tmp);
                //testing insertion
                i++;
            }
            free(nod);
        }
    }
    this->chains = new_array;
    this->size = newlen;
    free(new_array);
    printf("new size %d\n", this->size);
}

So after running GDB, I am finding that there is something wrong when I add the new node

Solution

There is no reason at all to allocate new collision nodes for a hash table expansion. The algorithm for expanding your hash table is relatively straight forward:

compute new table size
allocate new table
enumerate all chains in old table
1. for each chain, enumerate all nodes
  1. for each node, compute new hash based on new table size
  2. move node to appropriate slot in new table

When the above is done, so are you. Just wire up the new table to the hashset and make sure to update the size member to the new size. The old table is discarded.

The following code assumes you have properly managed your hash table prior to doubling. With that:

All unused table slots are properly NULL
All collision lists are properly NULL-terminated.

If you can't guarantee both of those conditions, doubling the size of your hash table is the least of your worries.

void hashset_expand(hashset* hs)
{
    size_t new_size = 2 * (1 + hs->size), i, idx;
    hash node *next, *nod, **tbl = calloc(new_size, sizeof(*tbl));

    // walk old table, and each chain within it.
    for (i=0; i<hs->size; ++i)
    {
        next = hs->chains[i];
        while (next)
        {
            nod = next;
            next = next->link; // must be done **before** relink
            idx = strhash(nod->word) % new_size;
            nod->link = tbl[idx];
            tbl[idx] = nod;
        }
    }

    // finish up, deleting the old bed.
    free(hs->chains);
    hs->chains = tbl;
    hs->size = new_size;
}

That is all there is to it. Don't make it more complicated than that.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow