Question

I was going over this example of a Trie and I am having difficulty understanding the following code

void Trie::addWord(string word)
{
    Node * currentNode = root;

    for (int i = 0; i < word.size(); ++i)
    {
        char currentChar = tolower(word.at(i));
        int index = currentChar - 'a';
        assert(index >= 0);     // Makes sure the character is between a-z
        if (currentNode->children[index] != NULL)
        {
            // check if the current node has the current character as one of its decendants
            currentNode = currentNode->children[index];
        }
        else
        {
            // the current node doesn't have the current character as one of its decendants
            Node * newNode = new Node(currentChar);
            currentNode->children[index] = newNode;
            currentNode = newNode;
        }
        if (i == word.size() - 1)
        {
            // the last character of the word has been reached
            currentNode->end = true;
        }
    }
}

My question is why is a being subtracted here

 int index = currentChar - 'a';
Was it helpful?

Solution

At the line int index = currentChar - 'a'; The currentChar (what ever it is) will be subtracted by 'a' character which has the ASCII value 97.
In this case you have two conditions:

  1. First If currentChar is between a-z, the result of index will always be >= 0.

  2. Otherwise currentChar is not between a-z because index will be negative number, currentChar can't be between A-Z because of the tolower() function.

You can refeare to this link to know more about ASCII values

Also you need to update the condition assert(index >= 0 && index < 26) because { ,} ,| and ~ will make index >=0

OTHER TIPS

It will give you the letter position in alphabeta consider a as the position 0. Here a is converted to its ASCII code and it is subsctracted from the a-z letter became more easy to handle.

What people tend to forget - a character is a byte with the ascii value in it.

so 'a' is 97 in the ascii table (I think), so all is being done - is currentChar-97.

please follow the http://www.asciitable.com for the actual values.

but always remember - a char is a byte, and you can use this fact a lot!
I am using it for skipping memory addresses in pointers (for example:

void* a = (void*)((char*)arr+2);

This code will return the memory of arr+ two bytes, while ignoring the type that is being held within arr)

Please do notice - |'a'-'A'| = |'A'-'a'| , but one will be negative, while the other one will be positive

int index = currentChar - 'a'; this is to check the character because the user can enter a character that is not between a to z.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top