Why does this example use null padding in string comparisons? “Programming Pearls”: Strings of Pearls

StackOverflow https://stackoverflow.com/questions/13245050

Question

In "Programming Pearls": Strings of Pearls, section 15.3 (Generating Text), the author introduces how to generate random text from an input document. In the source code, there are some things that I don't understand.

for (i = 0; i < k; i++)
        word[nword][i] = 0;

The author explains: "After reading the input, we append k null characters (so the comparison function doesn't run off the end)." This explanation confuses me, since the program still works after I comment out these two lines. Why are they necessary?


Solution

Doing that reduces the number of weird cases you have to deal with when doing character-by-character comparisons.

 alphabet
 alpha___

If you stepped through this one letter at a time, and the null padding at the end of alpha wasn't there, you'd try to examine the next element... and run right off the end of the array. The null padding basically ensures that wherever there's a character in one word, there's a corresponding character in the other. And since the null character has a value of 0, the shorter word is always going to be considered 'less than' the longer one!
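The idea can be sketched as follows. This is a minimal illustration, not the book's code: `padded_compare`, `longer`, and `shorter` are hypothetical names, and the fixed 9-byte buffers are an assumption chosen to match the alphabet/alpha example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the book): compares the first `width`
 * bytes of two buffers one position at a time. It never looks for the
 * end of a string, so both buffers must hold at least `width` readable
 * bytes. Null padding after the shorter word is what guarantees that. */
static int padded_compare(const char *a, const char *b, size_t width)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < width; i++) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];
        if (ca != cb)
            return (ca < cb) ? -1 : 1;   /* a 0 pad byte sorts before any letter */
    }
    return 0;                            /* all `width` bytes matched */
}

/* Both buffers are 9 bytes long. The unused tail of `shorter` is zero
 * because C zero-fills the rest of a partially initialized array. */
static const char longer[9]  = "alphabet";
static const char shorter[9] = "alpha";
```

Calling `padded_compare(shorter, longer, 9)` returns a negative value: the first five bytes match, and at index 5 the pad byte 0 is compared against 'b'. Without the padding, that read would fall outside the storage that belongs to the shorter word.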

As to why it seems to work without those lines, there are two related reasons I can think of:

  1. This was written in C. C does not guard its array boundaries; you can read whatever junk data lies beyond the space that was allocated (strictly speaking, undefined behavior), and you'd never hear a thing.
  2. Your input document happens to be such that you never compare two strings where one is a prefix of the other (as alpha is of alphabet).

Other advice

As already explained in another answer, the purpose is to null terminate the string.

But I read the posted link, and that loop doesn't make sense to me. Looking at the comparison function that is used, I see no reason why the whole string must be filled with zeroes in this case. A plain word[nword][0] = 0; without the for loop would have worked just as well. Or preferably:

word[nword][0] = '\0';

Filling the whole string with zeroes adds execution-time overhead.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow