Question

Does the soundex function in SQLite have a limit on string length? I found that the results of

SELECT soundex('Schneider Thomson'), soundex('Schneider Rene'), soundex('Schneider')

are all the same value, S536. However, the results of

SELECT soundex('Schn Thomson'), soundex('Schn Rene'), soundex('Schn');

are different for each string:

soundex('Schn Thomson') = S535 
soundex('Schn Rene')    = S565
soundex('Schn')         = S500

Can anyone please explain why?


Solution

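The Soundex algorithm is designed to work on single words. (To simplify, it encodes the first letter and the first three following consonants.) So there is no length limit as such: the code is always four characters, and 'Schneider' alone already fills it (S536), so nothing after it is ever examined. 'Schn' yields only S5, which leaves two digits to be taken from the second word, and that is why those three results differ.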
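To make that concrete, here is a rough Python sketch of the encoding. It is written from my understanding of how SQLite's soundex() behaves, not copied from its source, and the name `soundex_sketch` is made up for illustration. It reproduces the values from the question.

```python
# Soundex digit for each consonant group. Letters that are not listed
# (vowels, H, W, Y) and non-letters such as spaces get no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex_sketch(text):
    """Illustrative Soundex, assumed to mirror SQLite's soundex() behaviour."""
    # Skip anything before the first ASCII letter.
    start = 0
    while start < len(text) and not (text[start].isascii() and text[start].isalpha()):
        start += 1
    if start == len(text):
        return "?000"  # no letters at all

    first = text[start].upper()
    result = first
    prev = CODES.get(first, "")
    for ch in text[start + 1:]:
        if len(result) == 4:
            break  # the code is full, so the rest of the input is never examined
        code = CODES.get(ch.upper(), "")
        if code and code != prev:
            result += code
        # An uncoded character (vowel, H, W, Y, space) resets the repeat check.
        prev = code
    return result.ljust(4, "0")  # pad short codes with zeros


for s in ("Schneider Thomson", "Schneider Rene", "Schneider",
          "Schn Thomson", "Schn Rene", "Schn"):
    print(s, "=", soundex_sketch(s))
```

The loop stops as soon as four characters have been produced, which is the whole explanation: with 'Schneider' the second word is never looked at, while with 'Schn' it supplies the last two digits.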

OTHER TIPS

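To add a little to CL's answer: the encoding keeps the first letter and then encodes the following consonants (H, W and Y are skipped, as are vowels) until three digits have been generated. If the word runs out first, the code is padded with zeros. Mississippi illustrates this well. Under the variant that discards vowels before collapsing repeated codes, MISSISSIPPI has a SOUNDEX of M210.

  1. M is the first letter, followed by the first consonant S, which encodes as 2. The remaining S's repeat that code and are ignored.
  2. The next consonant is P, which encodes as 1. It is followed only by a repeated P and an I, so no further digit is produced.
  3. Thus a zero is padded on as the final digit.

Note that implementations which treat a vowel as a separator between repeated codes, SQLite's soundex() among them, encode the second group of S's again and return M221 instead.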

Hopefully that clarifies how SOUNDEX encodes words. For a little more information, this article from Genealogy.com explains how to use SOUNDEX when researching names. The three-digit cutoff also explains why supercell and supercalifragilisticexpialidocious have the same SOUNDEX, S162: both codes are complete after "superc".
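The Mississippi walkthrough above drops vowels before collapsing repeated codes. Here is a small Python sketch of that variant, so the steps can be checked by hand. The name `soundex_drop_vowels` is hypothetical, the function assumes a single non-empty word, and it is not SQLite's code: an implementation that lets a vowel separate repeated codes would return M221 for Mississippi.

```python
from itertools import groupby

# Soundex digit for each consonant group; vowels, H, W and Y have no entry.
DIGITS = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex_drop_vowels(word):
    """Illustrative variant: discard uncoded letters first, then collapse repeats."""
    word = word.upper()
    # Encode every letter that has a digit, dropping everything else outright.
    codes = [DIGITS[ch] for ch in word if ch in DIGITS]
    # Collapse runs of the same digit into a single digit.
    collapsed = [digit for digit, _ in groupby(codes)]
    # The first letter is kept as a letter, so drop its own digit if it leads the list.
    if word[0] in DIGITS and collapsed and collapsed[0] == DIGITS[word[0]]:
        collapsed = collapsed[1:]
    # Keep at most three digits and pad with zeros up to four characters.
    return (word[0] + "".join(collapsed[:3])).ljust(4, "0")


print(soundex_drop_vowels("Mississippi"))  # M210
print(soundex_drop_vowels("supercell"))    # S162
print(soundex_drop_vowels("supercalifragilisticexpialidocious"))  # S162
```

Both long words are cut off after the third digit, so everything past "superc" has no effect on the result.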

Licensed under: CC-BY-SA with attribution