Question

I am trying to learn F#. And I need som help with a simple soundex expression. I am using the following ruleset for Simplified (also called American) soundex:

1.) Assign characters to classes
2.) Remove duplicate values here, e.g. 222 becomes 2  
3.) Replace first encoded char with first char  
4.) Remove nulls
5.) Truncate ot pad to totally 4 characters

Currently I am stuck on rule no. 2. I was thinking of using a recursive expression. As I am currently a n00b on F# I will try to ask you for an elegant solution to my problem.Maybe my entire approach to translate text to soundex is off target?

Any suggestions will be greatly appreciated :)

Here is my code:

let Simplified (name:string) =
let ca = name.ToLower().ToCharArray()
new string(
    Array.map(
        fun e ->
        match e with                                                          
            | 'a' | 'e' | 'i' | 'o' | 'u' | 'y' | 'w' | 'h' -> '0'
            | 'b' | 'f' | 'p' | 'v'                         -> '1'
            | 'c' | 's' | 'k' | 'g' | 'j' | 'q' | 'x' | 'z' -> '2'
            | 'd' | 't'                                     -> '3'
            | 'l'                                           -> '4'
            | 'm' | 'n'                                     -> '5'
            | 'r'                                           -> '6'
            |  _                                            -> ' '
        )  ca
  //|> fun s -> TODO: Remove duplicates here
    |> fun s -> Array.set s 0 (ca.[0]) 
                Array.choose(fun e -> if e <> '0' then Some(e) else None) s   
)  
|> fun s -> (
            match s.Length with                                               
                | x when x < 3 -> s.PadRight(4, '0')
                | _ -> s.Substring(0, 4)
            ).ToUpper()
Was it helpful?

Solution

Seq.fold is your friend.

let soundex (text : string) = 
    let choose = 
        function 
        | 'b' | 'f' | 'p' | 'v' -> Some "1" 
        | 'c' | 'g' | 'j' | 'k' | 'q' | 's' | 'x' | 'z' -> Some "2" 
        | 'd' | 't' -> Some "3" 
        | 'l' -> Some"4" 
        | 'm' | 'n'  -> Some "5"
        | 'r' -> Some "6"
        | _ -> None 

    let fold state value = 
        match state with
        | i :: _ when i = value -> state
        | _ -> value :: state

    let t = text.Substring(1).ToLower() |> Seq.choose choose |> Seq.fold fold [] |> Seq.toList |> List.rev |> String.concat ""

    text.Substring(0,1) + t.PadRight(3, '0').Substring(0, 3)

This is based on the wikipedia article for soundex.

OTHER TIPS

If you want to remove consequent duplicates (the second option in the zeuxcg's solution), then you can also implement this directly as a recursive function (using accumulator parameter). This nicely demonstrates pattern matching, so it is good thing to try while learning F#:

let removeConsequentDuplicates list = 
  let rec loop acc list =
    match list with 
    | x1::x2::xs when x1 = x2 -> loop acc (x2::xs)
    | x::xs -> loop (x::acc) xs
    | _ -> acc |> List.rev
  loop [] list

This version works with lists, but since you're working with arrays, you'll probably need an imperative version. You can use sequence expressions like this:

let removeConsequentDuplicates (arr:_[]) = 
  let rec loop last i = seq {
    if i < arr.Length - 1 && last = arr.[i] then 
      yield! loop last (i+1)
    elif i < arr.Length - 1 then
      yield arr.[i]
      yield! loop (arr.[i]) (i + 1) }
  [| if arr.Length > 0 then
       yield arr.[0]
       yield! loop arr.[0] 0 |]

As a side-note, I find your syntax a bit unreadable. I don't think it is a good idea to write ... |> fun s -> ..., because it is just an obscured version of let s = ... in .... I would recommend writing something like (I'm not sure I fully understand your code, but you get the idea...):

let Simplified (name:string) =
  let ca = name.ToLower().ToCharArray()
  let s = 
    ca |> Array.map (function
            | '0' ... )
       |> removeConsequentDuplicates
  Array.set s 0 (ca.[0])
  let s = s |> Array.choose(fun e -> if e <> '0' then Some(e) else None)
  let s = (new String(s)).ToUpper()
  match s.Length with                                               
  | x when x < 3 -> s.PadRight(4, '0')
  | _ -> s.Substring(0, 4)

Remove consecutive duplicates using arrays using loops rather than recursion, most simply in a sequence expression like this:

let removeDuplicates (xs: _ []) =
  [|if xs.Length > 0 then yield xs.[0]
    for i=1 to xs.Length-1 do
      if xs.[i] <> xs.[i-1] then
        yield xs.[i]|]

If you want to remove all duplicates from the array (leaving unique elements), the following will do:

arr |> Seq.distinct |> Seq.toArray

If you want to remove consecutive duplicates, then the solution is harder. This is the simplest one I can come up with:

let unique list =
    list
    |> List.fold (fun acc e ->
        match acc with
        | x::xs when x = e -> acc
        | _ -> e::acc) []
    |> List.rev

You can do it with arrays either via Array.toList and Array.ofList or using Array.fold and changing match expression and list construction; the code is less readable so I'm posting the List version.

Alternative solutions involve Seq.pairwise, i.e.:

let unique arr =
    if Array.isEmpty arr then
        arr
    else
        Array.append [|arr.[0]|] (
            arr
            |> Seq.pairwise
            |> Seq.toArray
            |> Array.choose (fun (p, n) -> if p = n then None else Some n))
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top