F# - Remove subsequent duplicates from an array of characters (unordered)
Question
I am trying to learn F#. And I need som help with a simple soundex expression. I am using the following ruleset for Simplified (also called American) soundex:
1.) Assign characters to classes
2.) Remove duplicate values here, e.g. 222 becomes 2
3.) Replace first encoded char with first char
4.) Remove nulls
5.) Truncate ot pad to totally 4 characters
Currently I am stuck on rule no. 2. I was thinking of using a recursive expression. As I am currently a n00b on F# I will try to ask you for an elegant solution to my problem.Maybe my entire approach to translate text to soundex is off target?
Any suggestions will be greatly appreciated :)
Here is my code:
let Simplified (name:string) =
let ca = name.ToLower().ToCharArray()
new string(
Array.map(
fun e ->
match e with
| 'a' | 'e' | 'i' | 'o' | 'u' | 'y' | 'w' | 'h' -> '0'
| 'b' | 'f' | 'p' | 'v' -> '1'
| 'c' | 's' | 'k' | 'g' | 'j' | 'q' | 'x' | 'z' -> '2'
| 'd' | 't' -> '3'
| 'l' -> '4'
| 'm' | 'n' -> '5'
| 'r' -> '6'
| _ -> ' '
) ca
//|> fun s -> TODO: Remove duplicates here
|> fun s -> Array.set s 0 (ca.[0])
Array.choose(fun e -> if e <> '0' then Some(e) else None) s
)
|> fun s -> (
match s.Length with
| x when x < 3 -> s.PadRight(4, '0')
| _ -> s.Substring(0, 4)
).ToUpper()
Solution
Seq.fold is your friend.
let soundex (text : string) =
let choose =
function
| 'b' | 'f' | 'p' | 'v' -> Some "1"
| 'c' | 'g' | 'j' | 'k' | 'q' | 's' | 'x' | 'z' -> Some "2"
| 'd' | 't' -> Some "3"
| 'l' -> Some"4"
| 'm' | 'n' -> Some "5"
| 'r' -> Some "6"
| _ -> None
let fold state value =
match state with
| i :: _ when i = value -> state
| _ -> value :: state
let t = text.Substring(1).ToLower() |> Seq.choose choose |> Seq.fold fold [] |> Seq.toList |> List.rev |> String.concat ""
text.Substring(0,1) + t.PadRight(3, '0').Substring(0, 3)
This is based on the wikipedia article for soundex.
OTHER TIPS
If you want to remove consequent duplicates (the second option in the zeuxcg's solution), then you can also implement this directly as a recursive function (using accumulator parameter). This nicely demonstrates pattern matching, so it is good thing to try while learning F#:
let removeConsequentDuplicates list =
let rec loop acc list =
match list with
| x1::x2::xs when x1 = x2 -> loop acc (x2::xs)
| x::xs -> loop (x::acc) xs
| _ -> acc |> List.rev
loop [] list
This version works with lists, but since you're working with arrays, you'll probably need an imperative version. You can use sequence expressions like this:
let removeConsequentDuplicates (arr:_[]) =
let rec loop last i = seq {
if i < arr.Length - 1 && last = arr.[i] then
yield! loop last (i+1)
elif i < arr.Length - 1 then
yield arr.[i]
yield! loop (arr.[i]) (i + 1) }
[| if arr.Length > 0 then
yield arr.[0]
yield! loop arr.[0] 0 |]
As a side-note, I find your syntax a bit unreadable. I don't think it is a good idea to write ... |> fun s -> ...
, because it is just an obscured version of let s = ... in ...
. I would recommend writing something like (I'm not sure I fully understand your code, but you get the idea...):
let Simplified (name:string) =
let ca = name.ToLower().ToCharArray()
let s =
ca |> Array.map (function
| '0' ... )
|> removeConsequentDuplicates
Array.set s 0 (ca.[0])
let s = s |> Array.choose(fun e -> if e <> '0' then Some(e) else None)
let s = (new String(s)).ToUpper()
match s.Length with
| x when x < 3 -> s.PadRight(4, '0')
| _ -> s.Substring(0, 4)
Remove consecutive duplicates using arrays using loops rather than recursion, most simply in a sequence expression like this:
let removeDuplicates (xs: _ []) =
[|if xs.Length > 0 then yield xs.[0]
for i=1 to xs.Length-1 do
if xs.[i] <> xs.[i-1] then
yield xs.[i]|]
If you want to remove all duplicates from the array (leaving unique elements), the following will do:
arr |> Seq.distinct |> Seq.toArray
If you want to remove consecutive duplicates, then the solution is harder. This is the simplest one I can come up with:
let unique list =
list
|> List.fold (fun acc e ->
match acc with
| x::xs when x = e -> acc
| _ -> e::acc) []
|> List.rev
You can do it with arrays either via Array.toList
and Array.ofList
or using Array.fold
and changing match expression and list construction; the code is less readable so I'm posting the List version.
Alternative solutions involve Seq.pairwise
, i.e.:
let unique arr =
if Array.isEmpty arr then
arr
else
Array.append [|arr.[0]|] (
arr
|> Seq.pairwise
|> Seq.toArray
|> Array.choose (fun (p, n) -> if p = n then None else Some n))