Question

I have written this F# code to count word frequencies in a list and return a tuple to C#. Could you tell me how can I make the code more efficient or shorter?

let rec internal countword2 (tail : string list) wrd ((last : string list), count) =
match tail with
| [] -> last, wrd, count
| h::t -> countword2 t wrd (if h = wrd then last, count+1 else last @ [h], count)

let internal countword1 (str : string list) wrd =
let temp, wrd, count = countword2 str wrd ([], 0) in
temp, wrd, count

let rec public countword (str : string list) =
match str with
| [] -> []
| h::_ ->
  let temp, wrd, count = countword1 str h in
       [(wrd, count)] @ countword temp
Was it helpful?

Solution

If you want to count word frequencies in a string list, your approach seems to be overkill. Seq.groupBy is well-fitted for this purpose:

let public countWords (words: string list) = 
   words |> Seq.groupBy id
         |> Seq.map (fun (word, sq) -> word, Seq.length sq)
         |> Seq.toList

OTHER TIPS

Even pad's version can be made more efficient and concise:

let countWords = Seq.countBy id

Example:

countWords ["a"; "a"; "b"; "c"] //returns: seq [("a", 2); ("b", 1); ("c", 1)]

Your solution iterates over the input list several times, for every new word that it founds. Instead of doing that, you could iterate over the list just once and build a dictionary that holds the number of all occurrences for every word.

To do this in a functional style, you can use F# Map, which is an immutable dictionary:

let countWords words = 
  // Increment the number of occurrences of 'word' in the map 'counts'
  // If it isn't already in the dictionary, add it with count 1
  let increment counts word =
    match Map.tryFind word counts with
    | Some count -> Map.add word (count + 1) counts
    | _ -> Map.add word 1 counts

  // Start with an empty map and call 'increment' 
  // to add all words to the dictionary
  words |> List.fold increment Map.empty

You can also implement the same thing in an imperative style, which is going to be more efficient, but less elegant (and you don't get all benefits of functional style). However, standard mutable Dictionary can be used nicely from F# too (this is going to be similar to C# version, so I won't write it here).

Finally, if you want a simple solution using just standard F# functions, you can use Seq.groupBy as suggested by pad. This would be probably almost as efficient as the Dictionary based version. But then, if you're just learning F# then writing a few recursive functions like countWords yourself is a great way to learn!

To give you some comments about your code - the complexity of your approach is slightly higher, but that should probably be fine. There are however some common isses:

  • In your countword2 function, you have if h = wrd then ... else last @ [h], count. The call last @ [h] is inefficient, because it needs to clone the entire list last. Instead of this, you could just write h::last to add the word to the beginning, because the order does not matter.

  • On the last line, you're using @ again in [(wrd, count)] @ countword temp. This is not necessary. If you're adding single element to the beginning of list, you should use: (wrd,count)::(countword temp).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top