Counting word frequency in a database in Rails
19-06-2021
Question
I have a Rails app with a database table that has two columns (name and description). I would like to run a script that finds every unique word in either column and ranks the words by how often they occur. The purpose is to generate an index.
I understand that I will need to exclude certain words (such as "the" and "a") and that the count may be imperfect because of pluralization. I am happy to handle this manually in post-processing, and am just looking for a basic script that will give me all of the words and their frequencies.
Does anyone have any code that would do this or any guidance as to how to go about it?
Solution
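def unique_word_count
  # Missing keys default to 0, so every word can be incremented directly.
  counts = Hash.new(0)
  # find_each loads records in batches instead of all at once.
  Thing.find_each do |thing|
    # to_s guards against nil columns; split with no argument splits on whitespace.
    words = thing.name.to_s.split + thing.description.to_s.split
    # Count inside the loop so every record is tallied, not just the last one.
    words.each do |word|
      counts[word] += 1
    end
  end
  # Return [word, count] pairs, most frequent first.
  counts.sort_by { |_word, count| -count }
end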
I haven't run the code, but something like this is probably what you are looking for.
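The question also mentions excluding words such as "the" and "a". Here is a minimal sketch of how that filtering could be added to the counting step, written in plain Ruby so it can be tried outside Rails. The method name word_frequencies, the STOP_WORDS list and the ASCII-only word pattern are all assumptions made for illustration, not part of the original answer.

```ruby
# Hypothetical stop-word list (an assumption); extend it to suit your data.
STOP_WORDS = %w[the a an and or of to in].freeze

# Hypothetical helper: counts words across an array of strings.
def word_frequencies(texts, stop_words: STOP_WORDS)
  counts = Hash.new(0)
  texts.each do |text|
    # to_s tolerates nil entries. Lowercase so "The" and "the" match, then
    # extract runs of ASCII letters, digits and apostrophes (assumes English text).
    text.to_s.downcase.scan(/[a-z0-9']+/).each do |word|
      next if stop_words.include?(word)
      counts[word] += 1
    end
  end
  # Highest count first; ties are ordered alphabetically.
  counts.sort_by { |word, count| [-count, word] }
end
```

In the Rails app it could probably be fed with something like word_frequencies(Thing.pluck(:name, :description).flatten), which reads only the two columns instead of instantiating every record. Pluralization is still not handled, so "fox" and "foxes" are counted separately.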