Question

I have a Rails app with a database table that has two columns (name and description). I would like to run a script that finds all unique words in either column and ranks them by how often they occur. This is for the purpose of generating an index.

I understand that I will need to exclude certain words (such as "the" and "a") and that the count may be imperfect because of pluralization. I am happy to handle this manually in post-processing, and am just looking for a basic script that will give me all of the words and their frequencies.

Does anyone have any code that would do this or any guidance as to how to go about it?


Solution

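def unique_word_count
  counts = Hash.new(0)  # missing words start at a count of 0
  Thing.all.each do |thing|
    # to_s guards against nil columns; split with no argument splits on whitespace
    name_array = thing.name.to_s.split
    description_array = thing.description.to_s.split
    # Count inside the block, where the two arrays are in scope
    (name_array + description_array).each do |word|
      counts[word] += 1
    end
  end
  # Array of [word, count] pairs, most frequent first
  counts.sort_by { |_word, count| -count }
end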

I haven't run this code, but something like this is probably what you are looking for.
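
The question also mentions excluding words such as "the" and "a". Here is a rough, standalone sketch of how that could be combined with the counting. The method name ranked_word_counts and the STOP_WORDS list are made up for illustration, and the regex assumes plain English text, so treat it as a starting point rather than a finished solution.

```ruby
# Hypothetical stop-word list; extend it to suit your data.
STOP_WORDS = %w[a an and of the to].freeze

# Takes any list of strings and returns [word, count] pairs,
# most frequent first.
def ranked_word_counts(texts, stop_words = STOP_WORDS)
  counts = Hash.new(0)
  texts.each do |text|
    # to_s guards against nil; scan keeps runs of letters and apostrophes,
    # which drops surrounding punctuation such as commas and periods.
    text.to_s.downcase.scan(/[a-z']+/).each do |word|
      counts[word] += 1 unless stop_words.include?(word)
    end
  end
  # Sort by descending count, then alphabetically to break ties.
  counts.sort_by { |word, count| [-count, word] }
end

# In the Rails app the input could come from the model, for example:
#   ranked_word_counts(Thing.pluck(:name, :description).flatten)
sample = ["The red apple", "A red, ripe apple and the pear."]
ranked_word_counts(sample).each { |word, count| puts "#{word}: #{count}" }
```

Downcasing means "Apple" and "apple" are counted together. Plurals are still counted separately, so that part is left for manual clean-up as the question suggests.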

Licensed under: CC-BY-SA with attribution