Question

This is my code for calculating word frequencies:

  word_arr= ["I", "received", "this", "in", "email", "and", "found", "it", "a", "good", "read", "to", "share......", "Yes,", "Dr", "M.", "Bakri", "Musa", "seems", "to", "know", "what", "is", "happening", "in", "Malaysia.", "Some", "of", "you", "may", "know.", "He", "is", "a", "Malay",  "extra horny", "horny nor", "nor their", "their babes", "babes are", "are extra", "extra SEXY..", "SEXY.. .", ". .", ". .It's", ".It's because", "because their", "their CONDOMS", "CONDOMS are", "are Made", "Made In", "In China........;)", "China........;) &&"]

  arr_stop_kwd = ["a", "and"]

  frequencies = Hash.new(0)
  word_arr.each do |word|
    key = word.downcase  # downcase once instead of twice per word
    # Skip stop words and any token containing "&&"
    if !arr_stop_kwd.include?(key) && !word.include?('&&')
      frequencies[key] += 1
    end
  end

With 100k words this takes 9.03 seconds, which is too much time. Is there a faster way to calculate this?

Thanks in advance.


Solution

Take a look at the Facets gem.

You can do something like this using its frequency method:

require 'facets'
frequencies = (word_arr-arr_stop_kwd).frequency

Note that the stop words can be subtracted from word_arr with Array#-. Refer to the Array documentation.
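
One caveat: word_arr - arr_stop_kwd compares strings exactly, so it will not remove "And" or "A", and it does not drop tokens containing "&&" the way the loop in the question does. If you need the same filtering without pulling in a gem, a plain-Ruby version might look like the sketch below. The word_frequencies helper and the sample array are made up for illustration; they are not part of Facets or of the original code, and I have not benchmarked this against the 100k case.

```ruby
require 'set'

# Hypothetical helper (not from the question or from Facets).
# Counts case-insensitive word frequencies, skipping stop words
# and any token that contains "&&".
def word_frequencies(words, stop_words)
  # A Set gives constant-time membership checks, which matters
  # once the stop-word list grows beyond a handful of entries.
  stop_set = Set.new(stop_words.map(&:downcase))

  words.each_with_object(Hash.new(0)) do |word, counts|
    next if word.include?('&&')      # same filter as match('&&') in the question
    key = word.downcase              # downcase once per word
    next if stop_set.include?(key)
    counts[key] += 1
  end
end

# Small made-up sample to show the call shape.
sample = ["I", "a", "And", "good", "Good", "China........;) &&"]
p word_frequencies(sample, ["a", "and"])
```

On Ruby 2.7 or later, Enumerable#tally can do the counting step after you have filtered and downcased the words, if you prefer a one-liner over the explicit loop.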

License: CC-BY-SA with attribution
Not affiliated with StackOverflow