문제

This is my code for calculate word frequency

  word_arr= ["I", "received", "this", "in", "email", "and", "found", "it", "a", "good", "read", "to", "share......", "Yes,", "Dr", "M.", "Bakri", "Musa", "seems", "to", "know", "what", "is", "happening", "in", "Malaysia.", "Some", "of", "you", "may", "know.", "He", "is", "a", "Malay",  "extra horny", "horny nor", "nor their", "their babes", "babes are", "are extra", "extra SEXY..", "SEXY.. .", ". .", ". .It's", ".It's because", "because their", "their CONDOMS", "CONDOMS are", "are Made", "Made In", "In China........;)", "China........;) &&"]

arr_stop_kwd=["a","and"] 

 frequencies = Hash.new(0)
   word_arr.each { |word|
      if !arr_stop_kwd.include?(word.downcase) && !word.match('&&')
        frequencies["#{word.downcase}"] += 1
      end
   }

when i have 100k data it will take 9.03 seconds,that,s to much time can i calculate any another way

Thx in advance

도움이 되었습니까?

해결책

Take a look at Facets gem

You can do something like this using the frequency method

require 'facets'
frequencies = (word_arr-arr_stop_kwd).frequency

Note that stop word can be subtracted from the word_arr. Refer to Array Documentation.

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top