How can you add to a hash value instead of having it overwrite with the new value?

StackOverflow https://stackoverflow.com/questions/10202648

  •  01-06-2021

Question

Basically, I have these files (MEDLINE records from NCBI). Each is associated with a journal title, and each has 0, 1 or more GenBank identification numbers (GBIDs). I can associate the number of GBIDs per file with each journal name. My problem is that more than one file may be associated with the same journal, and I don't know how to add the per-file GBID counts into a total number of GBIDs per journal.

My current code: jt stands for journal title, pulled out properly from the file. GBIDs are added to the count as encountered.

Full code:

 #!/usr/local/bin/ruby

 require 'rubygems'
 require 'bio'


Bio::NCBI.default_email = 'kepresto@uvm.edu'

ncbi_search = Bio::NCBI::REST::ESearch.new
ncbi_fetch = Bio::NCBI::REST::EFetch.new


print "\nQuery?\s" 

query_phrase = gets.chomp

puts "\nYou said \"#{query_phrase}\". Searching, please wait..."

pmid_list = ncbi_search.search("pubmed", "#{query_phrase}", 0)

puts "\nYour search returned #{pmid_list.count} results."

if pmid_list.count > 200
puts "\nToo big."
exit
end

gbid_hash = Hash.new
jt_hash = Hash.new(0)


pmid_list.each do |pmid|

ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line|

    if pmid_line =~ /JT.+- (.+)\n/
        jt = $1
        jt_count = 0
        jt_hash[jt] = jt_count

        ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line_2|

            if pmid_line_2 =~ /SI.+- GENBANK\/(.+)\n/
                gbid = $1
                jt_count += 1
                gbid_hash["#{gbid}\n"] = nil
            end 
        end 

        if jt_count > 0
            puts "#{jt} = #{jt_count}"

        end
        jt_hash[jt] += jt_count
    end
end
end


jt_hash.each do |key,value|
# if value > 0
    puts "Journal: #{key} has #{value} entries associated with it. "
# end
end

# gbid_file = File.open("temp_*.txt","r").each do |gbid_count|
#   puts gbid_count
# end

My result:

 Your search returned 192 results.
 Virology journal = 8
 Archives of virology = 9
 Virus research = 1
 Archives of virology = 6
 Virology = 1

Basically, how do I get it to say Archives of virology = 15, but for any journal title? I tried a hash, but the second archives of virology just overwrote the first... is there a way to make two keys add their values in a hash?


Solution

I don't entirely follow what you are asking for here.

However, you are overwriting the value for a given hash key because you are doing this:

jt_count = 0
jt_hash[jt] = jt_count

You already initialized your hash earlier like this:

jt_hash = Hash.new(0)

That is, every key has a default value of 0, so there is no need to initialize jt_hash[jt] to 0.
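As a rough sketch of the difference, here are the per-file counts from the question's output fed through both approaches. The names per_file_counts, overwritten and accumulated are made up for illustration and are not part of the original script.

```ruby
# Per-file counts copied from the output shown in the question.
per_file_counts = [
  ["Virology journal", 8],
  ["Archives of virology", 9],
  ["Virus research", 1],
  ["Archives of virology", 6],
  ["Virology", 1]
]

overwritten = Hash.new(0)  # hypothetical name: mimics the reset in the question
accumulated = Hash.new(0)  # hypothetical name: relies on the default value

per_file_counts.each do |jt, count|
  overwritten[jt] = 0        # resetting the key discards the earlier total
  overwritten[jt] += count
  accumulated[jt] += count   # a missing key reads as 0, so no reset is needed
end

puts overwritten["Archives of virology"]  # 6
puts accumulated["Archives of virology"]  # 15
```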

If you remove this line:

 jt_hash[jt] = jt_count

Then the values for jt_hash[jt] should accumulate on each pass through the loop:

ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line|
  ....
end
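Putting that together, a single-pass version might look like the sketch below. The records are invented MEDLINE-style strings standing in for whatever ncbi_fetch.pubmed returns. The regular expressions are simplified assumptions about the JT and SI field layout. Treat it as an illustration, not a drop-in replacement.

```ruby
# Invented MEDLINE-style records; in the real script each would be fetched per PMID.
records = [
  "PMID- 1\nJT  - Archives of virology\nSI  - GENBANK/AA000001\nSI  - GENBANK/AA000002\n",
  "PMID- 2\nJT  - Virology\nSI  - GENBANK/BB000001\n",
  "PMID- 3\nJT  - Archives of virology\nSI  - GENBANK/CC000001\n"
]

jt_hash = Hash.new(0)  # missing journals read as 0
gbids = []             # every GenBank ID seen, in order

records.each do |record|
  # Assumed field layout: "JT  - <journal title>" on its own line.
  jt = record[/^JT\s+- (.+)$/, 1]
  next unless jt

  # Assumed field layout: "SI  - GENBANK/<id>", possibly repeated.
  ids = record.scan(/^SI\s+- GENBANK\/(.+)$/).flatten
  gbids.concat(ids)

  # Add this record's count to the journal's running total.
  jt_hash[jt] += ids.size
end

jt_hash.each { |jt, total| puts "#{jt} = #{total}" }
```

With these sample records it prints "Archives of virology = 3" and "Virology = 1". Each record is scanned once, so the nested second fetch from the question is not needed.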

Other tips

Change these two lines:

   jt_count = 0
   jt_hash[jt] = jt_count

to this:

   if jt_hash[jt] == nil
      jt_count = 0
      jt_hash[jt] = jt_count
   else
      jt_count = jt_hash[jt]
   end

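This just checks the hash for a nil value at that key and, if it is nil, stores an integer there. If it is not nil, it returns the previous integer so you can add to it. Note that the nil check only fires for a hash created with a plain Hash.new: the question's Hash.new(0) returns 0 for a missing key, never nil. Also, because jt_count now starts from the stored total, the later jt_hash[jt] += jt_count should become jt_hash[jt] = jt_count, otherwise the earlier total is counted twice.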
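A small sketch of how the two kinds of hash behave on a missing key, and of the initialize-then-add idiom for a plain hash. The variable names and the counts 9 and 6 (taken from the question's output) are only for illustration.

```ruby
with_default = Hash.new(0)  # missing keys read as 0
plain = Hash.new            # missing keys read as nil

puts with_default["Virology"].inspect  # 0
puts plain["Virology"].inspect         # nil

# With a plain hash, initialize the key the first time it is seen, then add.
[9, 6].each do |count|
  plain["Archives of virology"] ||= 0
  plain["Archives of virology"] += count
end

puts plain["Archives of virology"]  # 15
```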

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow