Question

I am learning Ruby and am trying to extract related codes from a hash, but I do not understand how to identify them. The codes come from the 2014 MeSH Tree Codes file on the NLM website. Each code is associated with a MeSH term and appears in the file as follows (using the term "Motor Activity" as an example):

Motor Activity;F01.145.632

I have this information in a hash, with the code as the key and the term as the value. I need to extract the related terms using their codes: the parent has the same code minus the last three digits, the siblings differ only in the last three digits, and the children have the exact same code plus any number of additional digits in the form .XXX.XXX. An example of these codes is as follows:

Motor Activity [F01.145.632]
Behavior and Behavior Mechanisms [F01]              
Behavior [F01.145]
Information Seeking Behavior [F01.145.535]          
Inhibition (Psychology) [F01.145.544]           
Freezing Reaction, Cataleptic [F01.145.632.555]     
Immobility Response, Tonic [F01.145.632.680]

So far, I have opened the file and saved the codes as the keys and the terms as the values. The script is as follows:

mesh = File.read('mtrees2014.bin')
descriptor_code_hash = {}   # tree code => MeSH term
mesh.each_line do |line|
  # chomp strips both "\n" and "\r\n" line endings
  mesh_descriptor, tree_code = line.chomp.split(';')
  descriptor_code_hash[tree_code] = mesh_descriptor
end

I need to understand how to extract the first term (motor activity:F01.145.632), then the siblings (F01.145.632 with last three digits different), children (F01.145.632 with any number of additional digits .XXX.XXX), and parents (F01.145.632 less last three digits) from the hash. Can this be done with regular expressions? Or, some other strategy? I will then be saving these codes and terms into another hash. Thank you for taking the time to read this! Any suggestions would be greatly appreciated!


Solution

motor_code = 'F01.145.632'

parents = descriptor_code_hash.select do |k, v|
  motor_code[/^#{k}/] && motor_code != k 
end.map { |k, v| v }
# => ["Behavior and Behavior Mechanisms", "Behavior"] 

siblings = descriptor_code_hash.select do |k, v|
  # the trailing $ keeps longer codes (children of siblings or of motor_code) from matching
  k =~ /^#{motor_code.split('.')[0..-2].join('\.')}\.\d{3}$/ && k != motor_code
end.map { |k, v| v }
# => ["Information Seeking Behavior", "Inhibition (Psychology)"]

children = descriptor_code_hash.select do |k, v| 
  k =~ /^#{motor_code}\.[\d\.]*/ 
end.map { |k, v| v }
# => ["Freezing Reaction, Cataleptic", "Immobility Response, Tonic"] 

parents are found by looking for all keys which are prefixes of motor_code (this returns every ancestor, not just the immediate parent).
siblings are found by looking for all keys which consist of the parent key of motor_code (motor_code with its last three digits removed) followed by exactly three digits.
children are found by looking for all keys which are prefixed by motor_code.
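
Since the question also asks about strategies other than regular expressions, the same three lookups can be sketched with plain string comparisons. This is only a sketch under assumptions: `related_terms` is a hypothetical helper name, the sample `tree` hash just repeats the seven codes from the question, and every key is assumed to be a dot-separated tree code. Hash#select returns a hash, so the codes stay attached to their terms, which may help when saving the results into another hash.

```ruby
# Hypothetical helper (not from the original answer): collects the terms
# related to `code` from a hash of tree code => MeSH term.
def related_terms(tree, code)
  parent_segments = code.split('.')[0...-1]   # ["F01", "145"] for "F01.145.632"

  {
    term: tree[code],
    # Ancestors: keys that are a proper prefix of `code`, ending at a dot.
    parents: tree.select { |k, _| code.start_with?("#{k}.") },
    # Siblings: same leading segments, same depth, but a different code.
    siblings: tree.select { |k, _| k != code && k.split('.')[0...-1] == parent_segments },
    # Descendants: keys that extend `code` by one or more segments.
    children: tree.select { |k, _| k.start_with?("#{code}.") }
  }
end

# Sample data copied from the question.
tree = {
  'F01.145.632'     => 'Motor Activity',
  'F01'             => 'Behavior and Behavior Mechanisms',
  'F01.145'         => 'Behavior',
  'F01.145.535'     => 'Information Seeking Behavior',
  'F01.145.544'     => 'Inhibition (Psychology)',
  'F01.145.632.555' => 'Freezing Reaction, Cataleptic',
  'F01.145.632.680' => 'Immobility Response, Tonic'
}

result = related_terms(tree, 'F01.145.632')
result[:parents].values   # => ["Behavior and Behavior Mechanisms", "Behavior"]
result[:siblings].keys    # => ["F01.145.535", "F01.145.544"]
result[:children].keys    # => ["F01.145.632.555", "F01.145.632.680"]
```

Comparing whole segments avoids having to escape the dots in a code before interpolating it into a pattern, at the cost of splitting each key once per lookup.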

Licensed under: CC-BY-SA with attribution