Question

I am new to Ruby and need help. Here is a sample of my .txt file:

Modified : /Analitics/Documents/HTZ/BR-5545 Credit/Example BR-5545.docx
Modified : /Analitics/Documents/HTZ/BR-5545 Credit/HTZ BR-5545 Example.docx

I need to find the digit codes in these rows, each only once (a unique set with no repeats). With a regexp I can find them:

line=~/(BR-\d*)/
my=line.scan(/(BR-\d*)/)

Output:

`[["BR-5545"], ["BR-5545"]]`

But I need each match only once:

`[["BR-5545"]]`

Please help me change my regexp to do this.


Solution 3

Given an input.txt file like this:

Modified : /Analitics/Documents/HTZ/BR-5545 Credit/Example BR-5545.docx
Modified : /Analitics/Documents/HTZ/BR-5545 Credit/HTZ BR-5545 Example.docx

You could get what you want with this:

File.open('input.txt').inject([]) do |array, line|
  array << line.scan(/(BR-\d*)/)
end.flatten.uniq

Basically:

  • we open the file
  • we start injecting the result of the iteration into the array variable, which is initialized to []
  • we scan each line for the desired regexp
  • after collecting all the results, we flatten it so that we have a one-dimensional array
  • then we call uniq on it to remove duplicates
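
The same steps can also be sketched without the intermediate nesting. This is only one possible variant, not the answer's own code: the method name unique_ticket_ids is hypothetical, and the temporary file stands in for input.txt. It assumes the regexp has no capture group, so scan returns plain strings and no flatten is needed.

```ruby
require 'tempfile'

# Hypothetical helper: returns each distinct BR-<digits> id found in the file.
def unique_ticket_ids(path)
  File.foreach(path)                             # iterates line by line and closes the file when done
      .flat_map { |line| line.scan(/BR-\d+/) }  # no capture group, so scan yields plain strings
      .uniq                                      # drop repeats, keep first-seen order
end

sample = <<~TEXT
  Modified : /Analitics/Documents/HTZ/BR-5545 Credit/Example BR-5545.docx
  Modified : /Analitics/Documents/HTZ/BR-5545 Credit/HTZ BR-5545 Example.docx
TEXT

# A temporary file stands in for input.txt so the sketch runs on its own.
Tempfile.create('input') do |f|
  f.write(sample)
  f.flush
  p unique_ticket_ids(f.path)  # => ["BR-5545"]
end
```

Using \d+ instead of \d* also avoids matching a bare "BR-" with no digits after it.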

OTHER TIPS

Just add uniq after scanning:

data = "Modified : /Analitics/Documents/HTZ/BR-5545 Credit/Example BR-5545.docx"
data.scan(/(BR-\d*)/).uniq # => [["BR-5545"]]
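
One caveat when chaining here: the bang form uniq! returns nil when the array contains no duplicates, so the non-bang uniq is the safer choice at the end of a chain. A minimal sketch, with made-up sample strings:

```ruby
# Sample strings are illustrative, not taken from a real file.
with_dupes    = "Example BR-5545 and BR-5545 again".scan(/BR-\d+/)
without_dupes = "Only BR-5545 here".scan(/BR-\d+/)

p with_dupes.uniq            # => ["BR-5545"]
p without_dupes.uniq         # => ["BR-5545"]

# uniq! mutates the receiver and returns nil if nothing was removed.
p with_dupes.dup.uniq!       # => ["BR-5545"]
p without_dupes.dup.uniq!    # => nil
```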

Use a Set instead of an array:

require 'set'
lines=[
    'Modified : /Analitics/Documents/HTZ/BR-5545 Credit/Example BR-5545.docx',
    'Modified : /Analitics/Documents/HTZ/BR-5545 Credit/HTZ BR-5545 Example.docx'
]

lines.inject(Set.new) {|s, l| s.merge(l.scan(/BR-\d+/)); s}
# => #<Set: {"BR-5545"}>

# or as an array
lines.inject(Set.new) {|s, l| s.merge(l.scan(/BR-\d+/)); s}.to_a
# => ["BR-5545"]
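
As a possible variation on the Set approach, each_with_object removes the need to return the accumulator from the block by hand. This is a sketch under the same sample data; the variable name ids is illustrative.

```ruby
require 'set'

lines = [
  'Modified : /Analitics/Documents/HTZ/BR-5545 Credit/Example BR-5545.docx',
  'Modified : /Analitics/Documents/HTZ/BR-5545 Credit/HTZ BR-5545 Example.docx'
]

# each_with_object passes the same Set to every iteration and returns it at the end.
ids = lines.each_with_object(Set.new) do |line, set|
  set.merge(line.scan(/BR-\d+/))  # merge adds every match; duplicates are ignored
end

p ids.to_a  # => ["BR-5545"]
```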
Licensed under: CC-BY-SA with attribution