Question

I am working on a VERY simple script to clean up a few hundred thousand small XML files. My current method is to iterate through the directory and (for each file) read the file, use String::gsub! to make all my changes (not sure if this is best) and then I write the new contents to the file. My code looks something like the following:

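# Skip '.', '..' and any subdirectories, which File.read cannot open
Dir.entries('.').select { |entry| File.file?(entry) }.each do |file_name|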

  f = File.read( file_name )

  f.gsub!( /softwareiconneedsshine>(.|\s)*<\/softwareiconneedsshine>/i, '' )
  f.gsub!( /<rating>(.|\s)*<\/rating>/, '' )

  f.gsub!( /softwareIdentifiers>/, 'version_history>' )

  # some more regexes

  File.open( file_name, 'w' ) { |w| w.write(f) }

end

This all looks fine and dandy, but for some reason (that I, for the life of me, cannot figure out) the program hangs, seemingly arbitrarily, at the gsub! calls that are similar to the first two shown. It hangs at random, but only at those points. Sometimes it works, other times it just hangs. I really can't figure out why it would work sometimes but not others.

Any help is greatly appreciated!!

Solution

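Without knowing anything else about your environment, or the type of files you're reading, I would suggest making your Kleene stars non-greedy: change (.|\s)* to (.|\s)*?. A greedy star first consumes everything up to the end of the file and then backtracks to find the last closing tag, whereas the non-greedy form stops at the first closing tag it reaches.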
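
To make the difference concrete, here is a minimal sketch on a made-up XML snippet (the sample document is hypothetical; only the tag names come from the question). It also swaps (.|\s)*? for .*? with the m flag, which is my assumption rather than part of the suggestion above: with (.|\s), a space or tab can be matched by either branch, and that kind of ambiguity is a common cause of heavy backtracking, so it may be what makes some files appear to hang.

```ruby
# Hypothetical sample input, not taken from the asker's files.
xml = <<~XML
  <app>
    <rating>4.5</rating>
    <name>Example</name>
    <rating>
      3
    </rating>
    <softwareIdentifiers>1.0</softwareIdentifiers>
  </app>
XML

# Greedy: runs from the first <rating> to the LAST </rating>,
# so the <name> element in between is removed as well.
greedy = xml.gsub(/<rating>(.|\s)*<\/rating>/, '')

# Non-greedy: each <rating> element is matched and removed on its own.
# The m flag lets "." match newlines, so "(.|\s)" is no longer needed
# and every character can be consumed in only one way.
lazy = xml.gsub(/<rating>.*?<\/rating>/m, '')

# The plain rename from the question needs no change.
cleaned = lazy.gsub(/softwareIdentifiers>/, 'version_history>')

puts cleaned
```

If a closing tag is missing altogether, the non-greedy (.|\s)*? form can still backtrack heavily, which is why the sketch drops the alternation. For anything beyond simple clean-up, an XML parser is likely to be more robust than regular expressions.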

Licensed under: CC-BY-SA with attribution