Question

I have a bunch of strings with special escape codes that I want to store unescaped. For example, the interpreter shows

"\\014\"\\000\"\\016smoothing\"\\011mean\"\\022color\"\\011zero@\\016" but I want it to show (when inspected) as "\014\"\000\"\016smoothing\"\011mean\"\022color\"\011zero@\016"

What's the method to unescape them? I imagine that I could make a regex to remove 1 backslash from every consecutive n backslashes, but I don't have a lot of regex experience and it seems there ought to be a "more elegant" way to do it.

For example, when I puts MyString it displays the output I'd like, but I don't know how I might capture that into a variable.

Thanks!

Edited to add context: I have this class that is used to marshal and restore some data, but when I restore some old strings it raises a TypeError. I've determined this is because they weren't (for some inexplicable reason) stored as base64. Instead they appear to have just been escaped, which I don't want, because trying to restore them gives the error TypeError: incompatible marshal file format (can't be read) format version 4.8 required; 92.48 given. This happens because Marshal looks at the first characters of the string to determine the format.

require 'base64'
class MarshaledStuff < ActiveRecord::Base

  validates_presence_of :marshaled_obj

  def contents
    obj = self.marshaled_obj
    return Marshal.restore(Base64.decode64(obj))
  end

  def contents=(newcontents)
    self.marshaled_obj = Base64.encode64(Marshal.dump(newcontents))
  end
end

Edit 2: Changed wording -- I was thinking they were "double-escaped" but it was only single-escaped. Whoops!


Solution

If your strings give you the correct output when you print them, then they are already escaped correctly. The extra backslashes you see are probably there because you are displaying the strings in the interactive interpreter, which shows the inspected form of a value and adds backslashes to make it unambiguous.

> x
=> "\\"
> puts x
\
=> nil
> x.length
=> 1

Note that even though it looks like x contains two backslashes, the length of the string is one. The extra backslash is added by the interpreter and is not really part of the string.
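
One way to check which situation applies is to look at lengths and byte values rather than at the inspected form. This is a minimal sketch; the variable names are made up for illustration.

```ruby
# Two strings that look similar in source code but hold different bytes.
literal = '\014'   # single quotes: a backslash followed by the digits 0, 1, 4
control = "\014"   # double quotes: a single byte with octal value 014

puts literal.length   # 4
puts control.length   # 1
p literal.bytes       # [92, 48, 49, 52]
p control.bytes       # [12]
```

The 92.48 in the error message quoted in the question matches the byte values of a backslash (92) and the character 0 (48), which suggests the stored strings begin with a literal backslash rather than a control byte.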

If you still think there's a problem, please be more specific about how you are displaying the strings that you mentioned in your question.


Edit: In your example the only things that need unescaping are the octal escape codes (\000 through \377). You could try this:

x = x.gsub(/\\[0-3][0-7]{2}/){ |c| c[1,3].to_i(8).chr }
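
To see the idea end to end, here is a hedged sketch that simulates the bad storage and then restores the object. The helper name unescape_octal and the sample hash are hypothetical, and the sketch assumes every stored escape is a backslash followed by exactly three octal digits. A genuine backslash followed by three digits in the original data would be misread.

```ruby
# Hypothetical helper: turn literal "\NNN" octal sequences back into bytes.
def unescape_octal(str)
  # Work on a binary copy so that bytes above 127 can be substituted safely.
  str.b.gsub(/\\([0-3][0-7]{2})/) { $1.to_i(8).chr }
end

original = Marshal.dump({ "smoothing" => "mean" })

# Simulate the bad storage (an assumption about how the data was written):
# every byte becomes a backslash plus three octal digits.
escaped = original.bytes.map { |b| format('\\%03o', b) }.join

restored = Marshal.load(unescape_octal(escaped))
p restored
```

If the real strings mix escaped and unescaped bytes, as in the example from the question, the same helper leaves the unescaped parts alone and only rewrites the octal sequences.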
Licensed under: CC-BY-SA with attribution