Question

Fellow Regexers,

I have a flat file full of expressions like:

SELECT * FROM CONVENIENT_ONE_LINE_QUERY
"SELECT * FROM THIS_QUERY
WHERE IS_SPREAD_OVER == 123
ORDER BY MULTIPLE_LINES
HAVING AND_IS_BETWEEN_QUOTES"
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER

I want to eliminate the CRLFs between the quotes, and the quotes themselves, so that all my queries are convenient one-liners like this:

SELECT * FROM CONVENIENT_ONE_LINE_QUERY
SELECT * FROM THIS_QUERY WHERE IS_SPREAD_OVER == 123 ORDER BY MULTIPLE_LINES HAVING AND_IS_BETWEEN_QUOTES
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER

Please post the RegEx flavor used in the solution. I'm using TextCrawler, which claims to be ECMA-262 (same as VBScript/JavaScript), and the closest I came to a solution is something like:

(\r\n".*)(.*)\r\n(.*"\r\n)

Forgive my n00biness. Best regards, Lynx Kepler


Solution

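In JavaScript (ECMAScript flavor), you could first replace every CRLF with a space if the next " after it is at the end of a line. This assumes quotes only ever appear at the start and end of a quoted query: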

result = subject.replace(/\r\n(?=[^"]*"$)/mg, " ");

Explanation:

\r\n    # Match a CRLF
(?=     # if and only if
 [^"]*  # it is followed by any number of non-quote characters
 "      # and a quote
 $      # at the end of a line
)       # End of lookahead.

This transforms your example into:

SELECT * FROM CONVENIENT_ONE_LINE_QUERY
"SELECT * FROM THIS_QUERY WHERE IS_SPREAD_OVER == 123 ORDER BY MULTIPLE_LINES HAVING AND_IS_BETWEEN_QUOTES"
SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER


Then, in a second step, remove the quotes:

result = subject.replace(/^"|"$/mg, "");
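Put together, the two steps above can be sketched as one small function and run against the example from the question. The function name `flattenQuotedQueries` and the test data layout are only illustrative, and the sketch assumes CRLF line endings and no quotes inside the queries themselves:

```javascript
// Hypothetical helper: applies the two replacements from the answer in order.
function flattenQuotedQueries(subject) {
  // Step 1: turn a CRLF into a space when the next quote after it ends a line,
  // i.e. when the CRLF sits inside a quoted multi-line query.
  const joined = subject.replace(/\r\n(?=[^"]*"$)/mg, " ");
  // Step 2: drop a quote that starts or ends a line.
  return joined.replace(/^"|"$/mg, "");
}

// The sample from the question, joined with CRLF as in the original file.
const input = [
  'SELECT * FROM CONVENIENT_ONE_LINE_QUERY',
  '"SELECT * FROM THIS_QUERY',
  'WHERE IS_SPREAD_OVER == 123',
  'ORDER BY MULTIPLE_LINES',
  'HAVING AND_IS_BETWEEN_QUOTES"',
  'SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER',
].join("\r\n");

const output = flattenQuotedQueries(input);
console.log(output);
```

In a tool like TextCrawler the same thing would be done as two separate search-and-replace passes, assuming its engine supports lookahead and a multiline mode for `^` and `$`.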

OTHER TIPS

With Perl, provided the whole file has been read into a single string (for example with the -0777 switch), you could do something like:

s/^"([^"]*)"$/$s = $1; $s =~ s!(?:\n|\r)+! !g; $s/meg
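For an ECMAScript flavor, the same single-pass idea can be sketched with a replacement callback instead of Perl's /e modifier: match the whole quoted block, then collapse the line breaks inside the captured text. The name `flattenInOnePass` is hypothetical, and the sketch again assumes that quotes only appear at the start and end of a quoted query:

```javascript
// Hypothetical helper: one replace call, mirroring the Perl one-liner.
function flattenInOnePass(subject) {
  // Match a quote at line start, everything up to the next quote,
  // and that quote at line end; [^"]* is allowed to span line breaks.
  return subject.replace(/^"([^"]*)"$/mg, (match, body) =>
    // Collapse each run of CR/LF characters inside the quotes to one space.
    body.replace(/[\r\n]+/g, " ")
  );
}

const sample = [
  'SELECT * FROM CONVENIENT_ONE_LINE_QUERY',
  '"SELECT * FROM THIS_QUERY',
  'WHERE IS_SPREAD_OVER == 123',
  'ORDER BY MULTIPLE_LINES',
  'HAVING AND_IS_BETWEEN_QUOTES"',
  'SELECT * FROM ANOTHER_CONVENIENT_ONE_LINER',
].join("\r\n");

const flattened = flattenInOnePass(sample);
console.log(flattened);
```

This needs a host language with callback replacements, so it fits a script rather than a plain search-and-replace dialog.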

Licensed under: CC-BY-SA with attribution