Question

I have the following text:

Matt has 11 eggs and they are brown
Helen has 23 ducks and they are black and brown
Todd has 34 quarters and they are silver
Bud has 45 pens and they are red, yellow, "greenish" and blue

When I run the following sed command:

sed -E "s/([^ ]+) has ([^ ]+) ([^ ]+) and they are (.*)/\"\1\",\"\2\",\"\3\",\"\4\"/" input

I get this CSV:

"Matt","11","eggs","brown"
"Helen","23","ducks","black and brown"
"Todd","34","quarters","silver"
"Bud","45","pens","red, yellow, "greenish" and blue"

But what I really want is this (quotes properly escaped):

"Matt","11","eggs","brown"
"Helen","23","ducks","black and brown"
"Todd","34","quarters","silver"
"Bud","45","pens","red, yellow, \"greenish\" and blue"

How can I accomplish this?

Solution

Try:

sed -E 's/"/\\"/g; 
  s/([^ ]+) has ([^ ]+) ([^ ]+) and they are (.*)/"\1","\2","\3","\4"/' input

This first replaces every " with \" and then runs your original substitution. Enclosing the sed program in single quotes means the double quotes inside it no longer need escaping, which makes it easier to read.
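To see the two steps working together, here is a minimal sketch that can be pasted into a shell. It uses two of the sample lines from the question; the temporary file and the variable names `input` and `result` are only for illustration.

```shell
# Write two of the question's sample lines to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
Matt has 11 eggs and they are brown
Bud has 45 pens and they are red, yellow, "greenish" and blue
EOF

# First expression: escape every double quote already present in the text.
# Second expression: wrap the four captured fields in (unescaped) quotes.
result=$(sed -E 's/"/\\"/g; s/([^ ]+) has ([^ ]+) ([^ ]+) and they are (.*)/"\1","\2","\3","\4"/' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The order matters: the escaping has to run before the second expression adds the field-delimiting quotes, otherwise those would be escaped as well.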

OTHER TIPS

This might work for you (GNU sed):

sed -r 's/"/\\&/g;s/^\\"|\\(",)\\"|\\"$/\1"/g'  file

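This converts every " to \" and then removes the backslash again from the quotes at the start of the line, at the end of the line, and on either side of each field-separating comma. It appears to be meant for the CSV that the original command already produced (so file would hold that output), and it would misfire on a field that itself contains the sequence ",".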
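As a rough illustration, the snippet below applies that command to the unescaped CSV line from the question. It assumes the input is the output of the original command; it uses -E rather than -r (GNU sed accepts both), and the variable names `csv` and `fixed` are made up for the example.

```shell
# Unescaped CSV line, as produced by the command in the question.
csv='"Bud","45","pens","red, yellow, "greenish" and blue"'

# 1st expression: put a backslash in front of every double quote.
# 2nd expression: drop the backslash again at line start, line end,
#                 and around each "," field separator.
fixed=$(printf '%s\n' "$csv" | sed -E 's/"/\\&/g; s/^\\"|\\(",)\\"|\\"$/\1"/g')

printf '%s\n' "$fixed"
```

Only the quotes around greenish keep their backslash, because they are not next to a line boundary or a separating comma.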

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow