There are a few edge cases to be aware of:
- What if doubled double-quotes are at the beginning of a string?
- What if that string is the first field?
- What if a field contains an empty string?
sed -r '
# at the start of a line or the start of a field,
# replace """ with "\"
s/(^|;)"""/\1"\\"/g
# replace any doubled double-quote with an escaped double-quote.
# this affects any "inner" quote pair as well as end of field or end of line
# if there is an escaped quote from the previous command, do not be fooled by
# the quote that follows it.
s/([^\\])""/\1\\"/g
# the above step will destroy empty strings. fix them here. this uses a
# conditional loop: if there are 2 consecutive empty fields, they will
# share a delimiter, so we have to process the line more than once
:fix_empty_fields
s/(^|;)\\"($|;)/\1""\2/g
tfix_empty_fields
' <<'END'
"""start of line";"""beginning "" middle and end """;"end of line"""
"";"";"";"""";"""""";"";""
END
Output:
"\"start of line";"\"beginning \" middle and end \"";"end of line\""
"";"";"";"\"";"\"\"";"";""
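
To see why the last step needs a loop, here is a small sketch (assuming GNU sed, with the same -r flag as above; the variable names and the label "a" are only for illustration). It runs the empty-field substitution once, and then in a loop, on three empty fields that have already been mangled to \" by the previous step:

```shell
# Three empty fields, already turned into \" by the doubled-quote substitution.
mangled='\";\";\"'

# Single pass: each match consumes its trailing ";", so the middle field
# has no leading delimiter left to match against and is skipped.
single_pass=$(printf '%s\n' "$mangled" | sed -r 's/(^|;)\\"($|;)/\1""\2/g')

# Looped: "t" jumps back to the label for as long as a substitution succeeded.
looped=$(printf '%s\n' "$mangled" | sed -r ':a
s/(^|;)\\"($|;)/\1""\2/g
ta')

printf '%s\n%s\n' "$single_pass" "$looped"
```

The single pass prints "";\";"" (the middle field is missed), while the looped version prints "";"";"".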
sed is an efficient tool, but it will take a while with 16GB files. And you'd better have at least 16GB of free disk space to write the updated files (even sed's -i in-place edit writes to a temp file behind the scenes).
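
If you want to check for room before starting, something like the following might do. This is only a rough sketch: has_room_for is a made-up helper name, and it assumes a POSIX-style df (-P -k) where the fourth column of the second output line is the available space in kilobytes.

```shell
# Hypothetical helper: succeed only if the filesystem holding directory $2
# has at least as much free space as the size of file $1.
has_room_for() {
    # File size in bytes, rounded up to whole kilobytes.
    need_kb=$(( ($(wc -c < "$1") + 1023) / 1024 ))
    # Available kilobytes on the target filesystem.
    free_kb=$(df -Pk "$2" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -ge "$need_kb" ]
}

# Example with placeholder names (input.csv, fix_quotes.sed, /mnt/scratch):
# has_room_for input.csv /mnt/scratch &&
#     sed -r -f fix_quotes.sed input.csv > /mnt/scratch/output.csv
```

Writing the result to a different filesystem than the input, as in the commented example, is one way around the space problem when the original disk is nearly full.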