So the field patterns are as follows.
A string not containing a comma where the string length is greater than zero (won't match empty strings):
[^,]+
Or a string starting and ending with a double quote and containing at least one character that isn't a double quote (escaping backslashes left out for readability):
"[^"]+"
The regular expression engine starts at the beginning of the string and tries to match as much as possible with the given patterns. Take this record:
abc,"pqr,mno"
So abc is the longest string matched by either pattern from the start of the string, and hence becomes $1. The next character, the comma, cannot be matched by either pattern, so the regular expression engine just moves on to the next character, the double quote, which starts a match for the second pattern. The first pattern would also match from here, but only as far as "pqr, so the second pattern gives the longer match. That match runs to the end of the line, because "pqr,mno" is a string that starts and ends with double quotes and contains at least one non-double-quote character. Therefore "pqr,mno" becomes $2 for the record abc,"pqr,mno".
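
The walk-through above can be reproduced with a short script. This is only a sketch in Python, which is an assumption and not the tool the answer is about; the names FIELD and split_fields are made up for illustration. One difference matters: Python's re module takes the first alternative that matches instead of the longest one, so the quoted pattern is listed first to get the same fields.

```python
import re

# Quoted-field pattern first, then the "no commas" pattern.
# Python's re picks the first alternative that matches (not the longest),
# so this ordering is what keeps "pqr,mno" together as one field.
FIELD = re.compile(r'"[^"]+"|[^,]+')


def split_fields(record):
    """Return every non-overlapping field match, scanning left to right."""
    return FIELD.findall(record)


fields = split_fields('abc,"pqr,mno"')
for number, field in enumerate(fields, start=1):
    # Mirrors the $1, $2 numbering used in the explanation.
    print(f"${number} = {field}")
```

Since both patterns require at least one character, an empty field (two commas in a row) produces no match at all and is skipped, not returned as an empty string.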