Question

I need to convert a list of IDs from using a delimiter consisting of , and/or \r\n or \n to using ,|. (essentially: s/[,\r\n]+/,\|/g without a trailing |)

Example input data:

123,456,789,012

or

123,
456
789,
012

and I need the resulting output to be 123,|456,|789,|012,: a comma ending each field, and a pipe separating them.

This seems really simple to do, but I'm quite stumped on how to manage this. I've tried ... quite a few ways, actually, but nothing seems to work. Here are a few examples:

  1. sed "s/[,\r\n]+/,\|/g" < filename does not match any of the delimiters.

  2. sed "s/(,|,?\r?\n?)/,\|/g" does not match anything either.

  3. tr -t "(,?(\r|\n)+)" ",\|" and tr -t "[,\r\n]+" ",\|" only replace ,

  4. tr "(,|\r?\n)" ",\|" works correctly with , but with ,\n and ,\r\n it replaces the matched characters with multiple bars. Ex: 123|||456|||789|||012|

  5. Getting more complex: sed ':a;N;$!ba;s/\n/,/g' (Taken from here) replaces \n correctly with , but does not work with \r\n. Replacing the \n with [,\r\n] simply returns the input.

I'm stumped. Can anyone offer some help or advice on this?


Solution

From your sample output, it seems that the output doesn't have a pipe at the end; you have , marking the end of each field, and | separating pairs of fields. For that specification, this works with tr and sed:

$ x="123,
> 456
> 789,
> 012"
$ echo "$x" | tr -s '\r\n' ',' | sed 's/,\(.\)/,|\1/g'
123,|456,|789,|012,
$

The tr command replaces each newline and carriage return with a comma, squeezing (-s) runs of repeated commas into one. The sed command looks for a comma followed by another character and inserts a | between them, so the final comma, which has nothing after it, gets no pipe.
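The pipeline above can be wrapped in a small shell function to try it against the input shapes from the question. This is only a sketch: the name ids_to_pipe is made up for illustration, and the extra echo is an assumption that is not part of the answer above. It makes sure the input always ends with a delimiter, so the last field still gets its comma.

```shell
# ids_to_pipe is a hypothetical helper name, not a standard tool.
# Reads IDs on stdin (separated by commas, \n, or \r\n) and writes 123,|456,|... to stdout.
ids_to_pipe() {
    # echo appends a newline, so input without a final delimiter still ends in one;
    # tr turns \r and \n into commas and squeezes runs of commas into a single comma;
    # sed inserts a pipe after every comma that is followed by another character.
    { cat; echo; } | tr -s '\r\n' ',' | sed 's/,\(.\)/,|\1/g'
}

printf '123,456,789,012\n' | ids_to_pipe; echo
printf '123,\r\n456\r\n789,\r\n012\r\n' | ids_to_pipe; echo
# both print: 123,|456,|789,|012,
```

Input that starts with a blank line or a stray comma would come out with a leading ,| because the sketch does nothing about leading delimiters.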

Other tips

What I do is normalize the \r\n sequence to \n to get rid of one alternative (and increase the speed of the next step).

perl -pi -e 'BEGIN { $/ = undef; } s/\r\n/\n/g; s/[,\n]/,|/g;'

Update: from your examples, it looks like you meant to replace each run of consecutive delimiters with a single occurrence of ,|. If that is what you want, change the command to this:

perl -pi -e 'BEGIN { $/ = undef; } END { print ",\n"; } s/\r\n/\n/g; s/[,\n]+/,|/g;'

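Also, you want a trailing , after the last field, which is what the END block prints. This assumes the input does not itself end with a delimiter, since a trailing delimiter would also be turned into ,|.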

License: CC-BY-SA with attribution
Not affiliated with StackOverflow