Replace characters using regex grouping with sed
24-10-2019
Question
I have a text file that is like this:
FOO BAR PIPPO PLUTO 31337 1010
FOOZ BAZ 130
VERY LONG LINE LIKE THIS THEN A NUMBER LIKE 42
I need to turn it into:
FOO-BAR-PIPPO-PLUTO 31337 1010
FOOZ-BAZ 130
VERY-LONG-LINE-LIKE-THIS-THEN-A-NUMBER-LIKE 42
The best I could do is:
sed -re 's/([A-Z]+)( )([A-Z]+)/\1-\3/g'
but the output is
FOO-BAR PIPPO-PLUTO 31337 1010
FOOZ-BAZ 130
VERY-LONG LINE-LIKE THIS-THEN A-NUMBER LIKE 42
Close, but no cigar. Any idea on why my regex doesn't work?
Solution
You can't have overlapping matches. "BAR PIPPO" isn't detected because "BAR" was already consumed when matching "FOO BAR".
FOO BAR PIPPO PLUTO 31337 1010
------- ===========
   1         2
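To see which stretches of text each match consumes, one option is to have sed wrap every match in brackets rather than insert a hyphen. This is only an illustrative sketch: the brackets and the `marked` variable are not part of the original question, and `-E` is assumed to be accepted by the local sed (GNU and BSD sed both support it).

```shell
# Wrap each match of the question's pattern in [ ] to show what it consumes.
line='FOO BAR PIPPO PLUTO 31337 1010'
marked=$(printf '%s\n' "$line" | sed -E 's/([A-Z]+) ([A-Z]+)/[\1 \2]/g')
printf '%s\n' "$marked"
# prints: [FOO BAR] [PIPPO PLUTO] 31337 1010
```

The space between BAR and PIPPO falls between two matches, so it is never part of a substitution.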
Try this instead:
$ sed -re 's/ ([A-Z])/-\1/g'
Note that this doesn't have overlapping matches:
FOO BAR PIPPO PLUTO 31337 1010
   --  ==    --
   1   2     3
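As a quick check, the shorter pattern can be run against the sample input from the question. This is a minimal sketch: the `input` and `result` variable names are made up for the example, and `-E` is used in place of `-r` on the assumption that the command should also run on BSD sed.

```shell
# Apply the space-plus-one-letter pattern to the three sample lines.
input='FOO BAR PIPPO PLUTO 31337 1010
FOOZ BAZ 130
VERY LONG LINE LIKE THIS THEN A NUMBER LIKE 42'
result=$(printf '%s\n' "$input" | sed -E 's/ ([A-Z])/-\1/g')
printf '%s\n' "$result"
# prints:
# FOO-BAR-PIPPO-PLUTO 31337 1010
# FOOZ-BAZ 130
# VERY-LONG-LINE-LIKE-THIS-THEN-A-NUMBER-LIKE 42
```

Each match is only two characters long (the space and the letter after it), so no letter has to take part in two matches.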
OTHER TIPS
sed 's/ \([^0-9]\)/-\1/g'
Just look for a space followed by a non-digit character and replace that space with a -. The advantage of this is that it also works for lines that contain non-alphanumeric characters.
Proof of Concept
$ cat ./infile
FOO BAR PIPPO PLUTO 31337 1010
FOOZ BAZ 130
VERY LONG LINE LIKE THIS THEN A NUMBER LIKE 42
THIS LINE HAS $ODD$ #CHARS# IN %IT% 42
$ sed 's/ \([^0-9]\)/-\1/g' ./infile
FOO-BAR-PIPPO-PLUTO 31337 1010
FOOZ-BAZ 130
VERY-LONG-LINE-LIKE-THIS-THEN-A-NUMBER-LIKE 42
THIS-LINE-HAS-$ODD$-#CHARS#-IN-%IT% 42
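To make the difference concrete, the two patterns can be compared on the line with the odd characters. This is a hedged sketch rather than part of either answer: the variable names `letters_only` and `non_digit` are invented for the comparison, and `-E` is assumed to be available for the first command.

```shell
# Compare "space + uppercase letter" with "space + non-digit" on one line.
# Single quotes keep the shell from expanding $ODD$.
line='THIS LINE HAS $ODD$ #CHARS# IN %IT% 42'
letters_only=$(printf '%s\n' "$line" | sed -E 's/ ([A-Z])/-\1/g')
non_digit=$(printf '%s\n' "$line" | sed 's/ \([^0-9]\)/-\1/g')
printf '%s\n%s\n' "$letters_only" "$non_digit"
# prints:
# THIS-LINE-HAS $ODD$ #CHARS#-IN %IT% 42
# THIS-LINE-HAS-$ODD$-#CHARS#-IN-%IT% 42
```

The letters-only pattern leaves the spaces before $, # and % untouched, while the non-digit pattern replaces them.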
Very close. You don't need to match more than one letter though - you just want letter space letter:
sed -Ee 's/([A-Z])( )([A-Z])/\1-\3/g' foo.txt
FOO-BAR-PIPPO-PLUTO 31337 1010
FOOZ-BAZ 130
VERY-LONG-LINE-LIKE-THIS-THEN-A NUMBER-LIKE 42
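(sed params adjusted for BSD sed)
Note that A NUMBER is left unjoined in the last line: the A is consumed by the match on "N A", so it cannot also start a match with NUMBER. This pattern misses the space after any single-letter word.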
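A letter-space-letter pattern can be made to handle every gap by dropping the g flag and looping instead, so that each pass rescans the line from the start. The sketch below is one way to do that, not something any of the answers proposed: the label name `again` and the `joined` variable are arbitrary, and the script is split across several `-e` options on the assumption that this keeps the label syntax acceptable to both GNU and BSD sed.

```shell
# Replace one letter-space-letter gap per pass; "t again" jumps back to the
# label as long as the last substitution succeeded.
line='VERY LONG LINE LIKE THIS THEN A NUMBER LIKE 42'
joined=$(printf '%s\n' "$line" |
  sed -E -e ':again' -e 's/([A-Z]) ([A-Z])/\1-\2/' -e 't again')
printf '%s\n' "$joined"
# prints: VERY-LONG-LINE-LIKE-THIS-THEN-A-NUMBER-LIKE 42
```

Because every pass starts over at the beginning of the line, a letter that was part of the previous substitution can take part in the next one. The cost is one pass per replaced space, which should be negligible for lines of this size.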
Licensed under: CC-BY-SA with attribution