Backreferences to constituents of a group consisting of a fixed number of repetitions
14-12-2019
Question
I want to find a group that is repeated a fixed number of times in a row, e.g. a letter-digit combo followed by a space, five times. I can use simple repetition syntax for that, e.g. (?:\w\d ){5}.
I then want to replace the spaces within this run of five letter-digit combos with something else. To do so, I tried to backreference each letter-digit combo (without the space) by placing parentheses around it: (?:(\w\d) ){5}. Unfortunately, all five are stored in $1, i.e. $1 gets overwritten every time the group matches.
So, is there a way to avoid this overwriting? Or is there a way to replace something only in a substring?
EDIT:
Example input string: A1 A3 A4 B6 ::: A1 A3 A4 C5 B6
Desired output string: A1 A3 A4 B6 ::: A1-A3-A4-C5-B6
That is, replace the spaces only within a run of five letter-digit combos. Implemented in Perl.
Solution
If you just want to solve the problem, something like this works:
$string = 'A1 A3 A4 B6 ::: A1 A3 A4 C5 B6';
$string =~ s/(\w\d(?: \w\d){4})/$_=$1; tr{ }{-}; $_/eg;
print "'$string'\n";
Otherwise, yes: in Perl a capture group inside a repetition is overwritten on every iteration, so only the last repetition's match is left in $1.
I don't know whether another programmatic way is possible.
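The overwriting is easy to reproduce in a short, self-contained sketch. The variable names below are illustrative and not taken from the question. The second half shows one possible workaround, assuming it is acceptable to capture the whole run once and split it up afterwards:

```perl
use strict;
use warnings;

my $input = 'A1 A3 A4 C5 B6 ';

# The capture group sits inside the repetition, so after the match
# $1 holds only what the final (fifth) iteration captured.
my $last;
if ($input =~ /(?:(\w\d) ){5}/) {
    $last = $1;
}
print "last capture: $last\n";

# Possible workaround: capture the whole run in one group,
# copy it, and extract the individual combos with a /g match.
my @combos;
if ($input =~ /((?:\w\d ){5})/) {
    my $run = $1;
    @combos = $run =~ /\w\d/g;
}
print "all combos: @combos\n";
```

The first print shows only the fifth combo, while the second lists all five.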
edit
If you want to cover multiple spaces between the combos, add a + quantifier to the regex and the s modifier to tr///. The s modifier squeezes each run of identical replacement characters into a single one.
s/(\w\d(?: +\w\d){4})/$_=$1; tr{ }{-}s; $_/eg;
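The substitution above assigns to the global $_ as a side effect. As a sketch of an alternative, assuming Perl 5.14 or later is available, the r modifier of tr/// returns a modified copy and leaves both $1 and $_ untouched. The sample string and the $fixed variable are illustrative:

```perl
use 5.014;      # tr///r needs Perl 5.14 or later
use strict;
use warnings;

# Sample input with irregular spacing inside the five-combo run.
my $string = 'A1 A3 A4 B6 ::: A1  A3 A4   C5 B6';

# r: transliterate a copy of $1 and return it
# s: squeeze each run of resulting hyphens into one
(my $fixed = $string) =~ s/(\w\d(?: +\w\d){4})/$1 =~ tr{ }{-}sr/eg;

print "'$fixed'\n";
```

Copying into $fixed before substituting also keeps the original $string available for comparison.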
If you need fancier replacements, you can always pair the regex with a callback-style equivalent that runs a second regex on the captured run:
$string =~ s/(\w\d(?: +\w\d){4})/fixspaces($1)/eg;
sub fixspaces {
    my $buf = shift;    # copy of the captured run
    $buf =~ s/ +/-/g;   # turn each run of spaces into one hyphen
    return $buf;
}
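If several different replacements are needed, the callback idea can be taken one step further by passing a code reference. This is only a sketch: replace_in_runs is a hypothetical helper name, not something from the answer, and it reuses the same five-combo pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: apply $callback to every run of five
# letter-digit combos and splice its return value back in.
sub replace_in_runs {
    my ($text, $callback) = @_;
    $text =~ s/(\w\d(?: +\w\d){4})/$callback->($1)/eg;
    return $text;
}

my $input = 'A1 A3 A4 B6 ::: A1 A3  A4 C5 B6';

my $hyphenated = replace_in_runs($input, sub {
    my $run = shift;      # copy the run before using another regex
    $run =~ s/ +/-/g;     # one hyphen per run of spaces
    return $run;
});

print "'$hyphenated'\n";
```

Only the anonymous sub needs to change to get a different separator or a different transformation of the run.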
OTHER TIPS
It's ugly and inflexible, but for your sample input, if it really is always five, and if your sample input never varies, this should work:
s/(\w\d) +(\w\d) +(\w\d) +(\w\d) +(\w\d)/$1-$2-$3-$4-$5/
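If the count might change, the same one-group-per-combo pattern can be generated instead of typed out. The following is a minimal sketch under that assumption; join_run is a hypothetical helper name, and like the one-liner above it only handles the first run it finds.

```perl
use strict;
use warnings;

# Hypothetical helper: hyphenate the first run of $n letter-digit combos.
sub join_run {
    my ($text, $n) = @_;

    # Build '(\w\d) +(\w\d) + ...' with one capture group per combo.
    my $pattern = join ' +', ('(\w\d)') x $n;

    if (my @combos = $text =~ /$pattern/) {
        # $-[0] and $+[0] are the start and end offsets of the whole match.
        substr($text, $-[0], $+[0] - $-[0], join('-', @combos));
    }
    return $text;
}

print join_run('A1 A3 A4 B6 ::: A1 A3 A4 C5 B6', 5), "\n";
```

Because every combo has its own group, none of the captures is overwritten, at the cost of a pattern that grows with the count.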
This works:
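#!/usr/bin/perl
use strict;
use warnings;
# Replace every whitespace character in the captured prefix with a hyphen.
sub substitute {
    my $substr = shift;
    $substr =~ s/\s/-/g;
    return $substr;
}
my $test = "hello a1 b2 c3 d4 e5 testing";
# $1 holds the first four combos with their trailing spaces, $2 the fifth combo.
# The final \s means the run must be followed by whitespace to match.
$test =~ s/((?:\w\d\s){4})(\w\d)\s/substitute($1) . $2 . " "/eg;
print "$test\n";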