Regular expression to match n characters, where n was parsed earlier in the same string
Question
Say I have a file that contains lines that look like this:
"7sunrIsEfoo"
"10ecological09"
"3bedtime"
Each line starts with one or more digits that represent a number n. I want to match the n characters that follow it. The output should be:
sunrIsE
ecological
bed
Is there a way to do this using a regular expression? My first attempt was:
([0-9]*)[a-zA-Z]{\1}
but it doesn't seem to work.
Solution
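That's not possible with a plain regular expression: a quantifier such as {n} needs a literal number, and a backreference inside the braces is not read as one.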
([0-9]*)
just "remembers" the digits as a substring: they can't be used numerically.
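Since the captured digits are only text, the usual way out is to let the surrounding language do the arithmetic: capture the digits, convert them to an integer, and slice the rest of the line. Here is a minimal sketch in Ruby; take_counted is a made-up helper name, and it assumes the count is the whole leading run of digits.

```ruby
# Hypothetical helper (name is illustrative): return the n characters
# that follow a leading decimal count n, or nil if there is no count.
def take_counted(line)
  m = line.match(/\A(\d+)(.*)\z/m)
  return nil unless m   # line does not start with digits
  count = m[1].to_i     # the digits become a number here, outside the regex
  m[2][0, count]        # first `count` characters of the remainder
end

%w[7sunrIsEfoo 10ecological09 3bedtime].each do |line|
  puts take_counted(line)
end
```

For the three sample lines this prints sunrIsE, ecological and bed. If the count is larger than what is left on the line, the slice simply returns whatever remains.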
OTHER TIPS
In Ruby, you could use:
result = string[/(\d+)([a-zA-Z]+)/,2][0,$1.to_i]
It will give you the expected result.
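Regular expressions are not well suited to this task. They have no built-in way to interpret matched digits as a number. In flavors that support lookbehind you could use a (messy) workaround that spells out one alternative per count:
(?<=^1)(?!\d).|(?<=^2)(?!\d)..|(?<=^3)(?!\d)...|
and so on. Each lookbehind needs the ^ anchor and the (?!\d) guard: without them the alternative for 1 would match the second digit of 10 or 11, and reordering the alternatives does not prevent that, because the leftmost match wins. The guard also means the counted text itself must not start with a digit. Note that you should not use this method unless you really, really have no other way =)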
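If you do go down that road, nobody should type the alternation by hand. The sketch below generates it in Ruby. counted_pattern and the upper bound of 20 are assumptions made for the example, and each branch is anchored with ^ and guarded with (?!\d) so that the branch for 1 cannot fire on the second digit of 10.

```ruby
# Hypothetical generator (name and max_count are illustrative):
# build one lookbehind branch per supported count.
def counted_pattern(max_count)
  branches = (1..max_count).map do |n|
    # (?<=^n)  the count n sits at the very start of the line
    # (?!\d)   and is not merely the prefix of a longer number
    # .{n}     then take exactly n characters
    "(?<=^#{n})(?!\\d).{#{n}}"
  end
  Regexp.new(branches.join("|"))
end

pattern = counted_pattern(20)
%w[7sunrIsEfoo 10ecological09 3bedtime].each do |line|
  puts line[pattern]   # String#[] with a Regexp returns the matched text or nil
end
```

Any count above the chosen bound just fails to match, which is one more reason to prefer converting the digits in code.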
Here is a way to do it in Perl:
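#!/usr/local/bin/perl
use strict;
use warnings;

my @list = ("7sunrIsEfoo", "10ecological09", "3bedtime");
foreach (@list) {
    # /e evaluates the replacement as Perl code:
    # $1 is the leading count, $2 is the rest of the line
    s/^(\d+)(.+)$/substr($2, 0, $1)/e;
    print $_, "\n";
}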
Output:
sunrIsE
ecological
bed