Regular expression matching a number of characters whose count was parsed earlier

https://stackoverflow.com/questions/6069891

07-09-2020

Question

Say I have a file that contains lines that look like this:

"7sunrIsEfoo"
"10ecological09"
"3bedtime"

Each line starts with one or more digits that represent a number n. I want to match the n characters that follow it. The output should be:

sunrIsE
ecological
bed

Is there a way to do this using a regular expression? My first attempt was:

([0-9]*)[a-zA-Z]{\1}

but it doesn't seem to work.

Solution

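That's not possible with a plain regular expression.

([0-9]*) just "remembers" the digits as a substring; they can't be used numerically. In standard regex syntax a quantifier such as {n} needs a literal integer written in the pattern, so a backreference like \1 cannot serve as the repeat count.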
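Since the digits are only captured as text, the usual way around this is to let the regex find the pieces and do the counting in the host language. Here is a minimal Python sketch of that idea. The function name take_counted is made up for illustration, and it assumes each line begins with the count, with no surrounding quotes.

```python
import re

def take_counted(line):
    """Return the n characters that follow the leading number n, or None."""
    # Group 1 captures the leading digits, group 2 everything after them.
    m = re.match(r"(\d+)(.*)", line)
    if m is None:
        return None  # the line does not start with a number
    n = int(m.group(1))    # the conversion to a number happens outside the regex
    return m.group(2)[:n]  # slicing takes at most n characters

for line in ["7sunrIsEfoo", "10ecological09", "3bedtime"]:
    print(take_counted(line))
```

Note that the slice silently returns fewer than n characters if the line is too short; check the length of the result if that case should count as an error.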

OTHER TIPS

In Ruby, you could use:

result = string[/(\d+)([a-zA-Z]+)/,2][0,$1.to_i]

It will give you the expected result.

Regular expressions are not well suited to this task. They have no built-in way to interpret numbers. You could use a (messy) workaround:

(?<=1).|(?<=2)..|(?<=3)...|

and so on, with one alternative per count. (You'd better list the alternatives in reverse order, largest first, otherwise you'll have problems when reaching 11.) Note that you should not use this method unless you really, really have no other way =)
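If you really do need a single pattern, the alternation can at least be generated rather than typed by hand. The Python sketch below is one possible variant, not the exact pattern above. It assumes two changes: each lookbehind is anchored with ^ so only the number at the start of the line counts, and the payload is restricted to [a-zA-Z] as in the question. The names build_lookup_pattern and extract and the upper limit of 20 are made up for illustration.

```python
import re

def build_lookup_pattern(max_count):
    """Build one lookbehind alternative per supported count, largest first."""
    alternatives = [
        # For n = 10 this yields (?<=^10)[a-zA-Z]{10}
        rf"(?<=^{n})[a-zA-Z]{{{n}}}"
        for n in range(max_count, 0, -1)
    ]
    return re.compile("|".join(alternatives))

# Assumption: no count in the input exceeds 20.
PATTERN = build_lookup_pattern(20)

def extract(line):
    m = PATTERN.search(line)
    return m.group(0) if m else None

for line in ["7sunrIsEfoo", "10ecological09", "3bedtime"]:
    print(extract(line))
```

The pattern grows with the largest count it supports, and any count above that limit simply fails to match, which is why this approach is best kept as a last resort.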

Here is a way to do it in Perl:

#!/usr/local/bin/perl
use strict;
use warnings;

my @list = ("7sunrIsEfoo", "10ecological09", "3bedtime");
foreach (@list) {
    # /e evaluates the replacement as Perl code: keep the first $1 characters of $2
    s/^(\d+)(.+)$/substr($2, 0, $1)/e;
    print $_, "\n";
}

Output:

sunrIsE
ecological
bed

Licensed under: CC-BY-SA with attribution