Question

I am parsing a log using Perl and I am stumped as to how I can parse something like this:

from=[ihatethisregex@hotmail.com]
from=ihatethisregex@hotmail.com

What I need is ihatethisregex@hotmail.com and I need to capture this in a named capture group called "email".

I tried the following:

(?<email>(?:\[[^\]]+\])|(?:\S+))

But this captures the square brackets when it parses the first line, and I don't want them. I was wondering whether I could do something like this:

(?:\[(?<email>[^\]]+)\])|(?<email>\S+)

so that when I evaluate $+{email}, it just returns whichever one matched. I also tried the following:

(?:\[?(?<email>(?:[^\]]+\])|(?:\S+)))

But this gave strange results when the email was wrapped in a pair of square brackets.

Any help is appreciated.


Solution

/(\[)?your-regexp-here(?(1)\]|)/

 (  )                              capture group #1
  \[                                 opening bracket
     ?                                 optionally
      your-regexp-here             your regexp
                      (?( )   )    conditional match:
                         1           if capture group #1 evaluated,
                           \]          closing bracket
                             |       else nothing

Note that this does not work in every regex flavor: the conditional construct (?(1)...|...) is an extension, not part of standard regular expressions. It does work in Perl, though.

EDIT: misplaced question mark.
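
As a rough sketch of how the conditional could be applied to the two lines from the question: the character class [^\[\]\s]+ below is only an assumption standing in for "your-regexp-here", and the from= prefix is taken from the sample input. Excluding "[" from the class keeps a line with an unclosed bracket from matching at all.

```perl
use strict;
use warnings;

# Group 1 is the optional "["; (?(1)\]) demands a "]" only if group 1 matched.
# The class [^\[\]\s]+ is an assumed placeholder for the real address pattern.
my $re = qr/from=(\[)?(?<email>[^\[\]\s]+)(?(1)\])/;

my @emails;
for my $line ('from=[ihatethisregex@hotmail.com]',
              'from=ihatethisregex@hotmail.com') {
    # %+ holds the named captures of the last successful match
    push @emails, $+{email} if $line =~ $re;
}
print "$_\n" for @emails;
```

Both input lines should yield the bare address in $+{email}, with no brackets.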

Other answers

I tend to do these kinds of things in two steps, just because it's clearer:

my ($val) = /\w+=(.*)/;     # grab everything after "key="
$val =~ s/\[(.*)\]/$1/;     # drop the surrounding brackets, if any (no /e needed)

This strips the [] in a separate step.
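
The two steps could be wrapped up roughly like this. The helper name extract_value is hypothetical, and it uses \S+ and an anchored substitution instead of .* on the assumption that the value contains no whitespace and that other fields may follow it on the log line.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): returns the value after
# "key=", with one enclosing pair of square brackets removed if present.
sub extract_value {
    my ($line) = @_;
    # Step 1: capture the value; bail out if the line has no key=value pair
    my ($val) = $line =~ /\w+=(\S+)/ or return;
    # Step 2: strip the brackets only when they enclose the whole value
    $val =~ s/^\[(.*)\]$/$1/;
    return $val;
}

print extract_value($_), "\n"
    for 'from=[ihatethisregex@hotmail.com]', 'from=ihatethisregex@hotmail.com';
```

Keeping the bracket stripping separate means the first regex stays simple, at the cost of a second pass over the captured value.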

Perhaps the following will be helpful:

use strict;
use warnings;

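while (<DATA>) {
    # Brackets are optional; the address may not contain "]" or whitespace,
    # so the trailing newline is never captured
    next unless /from\s*=\s*\[?(?<email>[^\]\s]+)\]?/;
    print "$+{email}\n";
}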

__DATA__
from=[ihatethisregex@hotmail.com]
from=ihatethisregex@hotmail.com

Output:

ihatethisregex@hotmail.com
ihatethisregex@hotmail.com
Licensed under: CC-BY-SA with attribution