Question

In Perl, a regex match against page content fetched with LWP does not work, but a substitution does.

For example:

if($aa =~ m/<div>(.*?)<\/div>/)

This does not match,

but

if($aa =~ s/.*?about//s)

this works on the LWP content. What exactly is the problem?

From a comment:

The input is the HTML source of a web page. Somewhere in the page it contains, for example:

<html><div><span>23344<\/span><\/div><\/html>

Now I have to match 23344 with $aa =~ m/<span>(.*?)<\/span>/
but this does not work, whereas a substitution does:

$aa =~ s/(<html>.*?<span>)\d+(<\/span>.*?<\/html>)/$1/$2/

This substitution works. #Robin


Solution

Try this:

#!/usr/bin/perl

use warnings;
use strict;

my $aa = '<html><div><span>23344</span></div></html>';

if ($aa =~ m/<span>(.+)<\/span>/) {
    print "$1\n";
}
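One possible explanation, going only by the two patterns in the question: the substitution that works uses the /s modifier, while the match that fails does not. Without /s, "." does not match a newline, so a pattern like <div>(.*?)<\/div> cannot match when the tags sit on different lines, which is common in pages fetched with LWP. The sketch below assumes a hypothetical multi-line input (not the asker's actual page) and compares the same pattern with and without /s.

```perl
#!/usr/bin/perl

use warnings;
use strict;

# Hypothetical multi-line input, standing in for a page fetched with LWP.
my $aa = "<html><div>\n<span>23344</span>\n</div></html>";

# Without /s, "." does not match "\n", so the pattern cannot cross line breaks.
my $without_s = $aa =~ m/<div>(.*?)<\/div>/ ? 1 : 0;

# With /s, "." also matches "\n", so the same pattern succeeds.
my $with_s = $aa =~ m/<div>(.*?)<\/div>/s ? 1 : 0;

# List-context match: captures the span content directly into $number.
my ($number) = $aa =~ m/<span>(.*?)<\/span>/s;

print "without /s: $without_s\n";
print "with /s:    $with_s\n";
print "number:     $number\n";
```

If the real page also has attributes on the tags or different letter case, the pattern would still need adjusting, so this only covers the newline case.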

Other tips

Don't use regular expressions for parsing HTML. There are too many edge cases to handle effectively or efficiently.

Instead, use an actual HTML parser like Mojo::DOM:

#!/usr/bin/perl

use warnings;
use strict;

use Mojo::DOM;

my $aa = '<html><div><span>23344</span></div></html>';

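my $dom = Mojo::DOM->new($aa);

# at() returns the first element matching the CSS selector;
# find() returns a collection, which has no text method in current Mojolicious.
print $dom->at('html > div > span')->text, "\n";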
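To illustrate the edge cases mentioned above, here is a small sketch using only core Perl. The sample strings are hypothetical variations of the same markup, not taken from the asker's page. It runs the regex from the first answer against each one and records what it captures.

```perl
#!/usr/bin/perl

use warnings;
use strict;

# Hypothetical variations of the same markup that a real page might contain.
my @pages = (
    '<span>23344</span>',               # plain tag
    '<span class="id">23344</span>',    # attribute on the tag
    '<SPAN>23344</SPAN>',               # upper-case tag names
    '<span>23<b>3</b>44</span>',        # nested markup inside the span
);

my @results;
for my $page (@pages) {
    # Same pattern as in the regex answer.
    my ($text) = $page =~ m/<span>(.+)<\/span>/;
    push @results, defined $text ? $text : 'no match';
}

print "$_\n" for @results;
```

Only the first string yields the bare number. The second and third do not match at all, and the fourth captures the nested tags along with the digits. A parser with CSS selectors handles these cases without changes to the selector.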

Licensed under: CC-BY-SA with attribution