Question

I am working with a set of links from WWW::Mechanize and would like to simply print a list of the link names and URLs that were retrieved from a page. Oddly, the names, pulled up with $link->text(), are coming back with '?' characters where spaces should be. I have tried to fix this using the following methods:

1)

{
my $name = $link->text();
$name =~ s/\?/" "/g;
}

2) As suggested in other posts on replacing the '?' character:

{
my $name = $link->text();
my $pat = quotemeta '?';
$name =~ s/$pat/" "/g;
}

Both methods do nothing to the $name string! What am I doing wrong here? Thanks!


Solution

Per ikegami, in comment thread:

The reason a ? is being shown is an encoding issue, not because you actually have a ?. So the first thing to do is find out what you actually have. So again I ask: what is the output of use Data::Dumper; { local $Data::Dumper::Useqq = 1; print(Dumper($name)); }?
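As a minimal sketch of that diagnostic step: the string below is a hypothetical stand-in for the link text (an assumption, not the asker's real data), with a no-break space where an ordinary space would be expected. With Useqq enabled, Data::Dumper escapes characters that are not printable ASCII, so the hidden character shows up as an escape sequence rather than as a ? on the terminal. The exact escape can vary with how the string is stored; the asker reported \240.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical link text: "\xA0" is a no-break space, not an ordinary space.
my $name = "Home\xA0Page";

# Useqq makes Dumper emit a double-quoted string with non-printable and
# non-ASCII characters written as escapes instead of raw bytes.
local $Data::Dumper::Useqq = 1;

my $dump = Dumper($name);
print $dump;
```

If the dump shows an escape such as \240 or \x{a0} where the ? appeared, the string never contained a literal question mark, which is why substituting on ? had no effect.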

I get '\240' for the ? chars. Tried again using 'quotemeta "\240";' and that fixes the problem!

240 octal is A0 hex, which is the NBSP (no-break space), not a question mark. That's why removing ? didn't help. s/\xA0/ /g would help, but better yet, let's encode the string correctly for your terminal instead.

use open ':std', ':encoding(UTF-8)';
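
The two options above can be sketched together as follows. This is an illustration under stated assumptions: $name is a hypothetical stand-in for already-decoded link text, and the terminal is assumed to expect UTF-8. Note also that the replacement side of s/// is a plain string, so the " " in the original attempts would have inserted the quote characters literally; a bare space is what is wanted.

```perl
use strict;
use warnings;

# Encode STDOUT and STDERR (and decode STDIN) as UTF-8.
# Assumes the terminal expects UTF-8.
use open ':std', ':encoding(UTF-8)';

# Hypothetical decoded link text containing a no-break space (U+00A0).
my $name = "Home\x{A0}Page";

# Option 1: normalise every NBSP to an ordinary space.
# The replacement is a bare space, with no quotes around it.
(my $plain = $name) =~ s/\x{A0}/ /g;
print "$plain\n";

# Option 2: keep the NBSP and let the output layer encode it properly.
print "$name\n";
```

Which option fits depends on whether the no-break space matters downstream: normalising is simpler when the names are compared or searched as text, while the encoding layer preserves the page's original characters.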
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow