Question

I'm trying to detect whether a link is broken, that is, whether it's a web address I could paste into my browser and get a web page. I've tried two methods that I found online (LWP::UserAgent and LWP::Simple), and both are giving me false positives.

#!/usr/bin/perl -w

use strict;
use LWP::UserAgent;

my $url1 = 'http://www.gutenberg.org';
my $url2 = 'http://www.gooasdfzzzle.com.no/thisisnotarealsite';


my $ua = LWP::UserAgent->new;
$ua->agent("Mozilla/8.0");  # Pretend to be Mozilla

my $req = HTTP::Request->new(GET => "$url1");
my $res = $ua->request($req);

if ($res->is_success) {
    print "Success!\n";
} else {
    print "Error: " . $res->status_line . "\n";
}

$req = HTTP::Request->new(GET => "$url2");
$res = $ua->request($req);

if ($res->is_success) {
    print "Success!\n";
} else {
    print "Error: " . $res->status_line . "\n";
}

This gives me the following output:

Success!
Success!

And then there's the LWP::Simple version:

#!/usr/bin/perl -w

use strict;
use LWP::Simple;

my $url1 = 'http://www.gutenberg.org';
my $url2 = 'http://www.gooasdfzzzle.com.no/thisisnotarealsite';

if (head("$url1")) {
    print "Yes\n";
} else {
    print "No\n";
}

if (head("$url2")) {
    print "Yes\n";
} else {
    print "No\n";
}

This gives me the following output:

Yes
Yes

Am I missing something here?

Solution

Your code worked fine for me. I can only see a problem if you're running behind a VPN or gateway, as previously stated. Always use strict and warnings. Here is an alternative that wraps the check in a subroutine and sends a HEAD request, so you don't have to repeat the request-and-check code for every link you want to validate.

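use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Returns the HTTP status line on failure, or a success message otherwise.
sub check_url {
  my ($url) = @_;
  my $ua  = LWP::UserAgent->new;
  my $req = HTTP::Request->new(HEAD => $url);  # HEAD fetches headers only, no body
  my $res = $ua->request($req);
  return $res->status_line if $res->is_error;
  return "Success: $url";
}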
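
If a gateway or resolver in between really is answering for hosts that don't exist, one way to spot it is to compare the host you asked for with the host that ended up answering. The following is only a sketch under that assumption: `classify_response` and `check_link` are hypothetical helper names, not part of LWP, and the host comparison is a hint rather than proof, since legitimate sites also redirect between hosts.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
use URI;

# Hypothetical helper (not part of LWP): classify a response for the URL
# that was originally requested. Assumes $url is an absolute http(s) URL.
sub classify_response {
    my ($url, $res) = @_;

    # Anything that is not a 2xx counts as broken; this includes the
    # error responses LWP generates itself (e.g. when a host cannot be reached).
    return 'broken: ' . $res->status_line unless $res->is_success;

    my $wanted = lc(URI->new($url)->host);
    my $final  = $res->request;    # request that produced the final response
    my $got    = $final ? lc($final->uri->host) : $wanted;

    # After following redirects, a different host answered. This is only a
    # hint that something in between may be rewriting the request.
    return "suspect: answered by $got" if $got ne $wanted;

    return 'ok';
}

# Hypothetical wrapper: reuse one user agent for many links.
sub check_link {
    my ($ua, $url) = @_;
    return classify_response($url, $ua->head($url));
}

# Usage (needs network access):
# my $ua = LWP::UserAgent->new(timeout => 10);
# print check_link($ua, 'http://www.gutenberg.org'), "\n";
```

Passing the user agent in means it is created once and reused across calls. If the intermediary serves its own page directly without redirecting, the host will not change and this check won't catch it; printing `$res->status_line` and `$res->request->uri` for the bogus URL is a quick way to see what actually answered.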
Licensed under: CC-BY-SA with attribution