Question

I have been trying to get the 301/302 redirect location from the HTTP response using Perl's WWW::Mechanize, but I have had problems extracting it from the response with things like $response->header.

Can anyone help with extracting the redirect location from the HTTP responses of websites that use 301 or 302 redirects?

I know what I want to do once I have the redirect location URL, as I have done more complex things with Mechanize before. I'm just having real problems getting the location (or any other response field) from the HTTP response.

Your help would be much appreciated. Many thanks, CM

Solution

If it's a redirect, WWW::Mechanize calls $mech->redirect_ok() while request()ing to decide whether to follow the redirect URL (redirect_ok is an LWP::UserAgent method that Mechanize overrides).

Note:

WWW::Mechanize's constructor pushes POST on to the agent's requests_redirectable list

So you don't have to add POST to the requests_redirectable list yourself.
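
To check this on your own installation, a small sketch like the following prints the HTTP methods for which the agent follows redirects. It makes no network requests. The exact contents of the list may differ between versions, so treat any expected output as an assumption.

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();

# requests_redirectable is inherited from LWP::UserAgent and
# returns an array reference of HTTP method names.
my @methods = @{ $mech->requests_redirectable };
print "Redirects are followed for: @methods\n";
```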

If you want to be certain that your URLs are being redirected, and to log every redirect to a file (or similar), you can use LWP's simple_request, which does not follow redirects, together with HTTP::Response's is_redirect to detect them. Something like this:

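use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Request;
use URI;

my $mech = WWW::Mechanize->new();
$mech->stack_depth(0);    # do not keep page history

# simple_request() sends the request without following redirects
my $resp = $mech->simple_request( HTTP::Request->new( GET => 'http://www.googl.com/' ) );
if ( $resp->is_redirect ) {
    my $location = $resp->header('Location');
    # resolve a relative Location against the URL that was requested
    my $uri = URI->new_abs( $location, $resp->request->uri );
    print "Got redirected to URL - $uri\n";
    $mech->get($uri);
    print $mech->content;
}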

is_redirect is true for any 3xx status code, which includes both 301 and 302.
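
The header extraction itself can be tried without any network access. The sketch below builds an HTTP::Response by hand as a stand-in for what simple_request() would return, then reads its Location header. The example.com URLs and the relative path are placeholders chosen for illustration.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;
use URI;

# Hand-built response standing in for what simple_request() would return.
# The URLs are placeholders, not real endpoints.
my $resp = HTTP::Response->new( 302, 'Found' );
$resp->header( Location => '/new/path?x=1' );
$resp->request( HTTP::Request->new( GET => 'http://example.com/old' ) );

my $target;
if ( $resp->is_redirect ) {
    # new_abs resolves a relative Location against the request URI;
    # an absolute Location is returned unchanged.
    $target = URI->new_abs( $resp->header('Location'), $resp->request->uri );
    print "Redirect target: $target\n";
}
```

Any other response field can be read the same way, for example $resp->header('Content-Type') or $resp->code.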

Other tips

WWW::Mechanize should automatically follow redirects (unless you've told it not to via requests_redirectable), so you should not need to do anything.

EDIT: just to demonstrate:

DB<4> $mech = WWW::Mechanize->new;

DB<5> $mech->get('http://www.preshweb.co.uk/linkedin');

DB<6> x $mech->uri;
0  URI::http=SCALAR(0x903f990)
  -> 'http://www.linkedin.com/in/bigpresh'

... as you can see, WWW::Mechanize followed the redirect, and ended up at the destination, automatically.

Updated with another example as requested:

DB<15> $mech = WWW::Mechanize->new;

DB<16> $mech->get('http://jjbsports.com/');

DB<17> x $mech->uri;
0  URI::http=SCALAR(0x90988f0)
 -> 'http://www.jjbsports.com/'
DB<18> x substr $mech->content, 0, 40;
0  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML'
DB<19> x $mech->title;
0  'JJB Sports | Trainers, Clothing, Football Kits, Football Boots, Running'

As you can see, it followed the redirect, and $mech->content is returning the content of the page. Does that help at all?
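
If you let Mechanize follow the redirects but still want to know where each hop pointed, HTTP::Response keeps the earlier responses reachable through previous() and redirects(). The sketch below simulates a chain by hand so that it runs offline. With a live agent you would presumably start from $mech->response instead. The URLs are placeholders.

```perl
use strict;
use warnings;
use HTTP::Response;

# Simulate a 301 -> 302 -> 200 chain; with a live agent you would
# start from $mech->response instead. URLs are placeholders.
my $first = HTTP::Response->new( 301, 'Moved Permanently' );
$first->header( Location => 'http://example.com/step2' );

my $second = HTTP::Response->new( 302, 'Found' );
$second->header( Location => 'http://example.com/final' );
$second->previous($first);

my $final = HTTP::Response->new( 200, 'OK' );
$final->previous($second);

# redirects() returns the intermediate responses, oldest first.
my @hops = map { $_->code . ' -> ' . $_->header('Location') } $final->redirects;
print "$_\n" for @hops;
```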

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow