Question

I have just begun learning Perl-based web applications and have run into a problem. I am trying to write a Perl script to download a file that requires entering a verification code before the download starts. For example, see this URL: http://epaper.dfdaily.com/dfzb/page/1/2013-08/17/A01/20130817A01_pdf.pdf.

After some searching I chose WWW::Mechanize, as in the code below, but I can't get the file. Can anyone help me with this? Many thanks!

My code is below (assume I have already obtained the correct code and stored it in $code):

my $mech = WWW::Mechanize->new();
$mech->get($url);
$mech->submit_form(
    form_number => 0,
    fields      => { checkCode => $code },
);

print $mech->content;

Solution

This sample code demonstrates one way to do it. It saves the CAPTCHA image as captcha.jpg in the program's directory so that you can open it and then type in the CAPTCHA when prompted:

use strict;
use warnings;
use FindBin qw($Bin);
#use HTML::TreeBuilder::XPath;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->agent_alias("Windows IE 6");
$mech->get(
    "http://epaper.dfdaily.com/dfzb/page/1/2013-08/17/A01/20130817A01_pdf.pdf");

# The commented-out lines below are not needed here,
# because the CAPTCHA URL is always the same for this site.
#my $tree = HTML::TreeBuilder::XPath->new_from_content( $mech->content() );
#my ($src) = $tree->findvalues('//img[@id="checkcode"]/@src');
$mech->get("http://203.156.244.168:9000/validatecodegen");
open my $fh, ">:raw", "$Bin/captcha.jpg" or die $!;
print {$fh} $mech->content();
close $fh;
$mech->back();

# Ask the user to read captcha.jpg and type the code in.
print "Input CAPTCHA: ";
my $code = <STDIN>;
chomp $code;
$mech->submit_form(
    with_fields => {
        checkCode => $code,
    },
    button => "Submit",
);

$mech->save_content("$Bin/result.pdf");
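
If the CAPTCHA is mistyped, the server will presumably answer with an HTML page instead of the PDF (this is an assumption, as it was not tested here), and result.pdf would then contain HTML. A minimal sketch of a sanity check is shown below. The helper looks_like_pdf is hypothetical and not part of WWW::Mechanize. It only tests for the "%PDF-" signature that PDF files start with, so it does not prove that the file is a valid PDF:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): returns 1 when the
# given bytes start with the PDF signature "%PDF-", otherwise 0.
sub looks_like_pdf {
    my ($bytes) = @_;
    my $ok = defined($bytes) && $bytes =~ /\A%PDF-/;
    return $ok ? 1 : 0;
}

# Stand-alone demonstration with sample data instead of a live response.
print looks_like_pdf("%PDF-1.4\n")          ? "pdf\n" : "not pdf\n";
print looks_like_pdf("<html>wrong code</html>") ? "pdf\n" : "not pdf\n";
```

In the script above, one could call looks_like_pdf( $mech->content() ) after submit_form and only save result.pdf when it returns 1, printing a warning and asking for the CAPTCHA again otherwise.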
Licensed under: CC-BY-SA with attribution