Question

I am new to PhantomJS, JavaScript, and web scraping in general. I want to perform basic HTTP authentication and then visit another URL to get some information. Here is what I have so far. Please tell me what I am doing wrong.

var page = require('webpage').create();
var system = require('system');

page.onConsoleMessage = function(msg) {
   console.log(msg);
};

page.onAlert = function(msg) {
   console.log('alert!!>' + msg);
};

page.settings.userName = "foo";
page.settings.password = "bar";

page.open("http://localhost/login", function(status) {
    console.log(status);
    var retval = page.evaluate(function() {
       return "test";
    });
    console.log(retval);

    page.open("http://localhost/ticket/" + system.args[1], function(status) {
        if ( status === "success" ) {
            page.injectJs("jquery.min.js");
            var k = page.evaluate(function () {
                var a = $("div.description > h3 + p");

                if (a.length == 2) {
                    console.log(a.slice(-1).text())
                } 
                else {
                    console.log(a.slice(-2).text())
                }
            //return document.getElementById('addfiles');
            });

        }
    });
    phantom.exit();
});

I pass one argument to this script: a ticket number, which is appended to the second URL.

No correct solution

OTHER TIPS

I would highly recommend CasperJS for this.

CasperJS is an open source navigation scripting & testing utility written in Javascript and based on PhantomJS — the scriptable headless WebKit engine. It eases the process of defining a full navigation scenario and provides useful high-level functions, methods & syntactic sugar for doing common tasks such as:

  • defining & ordering browsing navigation steps
  • filling & submitting forms
  • clicking & following links
  • capturing screenshots of a page (or part of it)
  • testing remote DOM
  • logging events
  • downloading resources, including binary ones
  • writing functional test suites, saving results as JUnit XML
  • scraping Web contents

(from the CasperJS website)

I recently spent a day trying to get PhantomJS by itself to do things like fill out a log-in form and navigate to the next page.
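
For the scenario in the question (HTTP basic auth, then a second page), a CasperJS script could look roughly like the sketch below. It is untested against the asker's server. `setHttpAuth`, `thenOpen`, `fetchText` and `casper.cli` are CasperJS APIs; the URLs, credentials and selector are copied from the question, and `ticketUrl` is a hypothetical helper added for illustration. The `typeof phantom` guard only keeps the file loadable outside CasperJS.

```javascript
// Hypothetical helper (not part of CasperJS): build the ticket URL
// from a base URL and a ticket id.
function ticketUrl(base, id) {
    return base.replace(/\/+$/, '') + '/ticket/' + encodeURIComponent(String(id));
}

// Only run the scrape when executed by CasperJS/PhantomJS.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();

    // HTTP basic auth credentials, taken from the question.
    casper.setHttpAuth('foo', 'bar');

    casper.start('http://localhost/login', function() {
        this.echo('login page loaded: ' + this.getCurrentUrl());
    });

    // The first positional CLI argument is the ticket number, e.g.
    //   casperjs script.js 1234
    // (no check is made here for a missing argument).
    casper.thenOpen(ticketUrl('http://localhost', casper.cli.get(0)), function() {
        // fetchText concatenates the text of every element matching the selector.
        this.echo(this.fetchText('div.description > h3 + p'));
    });

    casper.run(function() {
        this.exit();
    });
}
```

Because each step is queued with `start`/`thenOpen` and only executed by `run`, the second page is opened after the first one has finished loading, without nesting callbacks by hand.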

CasperJS has a nice API purpose-built for forms as well:

http://docs.casperjs.org/en/latest/modules/casper.html#fill

var casper = require('casper').create();

casper.start('http://some.tld/contact.form', function() {
    this.fill('form#contact-form', {
        'subject':    'I am watching you',
        'content':    'So be careful.',
        'civility':   'Mr',
        'name':       'Chuck Norris',
        'email':      'chuck@norris.com',
        'cc':         true,
        'attachment': '/Users/chuck/roundhousekick.doc'
    }, true);
});

casper.then(function() {
    this.evaluateOrDie(function() {
        return /message sent/.test(document.body.innerText);
    }, 'sending message failed');
});

casper.run(function() {
    this.echo('message sent').exit();
});
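
If you would rather stay with plain PhantomJS, note that `page.open` is asynchronous. In the question's script, `phantom.exit()` is called right after the second `page.open` has been started, so the process quits before the ticket page has loaded. The `evaluate` callback also only logs inside the page instead of returning a value. Below is a minimal sketch of one way to restructure it, assuming the same URLs, credentials, selector and local `jquery.min.js` as in the question; `pickDescription` is a hypothetical helper name introduced here.

```javascript
// Hypothetical helper mirroring the question's logic: given the texts of the
// matched <p> elements, keep the last one if there are exactly two,
// otherwise the last two concatenated.
function pickDescription(texts) {
    var count = texts.length === 2 ? 1 : 2;
    return texts.slice(-count).join('');
}

// Only run the scrape when executed by PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var system = require('system');

    page.settings.userName = 'foo';
    page.settings.password = 'bar';

    page.open('http://localhost/login', function(status) {
        if (status !== 'success') {
            console.log('login page failed to load');
            phantom.exit(1);
            return;
        }
        page.open('http://localhost/ticket/' + system.args[1], function(status) {
            if (status === 'success' && page.injectJs('jquery.min.js')) {
                // Runs inside the page; only serializable values can be returned.
                var texts = page.evaluate(function() {
                    return $('div.description > h3 + p').map(function() {
                        return $(this).text();
                    }).get();
                });
                console.log(pickDescription(texts));
            } else {
                console.log('could not load the ticket page or jquery.min.js');
            }
            // Exit only after the second page has been handled.
            phantom.exit();
        });
    });
}
```

The key change is that `phantom.exit()` now sits inside the innermost callback, so it runs only once the ticket page has been processed.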
Licensed under: CC-BY-SA with attribution