Question

A contractor has provided us with survey data for a set of stores. The data contains the store numbers, thumbnail images and large images. The data is accessed through the contractor's secured website. In order to build a report for the data, I am trying to scrape the store numbers and images from the site instead of manually downloading each image.

I have not used CFHTTP with secured sites before, but I have had a little success so far with:

<!--- Post the login form to the contractor's site --->
<cfhttp 
    method="post" 
    url="http://www.website.com/impart/client_login.php"
    throwonerror="yes"
    redirect="yes"
    resolveurl="yes">

    <cfhttpparam name="user" value="myUsername" type="formfield">
    <cfhttpparam name="pass" value="myPassword" type="formfield">
    <cfhttpparam name="submit" value="Login" type="formfield">
</cfhttp>

How do I proceed from getting past the authentication to the page that contains the images to download?

Solution

What does the dump of cfhttp scope look like? Specifically, what is the status code?

If you get a status code of 200, you'll need to maintain the session as you grab each image, which means sending the session cookies from the login response back with every later request. See the following:

http://www.bennadel.com/blog/725-Maintaining-Sessions-Across-Multiple-ColdFusion-CFHttp-Requests.htm

http://www.bennadel.com/projects/cfhttp-session.htm

See this question for saving images via CFHTTP:

Convert an image from CFHTTP filecontent to binary data with Coldfusion

Other tips

I think that CFHTTP may not be the best choice for this. I am comfortable with Bash, so I would lean toward scripting it with curl, but one of the products listed on this page might be easier: http://www.timedicer.co.uk/web-scraping
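
For what it's worth, a curl-based version could look roughly like the sketch below. It is untested against the real site and makes several assumptions: the form field names are copied from the question, the cookie file name, the `image_filename` helper, the `.jpg` extension and the `IMAGE_URL` placeholder are all made up for illustration, and the site is assumed to use ordinary cookie-based sessions.

```shell
#!/usr/bin/env bash
# Sketch only: log in once, keep the session cookies in a file,
# then reuse that file for every image request.

LOGIN_URL="http://www.website.com/impart/client_login.php"  # from the question
COOKIE_JAR="${COOKIE_JAR:-cookies.txt}"                      # hypothetical file name

# login USER PASS
# Posts the login form and stores any cookies the server sets.
login() {
  curl --silent --show-error --fail --location \
       --cookie-jar "$COOKIE_JAR" \
       --data-urlencode "user=$1" \
       --data-urlencode "pass=$2" \
       --data-urlencode "submit=Login" \
       "$LOGIN_URL" > /dev/null
}

# fetch_image URL OUTPUT_FILE
# Sends the stored cookies so the request runs inside the logged-in session.
fetch_image() {
  curl --silent --show-error --fail --location \
       --cookie "$COOKIE_JAR" --cookie-jar "$COOKIE_JAR" \
       --output "$2" "$1"
}

# image_filename STORE_NUMBER SIZE
# Hypothetical naming scheme for the saved files (assumes JPEG images).
image_filename() {
  printf 'store_%s_%s.jpg' "$1" "$2"
}

# Example usage (commented out so sourcing this file makes no network calls).
# IMAGE_URL is a placeholder; the real image URLs have to be scraped from the site.
# login "myUsername" "myPassword"
# fetch_image "$IMAGE_URL" "$(image_filename 1234 large)"
```

The cookie file plays the same role as the session handling described in the accepted answer: whatever the login response sets is replayed on each image request.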

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow