How to download all pics in this website: huaban.com [closed]

https://stackoverflow.com/questions/12248647

29-06-2021

Question

I want to use a script to download all the pictures on this website. I viewed the source of the main page with Chrome Developer Tools. The image URLs look like this:

src="http://img.hb.aicdn.com/3e32a8b101e515b9e7dbe8f5a2e47afff5ec6bcf4e4a-OTvsuu_fw192"

But if I use wget or curl to download this page, or even "Save page" in the browser, the resulting HTML file contains no such links. I don't know how to get all of those links. Another problem is that more images keep loading as you scroll down the page. I don't know whether there is any way to get the whole page.

Solution

Can you please post the URL of the final page from which you want to download all the pictures?

Or do you mean all images from the http://huaban.com/ landing page?

With the following code you can save the image behind a given URL to a file on your filesystem:

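import urllib2  # Python 2 module; in Python 3 use urllib.request instead

image_path = 'http://img.hb.aicdn.com/3e32a8b101e515b9e7dbe8f5a2e47afff5ec6bcf4e4a-OTvsuu_fw192'
# Fetch the image bytes and write them to a local file in binary mode
with open(r'<path_to_file>.jpg', 'wb') as image:
    image.write(urllib2.urlopen(image_path).read())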
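For more than one image, the same idea can be wrapped in a loop. The following is only a minimal sketch in Python 3, where urllib2 was replaced by urllib.request. The helper names filename_for and download_all are hypothetical, and the .jpg extension is an assumption, since the sample URL carries no extension.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_for(url, extension=".jpg"):
    """Derive a local file name from the last path segment of the URL."""
    name = os.path.basename(urlparse(url).path)
    # The sample URLs have no extension, so one is assumed here.
    return name if os.path.splitext(name)[1] else name + extension


def download_all(urls, target_dir="."):
    """Fetch each URL and write its bytes into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(target_dir, filename_for(url))
        with urlopen(url) as response, open(path, "wb") as image:
            image.write(response.read())
        saved.append(path)
    return saved
```

Calling download_all with a list of image URLs and a target directory returns the list of local paths that were written.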

But as for retrieving the image source paths: I fear they are generated by the JavaScript components, so you don't have many alternatives.

One solution might be to use a headless browser or a JavaScript engine bridge such as Python-Spidermonkey to get the final (JavaScript-built) HTML content.
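As a rough sketch of the headless-browser route, the code below uses Selenium with headless Chrome rather than Python-Spidermonkey; that choice is an assumption, as are the function names, the number of scrolls, and the pause length. It has not been checked against huaban.com itself. render_page loads the page and scrolls to trigger further image loading, and extract_image_urls pulls the src attributes out of the rendered HTML using only the standard library.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_urls(html):
    """Return the image URLs found in already-rendered HTML, in order."""
    collector = ImgSrcCollector()
    collector.feed(html)
    return collector.sources


def render_page(url, scrolls=5, pause=2.0):
    """Load url in headless Chrome, scroll down a few times, return the HTML.

    Assumes the third-party selenium package and a local Chrome install.
    """
    import time
    from selenium import webdriver  # imported lazily; not in the stdlib

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        for _ in range(scrolls):
            # Jump to the bottom so the page requests its next batch of images.
            driver.execute_script(
                "window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(pause)  # crude fixed wait; a guess, not a tuned value
        return driver.page_source
    finally:
        driver.quit()
```

With both pieces in place, extract_image_urls(render_page("http://huaban.com/")) would give a list of URLs to download, provided the site still renders its images as plain img tags.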

-Colin-

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow