How to download all pics in this website: huaban.com [closed]
29-06-2021
Question
I want to use a script to download all the pictures on this website. I viewed the source of the main page with Chrome Developer Tools. The image URLs look like this:
src="http://img.hb.aicdn.com/3e32a8b101e515b9e7dbe8f5a2e47afff5ec6bcf4e4a-OTvsuu_fw192"
But if I use wget or curl to download the page, or even "Save page" in the browser, there is no such link in the resulting HTML file. I don't know how to get all those links. Another problem is that more images keep loading as you scroll down the page. I don't know whether there is any way to get the whole page.
Solution
Can you please post the URL of the final page from which you want to download all the pictures?
Or do you mean all images from the http://huaban.com/ landing page?
With the following code you can save the image behind a URL to a file on your filesystem:
    import urllib2  # Python 2; on Python 3 use urllib.request instead

    image_path = 'http://img.hb.aicdn.com/3e32a8b101e515b9e7dbe8f5a2e47afff5ec6bcf4e4a-OTvsuu_fw192'
    with open(r'<path_to_file>.jpg', 'wb') as image:
        image.write(urllib2.urlopen(image_path).read())
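Once a whole list of image URLs is available, the same idea can be applied in a loop. The following is only a minimal sketch, assuming Python 3; the names filename_for and download_all, the fetch parameter, and the .jpg extension are illustrative assumptions, not anything the site prescribes.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_for(url, ext=".jpg"):
    """Derive a local file name from the last path segment of the URL.

    The ".jpg" extension is an assumption; the site may serve other formats.
    """
    return os.path.basename(urlparse(url).path) + ext


def download_all(urls, target_dir, fetch=None):
    """Save every URL in `urls` into `target_dir` and return the written paths."""
    if fetch is None:
        # Default fetcher: a plain HTTP GET via the standard library.
        def fetch(url):
            with urlopen(url) as response:
                return response.read()

    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(target_dir, filename_for(url))
        with open(path, "wb") as image:
            image.write(fetch(url))
        saved.append(path)
    return saved
```

Calling download_all(urls, "pics") would write one file per URL into a pics directory. The optional fetch argument exists only so the function can be tried out without network access.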
As for retrieving the image source paths: I fear they are generated by the page's JavaScript components, so you do not have many alternatives.
One solution could be to use a headless browser or a JavaScript engine bridge such as Python-Spidermonkey to get the final (JavaScript-built) HTML content.
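The headless-browser idea could be sketched roughly as follows, assuming Python 3. Selenium with headless Chrome is swapped in here as one possible headless browser (it is a third-party package and is not what Python-Spidermonkey does); render_page, extract_image_urls, the scroll count, and the pause are hypothetical choices. The sketch also assumes the rendered page exposes the images as plain img src attributes, as in the question.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in the fed HTML."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.urls.append(src)


def extract_image_urls(html):
    """Return all <img> src values found in an HTML string, in document order."""
    collector = ImgSrcCollector()
    collector.feed(html)
    return collector.urls


def render_page(url, scrolls=5, pause=2.0):
    """Hypothetical helper: load `url` in headless Chrome, scroll, return the HTML."""
    import time
    from selenium import webdriver  # third-party; assumed to be installed

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        for _ in range(scrolls):
            # Scroll to the bottom so the page's scripts load more images.
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(pause)
        return driver.page_source
    finally:
        driver.quit()
```

With a browser driver available, extract_image_urls(render_page('http://huaban.com/')) would return whatever image URLs the scripts had inserted by the time scrolling stopped. How many scrolls are needed to reach the end of the page is not known in advance.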
-Colin-