Question

Is there a way I can run a script with a certain web page as the argument that will download all the images from there?

Solution

You can use an Automator workflow to download images embedded in a web page, or images linked from a web page. A good starting point for a workflow is:

  1. Get Current Webpage from Safari
  2. Get Image URLs from Webpage
  3. Download URLs

[Screenshot: Downloading images from web pages with Automator on Mac OS X 10.8]

You can also change the workflow to fetch from a fixed list of web pages instead of the current Safari page.

Automator is included with Mac OS X in the Applications folder.
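
If you swap the first action for one that accepts URLs as input, you can also drive a saved workflow from Terminal with the automator command line tool. A minimal sketch, assuming the workflow is saved as ~/Downloads/get-images.workflow (the name and path are just examples):

# Feed a page URL to the saved workflow via stdin
echo "http://www.apple.com/itunes/" | automator -i - ~/Downloads/get-images.workflow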

OTHER TIPS

wget -nd -r -l1 -p -np -A jpg,jpeg,png,svg,gif -e robots=off http://www.apple.com/itunes/
  • -nd (no directories) downloads all files to the current directory
  • -r -l1 (recursive, maximum depth 1) follows the links on the first page
  • -p (page requisites) also downloads the resources (such as images and stylesheets) each retrieved page needs
  • -np (no parent) doesn't follow links to parent directories
  • -A (accept) only downloads or keeps files with the specified extensions
  • -e robots=off ignores robots.txt and doesn't download a robots.txt to the current directory
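
Since the question asks for a script that takes a web page as its argument, here is a minimal wrapper around that command (a sketch; the script name is made up):

#!/bin/bash
# get-images.sh - download the images referenced by the page given as the first argument
# Usage: ./get-images.sh http://www.apple.com/itunes/
[ -z "$1" ] && { echo "usage: $0 URL" >&2; exit 1; }
wget -nd -r -l1 -p -np -A jpg,jpeg,png,svg,gif -e robots=off "$1"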

If the images are on a different host or subdomain, you have to add -H to span hosts:

wget -nd -H -p -A jpg,jpeg,png,gif -e robots=off http://example.tumblr.com/page/{1..2}
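
Note that -H alone lets wget fetch from any host the page references; if that is too broad, you can limit spanning with --domains (the domain list below is an example):

wget -nd -H --domains=example.tumblr.com,media.tumblr.com -p -A jpg,jpeg,png,gif -e robots=off http://example.tumblr.com/page/{1..2}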

You can also use curl:

cd ~/Desktop/; IFS=$'\n'; for u in $(curl -Ls http://example.tumblr.com/page/{1..2} | sed -En 's/.*src="([^"]+\.(jpe?g|png))".*/\1/p' | sort -u); do curl -s "$u" -O; done

-L follows Location headers (redirects). -O saves each file in the current directory under its remote name.
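
The same one-liner can be written as a small script that takes the page URLs as arguments. This is a sketch that assumes the images appear in plain src="..." attributes, as the sed pattern above does:

#!/bin/bash
# curl-images.sh - scrape src="...jpg/png" URLs from pages and download them
# Usage: ./curl-images.sh http://example.tumblr.com/page/1 [more URLs...]
curl -Ls "$@" |
  sed -En 's/.*src="([^"]+\.(jpe?g|png))".*/\1/p' |
  sort -u |
  while read -r u; do
    curl -s "$u" -O      # save under the remote file name
  done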

Here is a hacky solution (but it works). Hope someone can find a better one.

  1. In Terminal, run wget --page-requisites http://example.com/. This downloads the page at example.com and all of the resources it references (such as images, stylesheets, and scripts); see the wget manual for more on --page-requisites. Note: you can pass several URLs separated by spaces to download many pages at once, as in the first sketch after this list. If several are on the same server, add something like --wait=2 so you don't hammer it with requests.

  2. Open the folder those files were downloaded to and use Spotlight to separate the images from the other files. I'm going to assume you have Mountain Lion: type "Image" into the search field and select Kinds > Image. (A Terminal alternative is sketched below.)
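
For example, fetching two pages politely in one run (both URLs are placeholders):

wget --wait=2 --page-requisites http://example.com/page1 http://example.com/page2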
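
And if you'd rather stay in Terminal for step 2, the same Spotlight query can be run with mdfind; the folder path is an assumption, so point -onlyin at wherever wget saved the files:

mdfind -onlyin ~/example.com 'kMDItemContentTypeTree == "public.image"'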

If you know the pattern in the URLs, you could use the *nix solution with curl's URL globbing; see the question "Use curl to download images from website using wildcard?"
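
A quick sketch of what that looks like, with a made-up URL pattern (curl expands the [001-100] range and -O saves each file under its remote name):

curl -O "http://example.com/images/photo[001-100].jpg"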

Check out the Automator Space on MyAppleSpace http://www.myapplespace.com/pages/view/14664/automator-script-library
