How to scrape "data:image" URIs; Encountering Errno::ENAMETOOLONG?

Question

I have been trying to write a script that scrapes a page for images, following the approach outlined in "Save all image files from a website".

I tested that method on another page and it worked fine, but when I point it at my page, the image sources are data:image URIs, which look like:

data:image/jpg;base64,/9j/4FEJFOIEJNFOEJOIAD//gAQTGFGRGREGg2LjEwMAD/2wBDAAgEBAQEREGREWGRWEGUFBQYGBgYGBgYGB...

I get an error beginning with initialize': File name too long and ending in (Errno::ENAMETOOLONG).

Has anyone found a way to deal with situations like this?

Solution

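data:image URIs contain the image itself, inline and Base64-encoded, so there is nothing to download. The error occurs because the whole URI is being used as a file name. All you need to do is grab the encoded data and decode it: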

require 'base64'

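# `url` is the data: URI string. 'image.jpg' is a placeholder name;
# File.basename(url) can still be too long to use as a file name.
File.open('image.jpg', 'wb') { |f| f.write(Base64.decode64(url[/base64,(.*)/, 1])) }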
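If the page has several inline images, the same idea can be wrapped in a couple of small helpers. This is only a sketch: decode_data_uri, save_data_uri, and the image_N naming scheme are made-up names for illustration, not part of the original script, and it assumes the URIs have the data:image/<type>;base64,<data> shape shown in the question.

```ruby
require 'base64'

# Hypothetical helper: splits a data: URI into a file extension and raw bytes.
# Returns nil for anything that is not a Base64-encoded data:image URI.
def decode_data_uri(src)
  match = src.match(%r{\Adata:image/([a-zA-Z0-9.+-]+);base64,(.*)\z}m)
  return nil unless match

  [match[1], Base64.decode64(match[2])]
end

# Hypothetical helper: writes the decoded bytes under a short, generated name
# such as "image_0.jpg", so the URI itself never becomes the file name.
def save_data_uri(src, index, dir = '.')
  ext, bytes = decode_data_uri(src)
  return nil if ext.nil?

  path = File.join(dir, "image_#{index}.#{ext}")
  File.binwrite(path, bytes)
  path
end
```

In a scraping loop you could call save_data_uri(src, i) for each image source and fall back to the normal download code whenever it returns nil, which is how ordinary http(s) image URLs would be handled.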

Licensed under: CC-BY-SA with attribution
