Question

In this page-scraping tutorial, the author gets a collection of all images on the page as follows:

css :: ArrowXml a => String -> a XmlTree XmlTree
css tag = multi (hasName tag)

images tree = tree >>> css "img" >>> getAttrValue "src"

How can I get only, say, the 2nd image on the page? I couldn't find any sort of function like getElementAt :: Int -> blah in the XmlArrow docs.

Thanks!


Solution

Functions for manipulating lists of elements can be found in the ArrowList type-class.

In this particular case, you can use the >>. operator to transform the result list using ordinary list functions.

-- n is zero-based, so nthImage 1 selects the 2nd image
nthImage n tree = images tree >>. (take 1 . drop n)
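
To try this without fetching a real page, here is a minimal sketch that runs the same arrows over an in-memory snippet. It assumes the hxt package is installed. The sample markup, the type signatures, and the use of runLA with xread are additions for illustration and are not from the tutorial, which presumably reads the document in IO.

```haskell
import Text.XML.HXT.Core

-- Select every element with the given tag name, at any depth.
css :: ArrowXml a => String -> a XmlTree XmlTree
css tag = multi (hasName tag)

-- All src attribute values of <img> elements, in document order.
images :: ArrowXml a => a b XmlTree -> a b String
images tree = tree >>> css "img" >>> getAttrValue "src"

-- (>>.) hands the arrow's whole result list to an ordinary list function.
-- n is zero-based; an out-of-range n yields an empty result, not an error.
nthImage :: ArrowXml a => Int -> a b XmlTree -> a b String
nthImage n tree = images tree >>. (take 1 . drop n)

-- Hypothetical sample markup, only for this demonstration.
sample :: String
sample = "<div><img src=\"a.png\"/><p><img src=\"b.png\"/></p><img src=\"c.png\"/></div>"

main :: IO ()
main = do
  -- runLA runs a pure list arrow; xread parses the string into XmlTrees.
  print (runLA (images xread) sample)      -- every src value
  print (runLA (nthImage 1 xread) sample)  -- only the 2nd image
  print (runLA (nthImage 9 xread) sample)  -- index past the end
```

Using take 1 . drop n rather than (!! n) keeps the arrow total: when the page has fewer images than expected, the result is simply empty.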
Licensed under: CC-BY-SA with attribution