Question

I am learning Haskell arrows by parsing a simple HTML page. The task is to download the page of a base region, baseRegion = Region "Yekaterinburg" "http://example.com/r/ekb", parse the links to other regions (via hxt):

regions :: ArrowXml cat => cat a (NTree XNode) -> cat a Region
regions tree =
  tree >>> multi (hasName "a" >>> hasAttrValue "class" (== ".regionlink")) >>>
    proc x -> do
      rname <- getText <<< getChildren -< x
      rurl <- getAttrValue "href" -< x
      returnA -< Region rname rurl

and append a base region to the result:

allRegions :: ArrowXml cat => cat a (NTree XNode) -> cat a Region
  1. How do I write allRegions? Or, better, where should I dig to write it?
  2. How can I not only append to the result of regions, but also insert baseRegion at a particular place in the list of regions (for example, after the second element, or after an element whose name starts with 'E')?

Solution

I think the combinator you are looking for is (>>.) in the ArrowList type class. It lets you apply any list function to the arrow's list of results. For example, prepending an element to the results of the arrow would be:

regions tree >>. (baseRegion:)
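To see what (>>.) does in isolation, here is a minimal sketch that assumes the hxt package is installed. It uses the pure list arrow LA together with runLA and constL, which Text.XML.HXT.Core re-exports; the names numbers and prepended are made up for this illustration and are not part of the question.

```haskell
import Text.XML.HXT.Core  -- assumed to re-export LA, runLA, constL and (>>.)

-- A pure list arrow that ignores its input and emits three results.
numbers :: LA () Int
numbers = constL [1, 2, 3]

-- (>>.) hands the complete result list to an ordinary list function.
prepended :: LA () Int
prepended = numbers >>. (0 :)

main :: IO ()
main = print (runLA prepended ())  -- prints [0,1,2,3]
```

Note that the list function sees the results the arrow produces for a single input value, so if the arrow is run over several inputs the element is added once per input.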

As for your second question, you can write a utility function that inserts the region at the correct spot in the list, e.g. something with a signature like

insertRegion :: Region -> [Region] -> [Region]

and then you can use it on the arrow:

regions tree >>. insertRegion baseRegion
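One possible way to write such a function is sketched below, using only base. The Region record with fields regionName and regionUrl is an assumption (the question does not show the type), insertAfter and insertAfterIndex are hypothetical helper names, and the URLs for the extra regions are invented sample data.

```haskell
import Data.List (isPrefixOf)

-- Assumed shape of the question's Region type: a name and a URL.
data Region = Region
  { regionName :: String
  , regionUrl  :: String
  } deriving (Show, Eq)

-- Insert new after the first element satisfying p;
-- if nothing matches, append it at the end.
insertAfter :: (a -> Bool) -> a -> [a] -> [a]
insertAfter _ new [] = [new]
insertAfter p new (x : xs)
  | p x       = x : new : xs
  | otherwise = x : insertAfter p new xs

-- Insert new after the n-th element (1-based);
-- lists shorter than n simply get it appended.
insertAfterIndex :: Int -> a -> [a] -> [a]
insertAfterIndex n new xs = before ++ new : after
  where (before, after) = splitAt n xs

-- One possible insertRegion: after the first region whose name starts with 'E'.
insertRegion :: Region -> [Region] -> [Region]
insertRegion = insertAfter (("E" `isPrefixOf`) . regionName)

main :: IO ()
main = do
  let base = Region "Yekaterinburg" "http://example.com/r/ekb"
      rs   = [ Region "Elista" "http://example.com/r/elista"  -- sample data
             , Region "Moscow" "http://example.com/r/msk"
             , Region "Kazan"  "http://example.com/r/kzn" ]
  print (map regionName (insertRegion base rs))
  -- prints ["Elista","Yekaterinburg","Moscow","Kazan"]
  print (map regionName (insertAfterIndex 2 base rs))
  -- prints ["Elista","Moscow","Yekaterinburg","Kazan"]
```

Because both helpers are plain list functions, either can be passed to (>>.) unchanged, for example regions tree >>. insertAfterIndex 2 baseRegion.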

By the way, I would personally remove the tree parameter from your regions function and use explicit arrow chaining instead, so the above becomes:

tree >>> regions >>. insertRegion baseRegion
Licensed under: CC-BY-SA with attribution