Scrape Data within HTML Tags in Perl

https://stackoverflow.com/questions/17646417

html
web-scraping
perl
www-mechanize

03-06-2022

Question

I'm writing a web scraper, and am a Perl novice. I'm using HTML::TreeBuilder to get the data I need, but I've run into a case I'm not sure how to handle. Here's some sample HTML:

<div class="anything" val="20" name="matchup">someUniqueData</div>

I want to extract the val attribute from this tag. I've been using findvalues() for most of my work, but I don't know whether it can pull data from inside a tag. I've skimmed the documentation without success. Is there a simple solution for this type of scrape?

Solution

Select the attribute with XPath's @ syntax (this requires HTML::TreeBuilder::XPath, which provides findvalues()):

my ($val) = $tree->findvalues('//div[@class="anything"]/@val');
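For context, here is a minimal self-contained sketch of how that line might fit into a full script. It assumes the markup is already in a string (the $html variable is made up for illustration, since the question does not show how the page is fetched). As an alternative that is not part of the original answer, it also shows the same lookup with look_down() and attr(), which plain HTML::TreeBuilder inherits from HTML::Element.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Sample markup from the question; in a real scraper this would be the
# fetched page content ($html is an illustrative name).
my $html = '<div class="anything" val="20" name="matchup">someUniqueData</div>';

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse($html);
$tree->eof;

# XPath route: "@val" selects the attribute node, and findvalues()
# returns the string value of each match (list context here).
my ($val) = $tree->findvalues('//div[@class="anything"]/@val');

# Non-XPath route: find the element, then read its attribute.
my $div  = $tree->look_down(_tag => 'div', class => 'anything');
my $val2 = $div ? $div->attr('val') : undef;

print "val via XPath: $val\n";
print "val via look_down: $val2\n";

# Free the tree explicitly (needed on older HTML::TreeBuilder versions).
$tree->delete;
```

If several div elements match, findvalues() returns one value per match, so assigning to an array instead of ($val) would collect them all.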

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow