Extracting a Link using Mechanize in Ruby

https://stackoverflow.com/questions/22140189

19-10-2022

Question

I'm trying to extract the link (the href) from an element (.jobtitle a) using Mechanize and store it in the link variable below. Does anyone know how?

require 'rubygems'
require 'mechanize'

agent = Mechanize.new
page = agent.get('http://id.indeed.com/')
indeed_form = page.form('jobsearch')
indeed_form.q = ''
indeed_form.l = 'Indonesia'
page = agent.submit(indeed_form)
page.search(".row , .jobtitle a").each do |job|
    job_title = job.search(".jobtitle a").map(&:text).map(&:strip)
    company = job.search(".company span").map(&:text).map(&:strip)
    date = job.search(".date").map(&:text).map(&:strip)
    location = job.search(".location span").map(&:text).map(&:strip)
    summary = job.search(".summary").map(&:text).map(&:strip)
    # This currently returns the anchor text, not the href; the href is what I'm after.
    link = job.search(".jobtitle a").map(&:text).map(&:strip)
end

No correct solution

Other suggestions

I don't think you can select attributes with CSS paths: CSS selectors match elements, not attribute values.

From the mechanize documentation:

search()

Search for paths in the page using Nokogiri's search. The paths can be XPath or CSS and an optional Hash of namespaces may be appended.

See Nokogiri::XML::Node#search for further details.

You should look at XPath instead. See, for example:

Getting attribute using XPath

http://www.w3schools.com/xpath/

You may need to rewrite the way you iterate through the page.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow