Question

I am using Nokogiri to parse an XML document and want to output a list of locations where the product name matches a string.

I'm able to output a list of all product names or a list of all locations but I'm not able to compare the two. Removing the if portion of the statement correctly outputs all the locations. What am I doing wrong with my regex?

@doc = Nokogiri::HTML::DocumentFragment.parse <<-EOXML
<?xml version="1.0"?>
<root>
<product>
  <name>cool_fish</name>
  <product_details>
    <location>ocean</location>
    <costs>
      <msrp>9.99</msrp>
      <margin>5.00</margin>
    </costs>
  </product_details>
</product>
<product>
  <name>veggies</name>
  <product_details>
    <location>field</location>
    <costs>
      <msrp>2.99</msrp>
      <margin>1.00</margin>
    </costs>
  </product_details>
</product>    
</root>
EOXML

doc.xpath("//product").each do |x|
  puts x.xpath("location") if x.xpath("name") =~ /cool_fish/
end

Solution 2

Write your code as follows:

require 'nokogiri'

@doc = Nokogiri::XML <<-EOXML
<?xml version="1.0"?>
<root>
<product>
  <name>cool_fish</name>
  <product_details>
    <location>ocean</location>
    <costs>
      <msrp>9.99</msrp>
      <margin>5.00</margin>
    </costs>
  </product_details>
</product>
<product>
  <name>veggies</name>
  <product_details>
    <location>field</location>
    <costs>
      <msrp>2.99</msrp>
      <margin>1.00</margin>
    </costs>
  </product_details>
</product>    
</root>
EOXML


@doc.xpath("//product").each do |x|
  puts x.at_xpath(".//location").text if x.at_xpath(".//name").text =~ /cool_fish/
end
# >> ocean

You are parsing XML, so you should use Nokogiri::XML. Your XPath expressions were also incorrect. You called the #xpath method, but the expressions you passed (location, name) only behave the way you expected as CSS selectors, which belong with methods like css or search. As XPath, location matches only direct children of the product node, and location is nested inside product_details. I used the at_xpath method because you are interested in a single matching node inside the #each block.

You can also use at in place of #at_xpath and search in place of xpath.

Remember that search and at both understand CSS as well as XPath expressions. The search, xpath and css methods all return a NodeSet, whereas at, at_css and at_xpath return a single Node. Once you have a Nokogiri node, use the text method to get the content of that node.

Other tips

A few things going on here:

  1. As others have pointed out, you should be parsing as XML not HTML, although that wouldn’t actually make much difference to the results you get.

  2. You are parsing as a DocumentFragment; you should parse as a complete document. There are some issues involved in querying document fragments; in particular, queries starting with // don’t work right.

  3. The location element is actually at the position product_details/location relative to the product node in your XML, so you need to update your query to take that into account.

  4. You are trying to use the =~ operator on the result of the xpath method, which is a Nokogiri::XML::NodeSet. NodeSet doesn’t define a =~ method, so it falls back to the default one on Object, which just returns nil, so it will never match. (Ruby 3.2 removed that default, so there the call raises NoMethodError instead.) You should use at_xpath to get only the first result, and then call text on it to get a string that you can match using =~.

(Also you use @doc and doc, but I’m assuming that’s just a typo.)

So combining those four points your code will look like:

# parse as XML, and as a full document rather than a fragment
doc = Nokogiri::XML <<-EOXML
  # ... XML elided for space
EOXML

doc.xpath("//product").each do |x|
  # correct query, use at_xpath and call text method
  puts x.at_xpath("product_details/location") if x.at_xpath("name").text =~ /cool_fish/
end

However in this case you could do it all in a single XPath query, using the contains function:

# parse doc as XML document as above
puts doc.xpath("//product[contains(name, 'cool_fish')]/product_details/location")

This works because you have a fairly simple regex that only checks against a literal string. XPath 1.0 doesn’t have support for regex, so if your real use case involves a more complex one you may need to do it the “hard way”. (You could write a custom XPath function in that case, but that’s another story.)

I would suggest using Nokogiri::XML instead:

@doc = Nokogiri::XML::Document.parse <<-EOXML
<?xml version="1.0"?>
<root>
<product>
  <name>cool_fish</name>
  <product_details>
    <location>ocean</location>
    <costs>
      <msrp>9.99</msrp>
      <margin>5.00</margin>
    </costs>
  </product_details>
</product>
<product>
  <name>veggies</name>
  <product_details>
    <location>field</location>
    <costs>
      <msrp>2.99</msrp>
      <margin>1.00</margin>
    </costs>
  </product_details>
</product>    
</root>
EOXML

and then the Nokogiri::XML::Node#search and Nokogiri::XML::Node#at methods:

@doc.search("product").each do |x|
  puts x.at("location").content if x.at("name").content =~ /cool_fish/
end
Licensed under: CC-BY-SA with attribution