Question

Is there a simple method/way to check if a Nokogiri XML file has a proper root, like xml.valid? A way to check if the XML file contains specific content is very welcome as well.

I'm thinking of something like xml.valid? or xml.has_valid_root?. Thanks!

Solution

How are you going to determine what is a proper root?

<foo></foo>

has a proper root:

require 'nokogiri'

xml = '<foo></foo>'
doc = Nokogiri::XML(xml)
doc.root # => #<Nokogiri::XML::Element:0x3fd3a9471b7c name="foo">

Nokogiri has no way of determining that something else should have been the root. If you know in advance what the root node's name should be, you can test for it:

doc_root_ok = (doc.root.name == 'foo')
doc_root_ok # => true

You can check whether the parsed document was well-formed (needing no fixup) by looking at errors:

doc.errors # => []

If Nokogiri had to repair the document in order to parse it, errors returns a list of the syntax errors it encountered while parsing:

xml = '<foo><bar><bar></foo>'
doc = Nokogiri::XML(xml)
doc.errors # => [#<Nokogiri::XML::SyntaxError: Opening and ending tag mismatch: bar line 1 and foo>, #<Nokogiri::XML::SyntaxError: Premature end of data in tag bar line 1>, #<Nokogiri::XML::SyntaxError: Premature end of data in tag foo line 1>]

Other tips

A common and useful pattern is

doc = Nokogiri::XML(xml) do |config|
  config.strict
end

This will raise a Nokogiri::XML::SyntaxError if the document is not well-formed. I like to do this in order to prevent Nokogiri from being too kind to my XML.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow