Question

The REXML module appears to have support for RELAX NG validation, but the docs don't have any real information on using the validation portion of the framework.

How would you validate an XML document with a RELAX NG schema? A code snippet would be most helpful. TIA!

Solution

Well, I put together a test program, but the results aren't good.

My conclusions are as follows:

  1. REXML's RELAX NG schema parsing probably does not work; the code itself notes that it is incomplete.
  2. REXML's pull parsing probably works, but it is hard to tell.
  3. Both of the above are undocumented.
  4. You should use a real XML library such as libxml.

Here's my test program, test.rb:

require 'rexml/validation/relaxng.rb'
require 'rexml/parsers/pullparser.rb'

# USAGE: ruby test.rb XML-FILE
xml = ARGV[0]

# The schema must use the RELAX NG XML syntax (NOT the compact .rnc syntax)
schema = File.new( "example.rng" )
validator = REXML::Validation::RelaxNG.new( schema )

# The structure the validator made, which should be a complex structure but isn't
validator.dump

xmlfile = File.new( xml )
parser = REXML::Parsers::PullParser.new( xmlfile )
while parser.has_next?
  # Returns a PullEvent
  e = parser.pull
  # puts "Event ", e.inspect
  validator.validate(e)
end

I made some toy example XML and RNG files and then tried it out on OS X 10.5.x (long lines are broken to make the output readable):

$ /usr/bin/ruby test.rb good.xml 
< S.1 #{doc}, :end_document(  ) >
/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/rexml/
  validation/validation.rb:24:in `validate': Validation error.  Expected:
  :start_element( doc ) from < S.1 #:start_element( doc ), {head}, {body},
  :end_element(  ), :end_document(  ) >  but got "doc"(  )
  (REXML::Validation::ValidationException)
        from test.rb:20

(I get the same result with Ruby 1.9.)

So, pretty much failure.

(I could have optimized the test program some more to use add_listener but it didn't seem worthwhile)
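
To check the pull-parsing half on its own (conclusion 2 above), one option is to print the events the parser emits, without involving the validator at all. This is only a minimal sketch: the helper name `element_events` and the inline toy document are my own assumptions, not part of REXML or of the test files used above.

```ruby
require 'rexml/parsers/pullparser'

# Collect [event_type, element_name] pairs for every element event.
# `element_events` is a hypothetical helper name, not a REXML method.
def element_events(xml)
  parser = REXML::Parsers::PullParser.new(xml)
  events = []
  while parser.has_next?
    e = parser.pull
    # For start/end element events, e[0] is the element name.
    events << [e.event_type, e[0]] if e.start_element? || e.end_element?
  end
  events
end

# Toy input, assumed for illustration only.
element_events('<doc><head/><body/></doc>').each do |type, name|
  puts "#{type} #{name}"
end
```

If the element events come out in document order and balanced, the pull parser itself is probably fine, which would point at the RELAX NG validator as the part that fails.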

OTHER TIPS

I've had success with Nokogiri (after switching from the libxml-ruby gem, which segfaulted every time with v1.1.3, although the changelog says that some Windows segfault issues have been resolved).

Here's the code I'm using:

First off, install Nokogiri; take a look at the installation tutorial if you're having issues.

gem install nokogiri

If running on Rails, configure the gem in your "Rails.root/config/environment.rb", for instance:

config.gem 'nokogiri'

Otherwise, just require 'nokogiri' if running plain Ruby.

To validate an XML document against a pre-defined RELAX NG schema (we'll assume the files are stored in 'public'), use this snippet:

schema_path = "public/mySchema.rng"    # Or any valid path to a .RNG File
doc_path    = "public/myInstance.xml"  # Or any valid path to a .XML File

schema = Nokogiri::XML::RelaxNG(File.open(schema_path))

instance = Nokogiri::XML(File.open(doc_path))
# validate returns an array of error objects; it is empty when the document is valid
errors = schema.validate(instance)

is_valid = errors.empty?

Hope this helps!

Licensed under: CC-BY-SA with attribution