Question

I am trying to validate some XML with lxml against an XSD (ogckml22.xsd). This is happening OFFLINE. I read the file via a straight open/read.

For the record, http://www.opengis.net/kml/2.2 is not valid.

The code, adapted from another article (clarified due to a comment request):

from lxml import etree
import os
import sys
import StringIO
file=open('ogckml22.xsd')
data=file.read()
str=StringIO.StringIO(data)
try:
    xmlschema_doc=etree.parse(data)
except IOError as ex:
    print "oops {0}".format(ex.strerror)
except:
    print "Unexpected error:", sys.exc_info()[0]

xmlschema=etree.XMLSchema(xmlschema_doc)  

All I get is a "connection refused" error. With the try/except in place, I instead get an error saying that xmlschema_doc is not defined.

File "<stdin>", line 1, in <module>  
File "xmlschema.pxi", line 105, in lxml.etree.XMLSchema.__init__ (src/lxml/lxml.etree.c:132748)  
   self.error_log)  
lxml.etree.XMLSchemaParseError: connection refused  

I know it can read the xsd file above and another xsd file that gets included.

OK, maybe the xsd does get read? I downloaded the source for lxml, and src/lxml/xmlschema.pxi contains:

if self._c_schema is NULL:
    raise XMLSchemaParseError(
        self.error_log._buildExceptionMessage(
            u"Document is not valid XML Schema"),
        self._error_log)

I never see the "Document is not valid XML Schema" message. I can only assume that "connection refused" is used in place of that message (as a default?), but a more thorough reading of _error_log (short of recompiling) evades me.

Sincerely,

ArrowInTree


Solution

ogckml22.xsd imports two other schema documents (atom-author-link.xsd and xAL.xsd):

<!-- import atom:author and atom:link -->
<import namespace="http://www.w3.org/2005/Atom" 
        schemaLocation="atom-author-link.xsd"/>

<!-- import xAL:Address -->
<import namespace="urn:oasis:names:tc:ciq:xsdschema:xAL:2.0" 
        schemaLocation="http://docs.oasis-open.org/election/external/xAL.xsd"/>

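If you want to parse the schema offline, you need to have both of these documents available locally, and the paths given by schemaLocation must be correct. Note that the second import points to a remote URL, so the schema parser tries to fetch it over the network; when you are offline, that attempt is the likely source of the "connection refused" error.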
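To see which references a schema file makes before trying to load it offline, something like the following could help. It is only a rough sketch using the standard library's ElementTree rather than lxml; the function name list_schema_references is made up for illustration, and it only looks at top-level import and include elements.

```python
import xml.etree.ElementTree as ET

# Namespace of XML Schema itself, in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"


def list_schema_references(xsd_source):
    """Return (kind, namespace, schemaLocation, is_remote) tuples.

    xsd_source is a file name or file object containing an XSD.
    list_schema_references is a hypothetical helper, not part of lxml.
    """
    root = ET.parse(xsd_source).getroot()
    refs = []
    for elem in root:  # import/include are direct children of <schema>
        if elem.tag in (XS + "import", XS + "include"):
            location = elem.get("schemaLocation", "")
            # Anything with a network scheme cannot be fetched offline.
            is_remote = location.startswith(("http://", "https://", "ftp://"))
            kind = elem.tag[len(XS):]
            refs.append((kind, elem.get("namespace"), location, is_remote))
    return refs
```

Calling list_schema_references("ogckml22.xsd") should report the xAL import as remote, which is the one that needs a local copy and an adjusted schemaLocation.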

The parsing and loading of the schema can be simplified (there is no need for StringIO):

from lxml import etree

xmlschema_doc = etree.parse("ogckml22.xsd") 
xmlschema = etree.XMLSchema(xmlschema_doc)

print xmlschema

Output:

<lxml.etree.XMLSchema object at 0x00D25120>
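If you would rather not edit ogckml22.xsd by hand, the schemaLocation attributes can also be changed in memory after parsing and before the schema object is built. The following is a minimal sketch under some assumptions: the helper name point_imports_at_local_files is invented here, and it assumes that a copy of xAL.xsd has already been saved next to ogckml22.xsd. The helper only uses the ElementTree API that lxml shares with the standard library.

```python
XS_IMPORT = "{http://www.w3.org/2001/XMLSchema}import"


def point_imports_at_local_files(schema_root, local_copies):
    """Rewrite schemaLocation on xs:import elements, in place.

    schema_root  -- root element of the parsed XSD
    local_copies -- dict mapping an imported namespace to a local path
                    (the paths are assumptions; they must exist on disk)
    Returns the number of import elements that were changed.
    Hypothetical helper, not part of lxml.
    """
    changed = 0
    for elem in schema_root.iter(XS_IMPORT):
        local_path = local_copies.get(elem.get("namespace"))
        if local_path is not None:
            elem.set("schemaLocation", local_path)
            changed += 1
    return changed


# Intended use with lxml (not executed here; assumes xAL.xsd is local):
#   from lxml import etree
#   doc = etree.parse("ogckml22.xsd")
#   point_imports_at_local_files(
#       doc.getroot(),
#       {"urn:oasis:names:tc:ciq:xsdschema:xAL:2.0": "xAL.xsd"})
#   xmlschema = etree.XMLSchema(doc)
```

If the local copies themselves import further remote schemas, those would need the same treatment.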

I don't understand what you mean by "For the record, http://www.opengis.net/kml/2.2 is not valid".

If you have internet access, you can pass the URL as the argument to etree.parse():

xmlschema_doc = etree.parse("http://www.opengis.net/kml/2.2")

At least this works for me.

Licensed under: CC-BY-SA with attribution