The most performant way to validate XML against XSD
01-10-2019
Question
I get a string variable with XML in it and have an XSD file. I have to validate the XML in the string against the XSD file, and I know there is more than one way to do it (XmlDocument, XmlReader, ... ?).
After the validation I just have to store the XML, so I don't need it in an XDocument or XmlDocument.
What's the way to go if I want the fastest performance?
Solution
Others have already mentioned the XmlReader class for doing the validation, and I won't elaborate further on that.
Your question does not specify much context. Will you be doing this validation repeatedly for several XML documents, or just once? I'm reading it as a scenario where you are validating a lot of XML documents (from a third-party system?) and storing them for future use.
My contribution to the performance hunt would be to use a compiled XmlSchemaSet, which would be thread safe, so several threads can reuse it without needing to parse the XSD document again.
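// stream is an open Stream over the XSD file; a null handler makes schema errors throw.
var xmlSchema = XmlSchema.Read(stream, null);
var xmlSchemaSet = new XmlSchemaSet();
xmlSchemaSet.Add(xmlSchema);
// Compile once up front so the work is not repeated for every document.
xmlSchemaSet.Compile();
// CachedSchemas is presumably a dictionary-like cache keyed by schema name (not shown here).
CachedSchemas.Add(name, xmlSchemaSet);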
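The caching idea above could be sketched roughly as follows. This is only an illustration, not the answerer's actual code: the `SchemaCache` class, its `Get` method and the choice of the XSD file path as the cache key are all assumptions made for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Xml.Schema;

public static class SchemaCache
{
    // Hypothetical cache keyed by XSD file path; not part of the original answer.
    private static readonly ConcurrentDictionary<string, Lazy<XmlSchemaSet>> Cache =
        new ConcurrentDictionary<string, Lazy<XmlSchemaSet>>();

    public static XmlSchemaSet Get(string xsdPath)
    {
        // Lazy makes sure the XSD is parsed and compiled once per path,
        // even when several threads ask for it at the same time.
        return Cache.GetOrAdd(xsdPath, path => new Lazy<XmlSchemaSet>(() => Load(path))).Value;
    }

    private static XmlSchemaSet Load(string xsdPath)
    {
        using (var stream = File.OpenRead(xsdPath))
        {
            // Passing null as the handler lets schema errors surface as exceptions.
            var schema = XmlSchema.Read(stream, null);
            var set = new XmlSchemaSet();
            set.Add(schema);
            set.Compile();
            return set;
        }
    }
}
```

The returned set can then be assigned to `XmlReaderSettings.Schemas` for each document to validate, so only the first request for a given XSD pays the parsing and compilation cost.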
OTHER TIPS
I would go for the XmlReader with XmlReaderSettings because it does not need to load the complete XML into memory. It will be more efficient for big XML files.
I think the fastest way is to use an XmlReader that validates the document as it is being read. This allows you to validate the document in only one pass: http://msdn.microsoft.com/en-us/library/hdf992b8.aspx
Use an XmlReader configured to perform validation, with the source being a TextReader.
You can manually specify the XSD the XmlReader is to use (with the XmlReaderSettings.Schemas property) if you don't want to rely on declarations in the input document.
A starting point (which just assumes XSD-instance declarations in the input document) would be:
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

var settings = new XmlReaderSettings {
    ConformanceLevel = ConformanceLevel.Document,
    ValidationType = ValidationType.Schema,
    // ReportValidationWarnings is needed for the handler to receive warnings at all.
    ValidationFlags = XmlSchemaValidationFlags.ProcessSchemaLocation |
                      XmlSchemaValidationFlags.ProcessInlineSchema |
                      XmlSchemaValidationFlags.ReportValidationWarnings,
};
int warnings = 0;
int errors = 0;
// Count validation problems instead of stopping at the first one.
settings.ValidationEventHandler += (obj, ea) => {
    if (ea.Severity == XmlSeverityType.Warning) {
        ++warnings;
    } else {
        ++errors;
    }
};
try {
    using (XmlReader xvr = XmlReader.Create(new StringReader(inputDocInString), settings)) {
        while (xvr.Read()) {
            // Nothing to do: reading to the end is what drives the validation.
        }
        if (0 != errors) {
            Console.WriteLine("\nFailed to load XML, {0} error(s) and {1} warning(s).", errors, warnings);
        } else if (0 != warnings) {
            Console.WriteLine("\nLoaded XML with {0} warning(s).", warnings);
        } else {
            Console.WriteLine("Loaded XML OK");
        }
        Console.WriteLine("\nSchemas loaded during validation:");
        // XmlReader has no Schemas property; the schema set is reached through its Settings.
        // ListSchemas is a helper that is not shown in this answer.
        ListSchemas(xvr.Settings.Schemas, 1);
    }
} catch (XmlSchemaException e) {
    Console.Error.WriteLine("Failed to read XML: {0}", e.Message);
} catch (XmlException e) {
    Console.Error.WriteLine("XML Error: {0}", e.Message);
} catch (IOException e) {
    Console.Error.WriteLine("IO error: {0}", e.Message);
}
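For the case where the XSD is supplied explicitly rather than taken from declarations in the input document, a minimal sketch might look like the following. The `XsdValidator` class, the `IsValid` method and the tiny inline schema are made up for this illustration; real code would load the XSD from the file mentioned in the question.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class XsdValidator
{
    // Returns true when the XML string is well-formed and raises no validation errors.
    public static bool IsValid(string xml, XmlSchemaSet schemas)
    {
        int errors = 0;
        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemas // explicit schema set, no schemaLocation processing
        };
        settings.ValidationEventHandler += (sender, e) =>
        {
            if (e.Severity == XmlSeverityType.Error) errors++;
        };

        try
        {
            using (var reader = XmlReader.Create(new StringReader(xml), settings))
            {
                // Single forward pass; no DOM is built.
                while (reader.Read()) { }
            }
        }
        catch (XmlException)
        {
            // Not well-formed XML.
            return false;
        }
        return errors == 0;
    }

    public static void Main()
    {
        // Hypothetical one-element schema, used only for the demonstration.
        const string xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='note' type='xs:string'/>
</xs:schema>";
        var schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        schemas.Compile();

        Console.WriteLine(IsValid("<note>hello</note>", schemas));
        Console.WriteLine(IsValid("<note><child/></note>", schemas));
    }
}
```

One caveat with this sketch: as far as I know, a root element that has no matching declaration in the schema set is reported only as a warning, and warnings are delivered only when XmlSchemaValidationFlags.ReportValidationWarnings is set. If such documents must be rejected, that flag should be enabled and warnings counted as failures.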