Question

I have an XML file that looks like the following, which I need to validate.

<?xml version="1.0" encoding="iso-8859-1"?>
    <MyAttributes
      Att1="00:00:00"
      Att2="00:05:00"
      Att3="00:05:00"
      Att4="foo,bar,true,true,,,0253d1f0-27d6-4d90-9d35-e396007db787"
      Att5="abc,def,false,true,,,4534234-65d6-6590-5535-da2007db787"
      ....
      ..../>

I want to validate the XML file with an XSD schema, according to the following rules.

  1. MyAttributes contains Att1, Att2 and Att3.
  2. The values of Att1, Att2 and Att3 are of the type TimeSpan.
  3. All the other attributes in MyAttributes are in CSV format with 7 columns:
    col1 and col2 must be non-empty strings
    col3 and col4 must be booleans
    col5 and col6 are strings and can be empty
    col7 must be of type GUID

Is there a way I can validate this with some kind of regex assertion using XSD 1.1?

Solution

The xs:time type will validate the TimeSpan fields, as long as they stay in the hh:mm:ss form shown. For the other fields, you can restrict the xs:string type with a regular-expression pattern. This XSD will validate the example XML you posted:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
    <xs:simpleType name="CsvType">
        <xs:restriction base="xs:string">
            <xs:pattern value="\w+,\w+,(true|false),(true|false),\w*,\w*,[A-Fa-f0-9]{7,8}(-[A-Fa-f0-9]{4}){3}-[A-Fa-f0-9]{11,12}"></xs:pattern>
        </xs:restriction>
    </xs:simpleType>
    <xs:element name="MyAttributes">
        <xs:complexType>
            <xs:attribute name="Att1" type="xs:time" />
            <xs:attribute name="Att2" type="xs:time" />
            <xs:attribute name="Att3" type="xs:time" />
            <xs:attribute name="Att4" type="CsvType" />
            <xs:attribute name="Att5" type="CsvType" />
        </xs:complexType>
    </xs:element>
</xs:schema>

You don't really need XSD 1.1 assertions, unless you want to validate contents of one attribute in relation to the contents of the other.

OTHER TIPS

This regex validates your TimeSpan lines:

"(\d\d):(60|([0-5][0-9])):(60|([0-5][0-9]))"

If it matches, the line is valid. I got the regex from the first answer in this question.
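
If you would rather use this pattern than xs:time inside the schema, it can be wrapped in a simple type. This is only a minimal sketch: the name TimeSpanType is made up for illustration, and the extra capture groups are dropped because XSD patterns have no back-references and always match the whole value.

```xml
<xs:simpleType name="TimeSpanType">
    <xs:restriction base="xs:string">
        <!-- hh:mm:ss; same pattern as above, no ^ or $ needed in xs:pattern -->
        <xs:pattern value="\d\d:(60|[0-5][0-9]):(60|[0-5][0-9])"/>
    </xs:restriction>
</xs:simpleType>
```

Att1, Att2 and Att3 would then be declared with type="TimeSpanType". Unlike xs:time, this rejects fractional seconds and time-zone suffixes such as 00:05:00Z, but it accepts hour values above 24.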

And for your GUID lines, if this matches the line, then it's valid:

"(?:\w+,){2}(?:(?:true|false),){2}(?:\w*,){2}(?:[0-9a-fA-F]{7,8}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{11,12})"

Although the first GUID in your demo-input line matches the regex from the first answer in this question, the second one does not, because it has a different number of characters in certain elements. I changed it so it matches both.

You can use xs:anyAttribute to allow any attribute at all, but then you can't control the name or type of the attribute. You can only define the type for attributes that are explicitly named in the schema. As you suggest, to handle the general case you will need an XSD 1.1 assertion. This could be of the form:

<xs:assert test="every $a in @* satisfies (
        (name($a) = ('Att1', 'Att2', 'Att3') and $a castable as xs:time) or
        (matches(name($a), '^Att\d+$') and matches($a, some-regex)))"/>

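where some-regex is the regular expression others have supplied, written as a quoted string literal and anchored with ^ at the start and $ at the end so that it matches the whole string and not just a substring. Unlike xs:pattern, the matches() function does not anchor the pattern implicitly, so the attribute-name pattern should be anchored in the same way.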
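
Put together, a complete schema could look roughly like the sketch below. It assumes an XSD 1.1 processor, reuses the CSV pattern from the first answer as the value regex, and picks processContents="skip" as one possible setting. Treat it as a starting point, not a tested schema.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="MyAttributes">
        <xs:complexType>
            <!-- Accept attributes with any name; the assertion below does the checking -->
            <xs:anyAttribute processContents="skip"/>
            <!-- XSD 1.1 only: every attribute is either a time or a 7-column CSV value -->
            <xs:assert test="every $a in @* satisfies (
                (name($a) = ('Att1', 'Att2', 'Att3') and $a castable as xs:time) or
                (matches(name($a), '^Att\d+$') and matches($a, '^\w+,\w+,(true|false),(true|false),\w*,\w*,[A-Fa-f0-9]{7,8}(-[A-Fa-f0-9]{4}){3}-[A-Fa-f0-9]{11,12}$')))"/>
        </xs:complexType>
    </xs:element>
</xs:schema>
```

Note that this does not make Att1, Att2 and Att3 required, and an Att1 holding a CSV value would still pass through the second branch. Declaring those three explicitly with type="xs:time" and use="required" alongside xs:anyAttribute would close both gaps.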

Licensed under: CC-BY-SA with attribution