How do I create a reusable “US State” type in an XML schema?

https://stackoverflow.com/questions/227434

03-07-2019

Question

I have an XML schema that includes multiple addresses:

<xs:element name="personal_address" maxOccurs="1">
  <!-- address fields go here -->
</xs:element>
<xs:element name="business_address" maxOccurs="1">
  <!-- address fields go here -->
</xs:element>

Within each address element, I include a "US State" enumeration:

<xs:simpleType name="state">
    <xs:restriction base="xs:string">
        <xs:enumeration value="AL" />
        <xs:enumeration value="AK" />
        <xs:enumeration value="AS" />
        ....
        <xs:enumeration value="WY" />
    </xs:restriction>
</xs:simpleType>

How do I go about writing the "US State" enumeration once and re-using it in each of my address elements? I apologize in advance if this is a n00b question -- I've never written an XSD before.

My initial stab at it is the following:

<xs:element name="business_address" maxOccurs="1">
  <!-- address fields go here -->
  <xs:element name="business_address_state" type="state" maxOccurs="1"></xs:element>
</xs:element>

Solution

You are on the right track. I think the problem has more to do with XML namespaces. Try the following:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.org/foo"
    xmlns:tns="http://www.example.org/foo"
    elementFormDefault="qualified">
    <xs:element name="business_address">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="business_address_state"
                    type="tns:state" maxOccurs="1" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:simpleType name="state">
        <xs:restriction base="xs:string">
            <xs:enumeration value="AL" />
            <xs:enumeration value="AK" />
            <xs:enumeration value="AS" />
            <xs:enumeration value="WY" />
        </xs:restriction>
    </xs:simpleType>
</xs:schema>

Note that the type is tns:state, not just state.

And then this is how you would use it:

<?xml version="1.0" encoding="UTF-8"?>
<business_address xmlns="http://www.example.org/foo"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.example.org/foo foo.xsd ">
    <business_address_state>AL</business_address_state>
</business_address>

Notice that this XML uses a default namespace that is the same as the targetNamespace of the XSD.
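To tie this back to the original question, the same global type can be referenced from more than one address element. The following is only a sketch under some assumptions: the contact wrapper element and the personal_address_state name are made up for illustration, and the enumeration is cut down to the few values shown above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.org/foo"
    xmlns:tns="http://www.example.org/foo"
    elementFormDefault="qualified">
    <!-- Hypothetical wrapper element that holds both addresses -->
    <xs:element name="contact">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="personal_address">
                    <xs:complexType>
                        <xs:sequence>
                            <!-- First use of the shared global type -->
                            <xs:element name="personal_address_state"
                                type="tns:state" />
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
                <xs:element name="business_address">
                    <xs:complexType>
                        <xs:sequence>
                            <!-- Second use of the same type, no copy of the enumeration -->
                            <xs:element name="business_address_state"
                                type="tns:state" />
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <!-- Written once, as a direct child of xs:schema; list shortened for the sketch -->
    <xs:simpleType name="state">
        <xs:restriction base="xs:string">
            <xs:enumeration value="AL" />
            <xs:enumeration value="AK" />
            <xs:enumeration value="AS" />
            <xs:enumeration value="WY" />
        </xs:restriction>
    </xs:simpleType>
</xs:schema>
```

Both state elements are validated against the same enumeration, so adding or removing a code only needs to happen in one place.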

OTHER TIPS

While namespaces help keep schemas organized and prevent conflicts, it's not the namespace above that allows for the reuse; it's the placement of the type as an immediate child of the <xs:schema> root that makes it a global type. (A global type can be referenced from anywhere in the schema: without a prefix if the schema declares the target namespace as its default namespace, and with the tns: qualifier wherever the tns prefix is in scope.)
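As a sketch of the unprefixed case: an unprefixed type reference is resolved against the default namespace, so it should work when the schema declares its target namespace as the default namespace. The example below assumes the same namespace URI as the answer above and shortens the enumeration to two values.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.org/foo"
    xmlns="http://www.example.org/foo"
    elementFormDefault="qualified">
    <!-- No prefix needed: "state" resolves through the default namespace -->
    <xs:element name="business_address_state" type="state" />
    <!-- Global type, declared as a direct child of xs:schema -->
    <xs:simpleType name="state">
        <xs:restriction base="xs:string">
            <xs:enumeration value="AL" />
            <xs:enumeration value="WY" />
        </xs:restriction>
    </xs:simpleType>
</xs:schema>
```

If the xmlns default declaration is dropped while targetNamespace stays, the bare name state no longer points at the global type, and a prefix such as tns: is needed again.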

I prefer to construct my schemas following the "Garden of Eden" approach, which maximizes reuse of both elements and types (and can also facilitate external logical referencing of the carefully made unique type/element from, say, a data dictionary stored in a database).

Note that while the "Garden of Eden" schema pattern offers the maximum reuse, it also involves the most work. At the bottom of this post, I've provided links to the other patterns covered in the blog series.

• The Garden of Eden approach http://blogs.msdn.com/skaufman/archive/2005/05/10/416269.aspx

Uses a modular approach by defining all elements globally and, like the Venetian Blind approach, declaring all type definitions globally. Each element is globally defined as an immediate child of the <xs:schema> node, and its type attribute can be set to one of the named complex types.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="TargetNamespace" xmlns:TN="TargetNamespace"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified" attributeFormDefault="unqualified">
    <!-- Global components live in the target namespace, so references use the TN: prefix -->
    <xs:element name="BookInformation" type="TN:BookInformationType"/>
    <xs:complexType name="BookInformationType">
        <xs:sequence>
            <xs:element ref="TN:Title"/>
            <xs:element ref="TN:ISBN"/>
            <xs:element ref="TN:Publisher"/>
            <xs:element ref="TN:PeopleInvolved" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name="PeopleInvolvedType">
        <xs:sequence>
            <xs:element name="Author"/>
        </xs:sequence>
    </xs:complexType>
    <xs:element name="Title"/>
    <xs:element name="ISBN"/>
    <xs:element name="Publisher"/>
    <xs:element name="PeopleInvolved" type="TN:PeopleInvolvedType"/>
</xs:schema>

The advantage of this approach is that the schemas are reusable. Since both the elements and types are defined globally, both are available for reuse. This approach offers the maximum amount of reusable content. The disadvantage is that the schema is verbose. This would be an appropriate design when you are creating general libraries in which you can afford to make no assumptions about the scope of the schema elements and types and their use in other schemas, particularly in reference to extensibility and modularity.

Since every distinct type and element has a single global definition, these canonical particles/components can be related one-to-one to identifiers in a database. And while it may at first glance seem like a tiresome ongoing manual task to maintain the associations between the textual XSD particles/components and the database, SQL Server 2005 can in fact generate canonical schema component identifiers via the statement

CREATE XML SCHEMA COLLECTION

http://technet.microsoft.com/en-us/library/ms179457.aspx

Conversely, to construct a schema from the canonical particles, SQL Server 2005 provides the

SELECT xml_schema_namespace function

http://technet.microsoft.com/en-us/library/ms191170.aspx

ca·non·i·cal Related to Mathematics. (of an equation, coordinate, etc.) "in simplest or standard form" http://dictionary.reference.com/browse/canonical

Other schema patterns, easier to construct but less reusable (more "denormalized" or redundant), include:

• The Russian Doll approach http://blogs.msdn.com/skaufman/archive/2005/04/21/410486.aspx

The schema has one single global element - the root element. All other elements and types are nested progressively deeper giving it the name due to each type fitting into the one above it. Since the elements in this design are declared locally they will not be reusable through the import or include statements.
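For comparison, here is a rough sketch of the same book schema written in Russian Doll form. It reuses the element names from the Garden of Eden example above; the xs:string types are an assumption added for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="TargetNamespace"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified" attributeFormDefault="unqualified">
    <!-- The only global declaration: the root element -->
    <xs:element name="BookInformation">
        <xs:complexType>
            <xs:sequence>
                <!-- Everything below is local and nested, so none of it can be referenced elsewhere -->
                <xs:element name="Title" type="xs:string"/>
                <xs:element name="ISBN" type="xs:string"/>
                <xs:element name="Publisher" type="xs:string"/>
                <xs:element name="PeopleInvolved" maxOccurs="unbounded">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="Author" type="xs:string"/>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
```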

• The Salami Slice approach http://blogs.msdn.com/skaufman/archive/2005/04/25/411809.aspx

All elements are defined globally, but the type definitions are defined locally. This way other schemas may reuse the elements. With this approach, a global element with its locally defined type provides a complete description of the element's content. This information 'slice' is declared individually and then aggregated back together, and may also be pieced together to construct other schemas.
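A rough sketch of the same book schema in Salami Slice form might look like the following. The element names come from the Garden of Eden example above; the xs:string types and the global Author element are assumptions added for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="TargetNamespace" xmlns:TN="TargetNamespace"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified" attributeFormDefault="unqualified">
    <!-- Every element is global; complex types stay anonymous and local -->
    <xs:element name="BookInformation">
        <xs:complexType>
            <xs:sequence>
                <!-- Each slice is pulled in by reference to a global element -->
                <xs:element ref="TN:Title"/>
                <xs:element ref="TN:ISBN"/>
                <xs:element ref="TN:Publisher"/>
                <xs:element ref="TN:PeopleInvolved" maxOccurs="unbounded"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:element name="Title" type="xs:string"/>
    <xs:element name="ISBN" type="xs:string"/>
    <xs:element name="Publisher" type="xs:string"/>
    <xs:element name="PeopleInvolved">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="TN:Author"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:element name="Author" type="xs:string"/>
</xs:schema>
```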

• The Venetian Blind approach http://blogs.msdn.com/skaufman/archive/2005/04/29/413491.aspx

Similar to the Russian Doll approach in that they both use a single global element. The Venetian Blind approach describes a modular approach by naming and defining all type definitions globally (as opposed to the Salami Slice approach which declares elements globally and types locally). Each globally defined type describes an individual "slat" and can be reused by other components. In addition, all the locally declared elements can be namespace qualified or namespace unqualified (the slats can be "opened" or "closed") depending on the elementFormDefault attribute setting at the top of the schema.
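As a rough sketch, the same book schema in Venetian Blind form could look like this. The element and type names follow the Garden of Eden example above; the xs:string types are an assumption added for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="TargetNamespace" xmlns:TN="TargetNamespace"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified" attributeFormDefault="unqualified">
    <!-- Single global element; its structure comes from a named global type -->
    <xs:element name="BookInformation" type="TN:BookInformationType"/>
    <!-- Named global types are the reusable "slats" -->
    <xs:complexType name="BookInformationType">
        <xs:sequence>
            <!-- Child elements are local, but they point at global types -->
            <xs:element name="Title" type="xs:string"/>
            <xs:element name="ISBN" type="xs:string"/>
            <xs:element name="Publisher" type="xs:string"/>
            <xs:element name="PeopleInvolved" type="TN:PeopleInvolvedType"
                maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name="PeopleInvolvedType">
        <xs:sequence>
            <xs:element name="Author" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>
</xs:schema>
```

Switching elementFormDefault between "qualified" and "unqualified" changes whether the local elements (Title, ISBN, and so on) must be namespace qualified in instance documents.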

Licensed under: CC-BY-SA with attribution
