Automatically generating XML schemas using XSLT

Hello,
Following this tutorial (http://www.liquid-technologies.com/Tutorials/XmlSchemas/XsdTutorial_04.aspx -- don't worry, I use Oxygen ;-) ) I've refactored my 733-line schema into 16 separate files, or sub-schemas, each with its own namespace. Now the top-level schema is just 77 lines. The plan is to use these sub-schemas to build other top-level schemas.
The problem is that most top-level schemas are quite similar and differ only in a few low-level details. For example, while one top-level schema supports all PaymentMethodType values (see tutorial), another top-level schema may support only VISA and MasterCard. Currently my method of creating top-level schemas involves considerable duplication. For example, creating a top-level schema in which only VISA and MasterCard are supported would involve duplicating Main.xsd and OrderType.xsd, customizing CommonTypes.xsd, and reusing CustomerTypes.xsd. (As my actual schema is a lot longer, a lot more duplication is involved.)
I find this duplication unacceptable, primarily because it introduces a maintenance challenge, i.e. I would have to maintain any number of identical sub-schemas with different names.
What I would like to know is whether there is a method of automatically generating a schema via a config file of some sort (XSLT perhaps?), in order to avoid duplicating sub-schemas.
Also, is it good practice in this case for all the sub-schemas to declare the same target namespace (just like the XML Schema namespace xs), but have custom sub-schemas declare a separate namespace?
- Olumide
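One way the XSLT idea could look, as a minimal sketch rather than a tested recipe: an identity transform that copies a schema file such as CommonTypes.xsd unchanged, except that it keeps only the PaymentMethodType enumeration values named in a parameter. The parameter name "allowed" and its pipe-delimited format are assumptions made for illustration, and the sketch assumes PaymentMethodType is a simple type restricted by xs:enumeration facets.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: derive a restricted copy of a schema from one master file. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical "config": pipe-delimited list of values to keep. -->
  <xsl:param name="allowed" select="'|VISA|MasterCard|'"/>

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Copy an enumeration of PaymentMethodType only if it is listed. -->
  <xsl:template match="xs:simpleType[@name='PaymentMethodType']/xs:restriction/xs:enumeration">
    <xsl:if test="contains($allowed, concat('|', @value, '|'))">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The output is still a separate schema file that the top-level schema has to point at, so this approach replaces hand-maintained copies with generated ones rather than removing the extra files.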

I have the same kind of issue in the SPFE Open Toolkit. Schemas are highly modular, and sometimes the details you want in a particular low level schema depend on what you are trying to achieve in the higher level schema. To accomplish this without duplication, I use groups. Essentially, the trick is this:
1. Place reusable or variable elements in groups in the lower-level schemas.
2. In the high level schemas, define high-level groups containing whatever groups (defined in the low level schemas) that you want used throughout your resulting schema.
3. In the low level schemas, use the high-level groups to encapsulate variations that depend on which high level schema the low level schema is being included in.
In other words, use groups in the high-level schemas to determine what features are turned on in the lower-level schemas.
That explanation may be a bit hard to parse, but you can see an example of the technique at work in the SPFE Open Toolkit on GitHub: https://github.com/mbakeranalecta/spfe-open-toolkit/blob/master/spfe-docs/schemas/authoring/roots/element-descriptions.xsd
Mark
-----Original Message----- From: oxygen-user-bounces@oxygenxml.com [mailto:oxygen-user-bounces@oxygenxml.com] On Behalf Of Olumide Sent: November 21, 2012 1:27 PM To: oxygen-user@oxygenxml.com Subject: [oXygen-user] Automatically generating XML schemas using XSLT
_______________________________________________ oXygen-user mailing list oXygen-user@oxygenxml.com http://www.oxygenxml.com/mailman/listinfo/oxygen-user

On 21/11/2012 19:12, Mark Baker wrote:
I have the same kind of issue in the SPFE Open Toolkit. Schemas are highly modular, and sometimes the details you want in a particular low level schema depend on what you are trying to achieve in the higher level schema. To accomplish this without duplication, I use groups. Essentially, the trick is this:
1. Place reusable or variable elements in groups in the lower-level schemas.
2. In the high level schemas, define high-level groups containing whatever groups (defined in the low level schemas) that you want used throughout your resulting schema.
3. In the low level schemas, use the high-level groups to encapsulate variations that depend on which high level schema the low level schema is being included in.
In other words, use groups in the high-level schemas to determine what features are turned on in the lower-level schemas.
It ... sort ... of makes sense. I'm going to have to set aside the book on XSLT and take a really close look. Or could you kindly highlight where in the schema you perform each one of these steps?
BTW, you don't seem to have split your schema into sub-schemas.
Inspired by your approach, it would be nice to have the varying elements share a common name but live in unique namespaces. For example, the general version of the element Foo would be declared in the common namespace com and thus referenced as com:Foo, while the customized version of Foo would be declared in another namespace cus and thus referenced as cus:Foo. The goal, of course, would be to find a way of identifying the appropriate Foo namespace in the top-level schema. Note that Foo may not appear in the top-level schema, and is often deeply nested in other elements contained in the top-level schema.
- Olumide

-----Original Message----- From: Olumide [mailto:videohead@mail.com] Sent: November 21, 2012 2:45 PM To: Mark Baker; oxygen-user@oxygenxml.com Subject: Re: [oXygen-user] Automatically generating XML schemas using XSLT
Olumide,
Briefly (I can supply more detail later if you want): the schemas are split into sub-schemas. The example I pointed to is just one of the sub-schemas. The way I have it organized is this:
* The top level schema is a wrapper that consists only of include statements. Since I am using a chameleon schema approach, this top level schema also establishes the namespace for the whole schema, but this would not apply to you if you are using separate namespaces for each component. The top level schema in this case has the same name but is one level up: https://github.com/mbakeranalecta/spfe-open-toolkit/blob/master/spfe-docs/schemas/authoring/element-descriptions.xsd
* I have a set of schema modules in a modules directory (or, in this case, more than one module directory). The wrapper schema includes whichever modules are required for the schema being defined.
* I have a set of root schemas in a roots directory. The example I pointed to is one of these. The root schemas define the root element and unique structures of a particular schema type, and include the definitions of the top-level groups. Those top-level groups are defined by including groups defined in the modular schemas.
For example, line 91 defines the p-content group:
    <xs:group name="p-content">
      <xs:choice>
        <xs:group ref="text-decoration"/>
        <xs:group ref="mentions-general"/>
        <xs:group ref="mentions-xml"/>
        <xs:group ref="mentions-spfe-build"/>
        <xs:group ref="resources"/>
        <xs:group ref="substitutions"/>
      </xs:choice>
    </xs:group>
The p-content group defines the elements that are permitted in a p element for this schema. Each one of the referenced groups defines elements that can occur inside a paragraph. Each of these groups is defined in one of the module schemas. Any time any of the module schemas declares a paragraph, it declares its content to be the p-content group. That way, the elements that are allowed in a paragraph are not defined in the module schema, but in the root schema.
For example, the paragraphs schema module defines the paragraph-type like this:
    <xs:complexType name="paragraph-type" mixed="true">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:group ref="p-content"/>
      </xs:choice>
      <xs:attributeGroup ref="conditions"/>
    </xs:complexType>
The mentions-xml group is defined in https://github.com/mbakeranalecta/spfe-open-toolkit/blob/master/spfe-ot/plugins/eppo-simple/schemas/authoring/modules/mentions/mentions-xml.xsd as:
    <xs:group name="mentions-xml">
      <xs:choice>
        <xs:element name="xml-element-name">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute name="xpath" use="optional"/>
                <xs:attribute name="namespace-uri" type="xs:anyURI" use="optional"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <xs:element name="xml-attribute-name">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute name="xpath" use="optional"/>
                <xs:attribute name="namespace-uri" type="xs:anyURI" use="optional"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <xs:element name="xml-namespace-uri"/>
        <xs:element name="xpath"/>
      </xs:choice>
    </xs:group>
This means that in any paragraph for the particular schema I am defining, no matter which module it was defined in, the tags <xml-element-name>, <xml-attribute-name>, <xml-namespace-uri>, and <xpath> are allowed to appear. If the same module that defined the paragraph were included in another schema that did not include the mentions-xml group in its definition of the p-content group, then those elements would not be allowed inside that paragraph.
Essentially, then, I am using groups as a transclusion mechanism to pull definitions from one schema file to another, which eliminates the need to repeat common structures from one schema module to another.
Hope this helps,
Mark
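To make the variation Mark describes concrete, here is a sketch of how a second root schema might define p-content without the XML mentions. The group names are the ones from his example; the existence of such a second root schema file is an assumption for illustration, and the fragment only resolves when the module schemas that define the referenced groups are included by the same wrapper.

```xml
<!-- Fragment of a hypothetical second root schema (chameleon style,
     so no targetNamespace is declared here). -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Same group name, different membership: mentions-xml and
       mentions-spfe-build are left out. -->
  <xs:group name="p-content">
    <xs:choice>
      <xs:group ref="text-decoration"/>
      <xs:group ref="mentions-general"/>
      <xs:group ref="resources"/>
      <xs:group ref="substitutions"/>
    </xs:choice>
  </xs:group>
</xs:schema>
```

With this definition in place, the unchanged paragraphs module would produce paragraphs that no longer accept the XML mention elements.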

On 21/11/2012 20:35, Mark Baker wrote:
Briefly (I can supply more detail later if you want).
Thanks Mark. Unfortunately I'm still getting lost in the details. I have however googled chameleon schema (never heard of it) and found this article http://www.xfront.com/ZeroOneOrManyNamespaces.html, which also introduced the <redefine> element. Now if only there was a way to redefine an inner (nested) element while importing a top level element. - Olumide

That's precisely why I don't use <redefine>. Using the group method does allow you to redefine the nested element while importing a top-level element.
Imported schema:
    element "foo"
      complex-content
        group ref="foo-content"
Importing schema A:
    include "imported.xsd"
    element "bar"
      complex-content
        sequence
          element ref="foo"
    group "foo-content"
      sequence
        element "baz"
Importing schema B:
    include "imported.xsd"
    element "gruznatz"
      complex-content
        sequence
          element ref="foo"
    group "foo-content"
      sequence
        element "bonk"
Now the valid content model for documents using schema A is /bar/foo/baz, and the valid content model for documents using schema B is /gruznatz/foo/bonk.
Thus element foo is reused in both schemas, but with a different content model. (Obviously this does not make sense if you change the entire content model, as this example does, but it makes perfect sense if you want to redefine part of the content model of foo depending on where you import it.)
Mark
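Spelled out as XSD, the outline for the imported schema and the importing schema with root element bar might look like the following sketch. It assumes the chameleon style Mark mentions (no targetNamespace on either file), picks xs:string as the type of baz purely for illustration, and uses the hypothetical file name bar.xsd. The imported file is not expected to compile on its own, because foo-content is defined only by whichever schema includes it.

```xml
<!-- imported.xsd: declares foo but leaves its content to the includer -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="foo">
    <xs:complexType>
      <xs:group ref="foo-content"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The including schema then supplies the group:

```xml
<!-- bar.xsd (hypothetical name): includes foo and defines foo-content -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="imported.xsd"/>
  <xs:element name="bar">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="foo"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- The content model of foo for this particular schema. -->
  <xs:group name="foo-content">
    <xs:sequence>
      <xs:element name="baz" type="xs:string"/>
    </xs:sequence>
  </xs:group>
</xs:schema>
```

The second importing schema would differ only in its root element name and in the element it places inside foo-content.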
-----Original Message----- From: Olumide [mailto:videohead@mail.com] Sent: November 22, 2012 10:30 AM To: Mark Baker; oxygen-user@oxygenxml.com Subject: Re: [oXygen-user] Automatically generating XML schemas using XSLT

On 22/11/2012 16:44, Mark Baker wrote:
That's precisely why I don't use <redefine>. Using the group method does allow you to redefine the nested element while importing a top-level element.
Imported schema:
    element "foo"
      complex-content
        group ref="foo-content"
Importing schema A:
    include "imported.xsd"
    element "bar"
      complex-content
        sequence
          element ref="foo"
    group "foo-content"
      sequence
        element "baz"
...
Now the valid content model for documents using schema A is /bar/foo/baz, and the valid content model for documents using schema B is /gruznatz/foo/bonk.
Thus element foo is reused in both schemas, but with a different content model.
Thanks Mark. Just a few matters that I'd like to clarify. Does this pattern apply when:
1. All schemas share a common top-level schema, which defines a top-level element and in turn imports lower-level schemas that define lower-level elements, etc.
2. The element that I wish to "override" is not directly imported by the top-level schema.
Regards
- Olumide

The following question is identical to mine: http://lists.w3.org/Archives/Public/xmlschema-dev/2010Sep/0014.html
Interestingly, someone recommends the XSD 1.1 facility xs:override.
- Olumide
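For reference, a minimal sketch of what xs:override could look like for the payment example, assuming an XSD 1.1 processor. The namespace URI and the file name CommonTypes-CardsOnly.xsd are hypothetical, and the sketch assumes that CommonTypes.xsd declares the same target namespace and defines PaymentMethodType as a string enumeration. The override replaces that definition for this schema only and leaves CommonTypes.xsd itself untouched.

```xml
<!-- CommonTypes-CardsOnly.xsd (hypothetical name); requires XSD 1.1 -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/CommonTypes"
           xmlns="http://example.com/CommonTypes">
  <!-- Pull in CommonTypes.xsd, replacing only the component named below. -->
  <xs:override schemaLocation="CommonTypes.xsd">
    <xs:simpleType name="PaymentMethodType">
      <xs:restriction base="xs:string">
        <xs:enumeration value="VISA"/>
        <xs:enumeration value="MasterCard"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:override>
</xs:schema>
```

Schemas that import this namespace would still have to resolve it to this file rather than to CommonTypes.xsd. How that plays out with nested imports may depend on the processor, so it is worth checking before relying on it.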
Participants (2): Mark Baker, Olumide