wrong conjunction for multiple pattern facets?

[This may be a FAQ, in which case I apologize in advance. I have searched the list archives, but not the forums, for postings related to this issue.]

It seems that oXygen's XML validator does not know how to properly apply multiple 'pattern' facets when validating against a RelaxNG grammar.

To my knowledge, multiple occurrences of a 'pattern' facet are allowed on xsd: datatypes; the content in the instance should match all of the patterns specified in order to be considered valid.

    When several <param> elements are included, all the constraints must be met (in other words, the result is a logical "and" of all the conditions). Also note that the same facet can't be repeated twice except for the facet named 'pattern'.
    -- van der Vlist, Eric. _RELAX_NG_, Ch 8 sect. "Facets", p. 93

However, oXygen's internal validator (xerces, right?) seems to use "or" instead of "and". Here is a test.

--------- begin t.rnc ---------
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"
start =
  element test {
    element alpha {
      xsd:token {
        pattern = "a{1,9}B{3}c"
        pattern = "a{3}B{1,9}c"
        maxLength = "25"
      }
    }+
  }
--------- end t.rnc ---------

--------- begin t.xml ---------
<?xml version="1.0" encoding="UTF-8"?>
<?oxygen RNGSchema="file:/private/tmp/t.rnc" type="compact"?>
<test>
  <alpha>aaaBBBc</alpha>
  <alpha>aBBBc</alpha>
  <alpha>aaaBc</alpha>
</test>
--------- end t.xml ---------

I expect the above file to be invalid: line 6 fails to match the first xsd pattern in the schema and line 5 fails to match the second. I expect line 4 to be valid.

I ran xmllint, jing, and rnv on the command-line, and they all flag lines 5 & 6 as invalid (i.e., they agree with me.) I don't know how to run xerces from the command-line. (Feel free to tell me ... :-)

But oXygen says t.xml is valid (both the "live" validation that occurs while I type and the "static" validation that occurs with CMD-shift-V say it is valid).
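The two readings of the test case can be compared with a small Python sketch. It is only an approximation: it assumes Python's `re.fullmatch` behaves like an XSD pattern for these two simple expressions (XSD patterns are implicitly anchored at both ends), it ignores the `maxLength` facet, and the helper names `matches_all` and `matches_any` are made up for illustration.

```python
import re

# The two pattern facets from t.rnc. XSD regular expressions are implicitly
# anchored at both ends, so re.fullmatch is the closest Python analogue.
PATTERNS = ["a{1,9}B{3}c", "a{3}B{1,9}c"]

# Content of the <alpha> elements on lines 4, 5 and 6 of t.xml.
VALUES = ["aaaBBBc", "aBBBc", "aaaBc"]


def matches_all(value, patterns=PATTERNS):
    """'and' reading: the value must match every pattern."""
    return all(re.fullmatch(p, value) is not None for p in patterns)


def matches_any(value, patterns=PATTERNS):
    """'or' reading: the value only needs to match one pattern."""
    return any(re.fullmatch(p, value) is not None for p in patterns)


if __name__ == "__main__":
    for v in VALUES:
        print(v, "and:", matches_all(v), "or:", matches_any(v))
```

Under the "and" reading only aaaBBBc passes, which is what xmllint, jing, and rnv report; under the "or" reading all three values pass, which is what oXygen reports.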

Hi Syd,

In XML Schema the matching should be done against any of the pattern facets specified for a specific type; see
http://www.w3.org/TR/xmlschema-2/#src-multiple-patterns

***
·pattern· facets specified on the same step in a type derivation are ORed together
***

oXygen 8 uses oNVDL for Relax NG validation. oNVDL modifies Jing to add NVDL support along with a couple of fixes, and one of these fixes is to handle the pattern facet properly. That is why you get different behavior if you use Jing.
http://www.oxygenxml.com/onvdl

***
This version of oNVDL is based on Jing version 20030619 and changes Jing as follows:
[...]
Handles correctly the XML Schema pattern facet by checking against any of the patterns.
***

Best Regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
_______________________________________________ oXygen-user mailing list oXygen-user@oxygenxml.com http://www.oxygenxml.com/mailman/listinfo/oxygen-user

Hi George; thanks for the quick and insightful reply. I just don't get it, though. Before I explain, let me say that I realize that the rest of this discussion may be out of scope for this list -- this is a problem that is not oXygen's fault, and exists whether or not oXygen does. I get to this problem because I use oXygen, but feel free to tell me it's time to move this to rng-users or some such.

I am close to the furthest thing there is from an expert on the W3C XML Schema language, and I had never heard of, let alone read, 4.3.4.3, before you pointed it out. But upon reading it now, I have to admit I don't quite understand
a) what it means
b) what it's got to do with RelaxNG validation

The text of 4.3.4.3 says something that, on the face of it, seems silly.

    If multiple <pattern> element information items appear as [children] of a <simpleType>, the [value]s should be combined as if they appeared in a single regular expression as separate branches.

First, I am under the (perhaps erroneous) impression that a <pattern> element can not be the child of a <simpleType> element.

Second, the idea seems silly. If I wanted two regular expressions R1 and R2 to appear in a single regular expression as separate branches, I could have just written "R1|R2", no? So my gut instinct is that this rule isn't helpful.

The note attached to 4.3.4.3 says

    ... pattern facets specified on the same step in a type derivation are ORed together, while pattern facets specified on different steps of a type derivation are ANDed together.

but I have yet to figure out what a "step" is. Sigh. I suppose I should probably just go buy Eric van der Vlist's book on XSD and read it ... maybe someday.

But more importantly, whatever XSD says about multiple patterns, RelaxNG seems pretty clear. The following is from section 2 of "Guidelines for using W3C XML Schema Datatypes with RELAX NG"[1]

    If the 'pattern' parameter is specified more than once for a single 'data' element, then a string matches the 'data' element only if it matches all of the patterns.

Seems pretty clear to me.

Note
----
[1] Which I found at http://relaxng.org/xsd-20010907.html; it is linked to from the main RelaxNG home page.
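One possible reading of the note quoted from 4.3.4.3 can be sketched in Python, on the assumption that a "step" means a single restriction in a chain of type derivations: patterns listed together in one restriction are ORed, and each further restriction derived from the result adds another ANDed group. The function `valid` and the list-of-lists layout are hypothetical names for illustration, not part of any validator's API, and `re.fullmatch` only approximates XSD pattern matching.

```python
import re


def valid(value, steps):
    """Check a value against a chain of derivation steps.

    `steps` is a list of steps; each step is a list of patterns.
    Patterns within one step are ORed; the steps themselves are ANDed.
    """
    return all(
        any(re.fullmatch(p, value) is not None for p in step)
        for step in steps
    )


P1 = "a{1,9}B{3}c"
P2 = "a{3}B{1,9}c"

# Both facets given in the same step: behaves like the single regex "P1|P2".
one_step = [[P1, P2]]

# P2 applied in a second step derived from the type restricted by P1.
two_steps = [[P1], [P2]]

if __name__ == "__main__":
    for v in ["aaaBBBc", "aBBBc", "aaaBc"]:
        print(v, "one step:", valid(v, one_step), "two steps:", valid(v, two_steps))
```

On this reading, the RELAX NG rule quoted above (a string must match all of the patterns) corresponds to the `two_steps` layout, even though both parameters sit on the same 'data' element.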

Hi Syd,

Thanks for following up on this issue. Here is what happened: one of our users reported this as a bug. The report was based on the fact that if you convert the Relax NG schema to XML Schema through Trang, you get multiple pattern facets in the XML Schema which, per the spec text I quoted, will be ORed. The resulting XML Schema will therefore accept a value that matches any of the pattern constraints, and I thought Relax NG just delegates to XML Schema for these facet constraints, so it should behave similarly.

I can very easily roll back those changes in oNVDL - in fact I will provide you a link to an updated version with these changes rolled back soon.

Best Regards,
George

Hello Syd,

Here is an oNVDL distribution with the pattern changes rolled back; it checks against all the pattern constraints:
http://www.oxygenxml.com/update/onvdl.jar

Replace the oXygen lib/onvdl.jar with the above jar.

For all: Happy New Year!
George

I'm moving this discussion over to the public-schemata-users list. See http://lists.w3.org/Archives/Public/public-schemata-users/.
participants (2)
- George Cristian Bina
- Syd Bauman