Schematron validation "errors" enhancement

Hi,
Currently, Schematron is regarded by oXygen as a validation technology, and it uses many of the same interfaces as XSD, RNG and DTD, including having its results reported in a validation results window.
This is fine, but additionally, a distinction is made between Schematron messages emitted by Schematron 'assert' elements and those emitted by 'report' elements; the latter are classed as warnings, not errors. They are formatted differently, with a yellow icon instead of a red one, plus the word "warning".
However, in my experience working with Schematron, more often than not this distinction does not hold. "Errors" or "warnings" or simple "alerts" or messages of any status whatever can be the results of either a Schematron 'assert' or 'report'; that is, which Schematron element is used to generate a message has nothing to do with the severity of the condition being reported. Or even whether it's good or bad: sometimes the message emitted by either an 'assert' or 'report' represents not failure but success.
I wonder if in oXygen, this distinction could be removed, and the results of all Schematron assertions (both 'assert' and 'report', i.e., "positive and negative assertions of a constraint") could be represented the same way.
If you really wanted to get fancy, the status of a Schematron message in oXygen might be configurable. Maybe the assignment of "error" or "warning" (or whatever else: blue or green icons?) could be made on the basis of a regular expression matching the message. It's pretty common practice for Schematron messages to be internally structured with their own language about errors, warnings, alerts, info, etc., with structured error codes and the like.
The same thing applies to the inline iconography, i.e., having errors from 'assert' underlined in red in the editor view, while 'report' results are colored yellow.
For Schematron developers who are deploying oXygen as a validation platform for non-experts -- who are sometimes prone to be alarmed unnecessarily by iconography -- it would be really nice to be able to control this.
If not, I think oXygen might at least be neutral on the question of which Schematron messages are at which level of severity.
What do you think?
Cheers, Wendell
======================================================================
Wendell Piez                            mailto:wapiez@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
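
As a rough illustration of the point about 'assert' and 'report', here is a minimal sketch in ISO Schematron. The element names (section, title, para), the bracketed message codes and the message wording are all made up for the example, and the ISO namespace is an assumption about which Schematron flavor is in use. The first two checks fire under exactly the same condition, so which element the author picked says nothing about severity; the third uses 'report' for a purely informational message.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <rule context="section">
      <!-- An assert emits its message when the test is false. -->
      <assert test="title">[E101] ERROR: a section must have a title.</assert>
      <!-- A report emits its message when the test is true. With the test
           negated, this is the same constraint as the assert above. -->
      <report test="not(title)">[E101] ERROR: a section must have a title.</report>
      <!-- A report carrying a message that is neither an error nor a warning. -->
      <report test="count(para) &gt; 20">[I007] INFO: this section has more than 20 paragraphs.</report>
    </rule>
  </pattern>
</schema>
```

The severity here lives only in the message text, which is the kind of internal structure a configurable, pattern-based mapping could pick up.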

Wendell,
I agree with you that oXygen's current usage is an interpretation that goes over and above what the Schematron spec actually says. I have to admit that I really like being able to distinguish between errors and warnings, so I do like the fact that oXygen has foreseen a need for this.
In addition to the approach of embedding severity information using some convention inside error messages or diagnostics, I can think of a couple of other ways to approach this.
(1) Associate error severities with different phases (<sch:phase>). I'm not sure how you pass phase requests to the Schematron processor inside oXygen (George?), but this strikes me as a really nice way. You could then run the same Schematron file against the instance multiple times, invoking one phase for warnings, another phase for errors, etc. You might associate a particular phase with a particular kind of display in oXygen up in the <?oxygen ...?> processing instruction that calls the Schematron. So say you have a Schematron schema that defines two phases, "warning-phase" and "error-phase". Then you'd associate it twice with your instance (maybe with the help of some new options in the Associate Schema dialog tab for Schematron), ending up with two PIs attached to your instance, which might look like this:
<?oxygen SCHSchema="example.sch" phase="warning-phase" display-as="warning"?>
<?oxygen SCHSchema="example.sch" phase="error-phase" display-as="error"?>
(2) For ISO Schematron, another approach might be to support "flag" attributes on assertions and rules. The ISO Schematron spec states: "The purpose of flags is to convey state or severity information to a subsequent process." So it's my reading that a distinction such as the one between warning and error is one (of many) kinds of thing that flag attributes can address. You'd have to have some way in oXygen of stating that a particular flag name was intended to be associated with errors and another with warnings.
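To make the two suggestions concrete, here is a minimal sketch of what such a schema might look like in ISO Schematron. The phase ids come from the description above; the pattern ids, the flag name, the element names (section, title, para) and the messages are hypothetical. The phase and display-as pseudo-attributes on the processing instructions are only a proposal in this message, not an existing oXygen feature, so this shows the Schematron side only.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <!-- Approach (1): one phase per severity, each activating its own patterns. -->
  <phase id="warning-phase">
    <active pattern="style-checks"/>
  </phase>
  <phase id="error-phase">
    <active pattern="structure-checks"/>
  </phase>

  <pattern id="style-checks">
    <rule context="section">
      <report test="count(para) &gt; 20">Section has more than 20 paragraphs.</report>
    </rule>
  </pattern>

  <pattern id="structure-checks">
    <rule context="section">
      <!-- Approach (2): a flag with an arbitrary name. A tool would still
           need to be told which flag names mean "error" or "warning". -->
      <assert test="title" flag="structural-error">Section has no title.</assert>
    </rule>
  </pattern>
</schema>
```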
I personally like phases better than flags for this.
Just some thoughts.
John
On Nov 25, 2009, at 2:47 PM, Wendell Piez wrote:
_______________________________________________ oXygen-user mailing list oXygen-user@oxygenxml.com http://www.oxygenxml.com/mailman/listinfo/oxygen-user

I agree completely w/ Wendell's overall idea: oXygen should give me, the Schematron author, some control over whether <report> and <assert> are treated the same or not. I wouldn't even mind if that control were shifted to the person running oXygen: i.e., a preference check box "treat Schematron <report>s as warnings (instead of errors)".
As far as John's two recommended methods, I can't comment authoritatively, as I've never really used phases or flags. But my reading of DSDL is that a flag= would not be correct.
"A boolean variable with initial value false. A flag is implicitly declared by an assertion or rule having a flag attribute with that name. The value of a flag becomes true when an assertion with that flag fails or a rule with that flag fires. The purpose of flags is to convey state or severity information to a subsequent process. An implementation is not required to make use of this attribute."
-- ISO/IEC FDIS 19757-3 5.5.5
So I'm inclined to think role= is more appropriate for this purpose. (And a post of R. Jelliffe I just found seems to back this up.) Although I'm not at all sure it's the right thing.

Hi Wendell, John, Syd,
I changed the Schematron support as follows:
* I removed the default marking of reports as warnings
* to determine the severity level of a message we look for, in order:
  1. the role attribute
     If the value matches (case insensitive)
       "warn" or "warning" -- we set the level to warning
       "error" -- we set the level to error
       "fatal" -- we set the level to fatal
       "info" or "information" -- we set the level to info
  2. the start of the message after trimming whitespace
     If the message starts with (case sensitive)
       "Warning:" -- we set the level to warning
       "Error:" -- we set the level to error
       "Fatal:" -- we set the level to fatal
       "Info:" -- we set the level to info
     The matched prefix is removed from the message.
  3. we use the error level as default/fallback.
I cannot see an easy way to implement a phase-based approach. oXygen determines the phases and pops up a phase chooser dialog when the document is validated for the first time or on the "Reset cache and validate" action.
All these will be available in oXygen 11.1. If you need access to that before 11.1 just let me know.
Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
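
The lookup order described above can be pictured with a small ISO Schematron sketch. The element names (section, title, para), the tests and the messages are hypothetical, and the levels noted in the comments simply follow the three steps as listed in the message; they have not been checked against an actual oXygen build.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <rule context="section">
      <!-- Step 1: the role attribute decides. Matching is described as
           case insensitive, so "Warning" should give level warning. -->
      <report test="count(para) &gt; 20" role="Warning">Section is very long.</report>

      <!-- Step 2: no role, so the message prefix decides. "Info:" should
           give level info, with the prefix removed from the shown message. -->
      <report test="not(@id)">Info: section has no id attribute.</report>

      <!-- Step 3: neither a role nor a recognized prefix, so the level
           falls back to error, even though this is a report. -->
      <report test="not(title)">Section has no title.</report>

      <!-- Prefix matching is described as case sensitive, so a lowercase
           "warning:" would not be recognized and this also falls back
           to error. -->
      <assert test="para">warning: section has no paragraphs.</assert>
    </rule>
  </pattern>
</schema>
```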

George et al.,
Looking again at the spec and the Schematron list, I now entirely agree with Syd & George that role (not flag) looks like the correct attribute to work with.
George, this looks superb. I'm going to try it out and give more feedback.
John
On Nov 26, 2009, at 11:05 AM, George Cristian Bina wrote:

Hi, At 11:05 AM 11/26/2009, George wrote:
Hi Wendell, John, Syd,
I changed the Schematron support as follows....
Excellent.
As I understand it, @flag allows us to declare flags with arbitrary names that would switch to true() when an assertion "succeeded" (an 'assert' came back false or a 'report' came back true). How oXygen might use this rather depends on how SVRL reports it (I guess), which I haven't looked into. It's also not clear to me whether the design limits a given assertion to a single flag, or whether (for example) a space-delimited list of flags is acceptable.
The language in the spec regarding @role is murkier. Without an example (or maybe that post of Rick's mentioned by Syd) I'm not sure how that should work. On the other hand, I'm also willing to take everyone's word for it.
I agree that what George has implemented is an excellent first cut. It especially has the virtue of working in the background, without special configuration.
George, which error level is it that we will fall back to?
Cheers, Wendell

Hi Wendell,
The default level is "error".
Best Regards,
George

George, At 03:02 AM 12/1/2009, you wrote:
The default level is "error".
"error" for all cases -- that's good.
Cheers, Wendell

Hi George,
Thanks for letting me try a snapshot with the new Schematron enhancement. It works great for me. I've tried both the role attribute and the text keyword mechanism.
Observations:
(1) For the role attribute mechanism, the documentation should mention that although @role is allowed on <rule> elements, its presence there does not reset any validation parameters in this implementation. It only has an effect when @role occurs on a <report> or <assert> element. (I tripped up on this one.)
(1a) Alternatively, you could consider a convention where @role on <rule> elements resets the default error level for all child <report> or <assert> elements, i.e., any child <report> or <assert> elements that lack an explicit @role value will inherit the value from the parent <rule>. This makes some sense, but may be a stretch. What do people think?
(2) The icon for "fatal" errors is the same as for "error" errors. It would be nice to differentiate these icons.
(3) Very clever how you have a separate "Info" pane in the validation view that lists Schematron outputs at the "info" level. Nice touch.
John

Hi, Since John has asked for thoughts: At 09:43 AM 12/2/2009, he wrote:
(1a) Alternatively, you could consider a convention where @role on <rule> elements resets the default error level for all child <report> or <assert> elements, i.e., any child <report> or <assert> elements that lack an explicit @role value will inherit the value from the parent <rule>. This makes some sense, but may be a stretch. What do people think?
This would be good, if it's not too much of a bother, and assuming it's still consistent with the intended semantics of @role.
(3) Very clever how you have a separate "Info" pane in the validation view that lists schematron outputs at the "info" level. Nice touch.
That does sound nice. I'll be trying the new features myself pretty soon, but I haven't yet -- it's not yet a blocker in my project. (I'll let George know if it becomes one before 11.1's official date.)
Cheers, Wendell

Hi Wendell,
Wendell's right, this needs to be "within scope" for the semantics of @role. Best way to be sure is to ask Rick Jelliffe. I'll send up a message to the Schematron list and see if I get a response. I'll let y'all know. John

Hi John,
@role can appear only on assert, report and rule. The spec describes it only in relation to assert and report, so I do not see any problem with treating a value on rule as a default for the enclosed assertions and reports.
I changed the implementation to determine the role value by looking at the @role on assert or report and, if that is not specified, at the @role on rule.
That means the error level will be computed following the steps below:
1. Determine a role attribute from the assert/@role, respectively report/@role, if specified, otherwise from the parent rule/@role.
2. If the role attribute matches one of the recognized values, use that.
3. If the message starts with one of the recognized values, use that.
4. Use "error" as the error level.
It is not common practice to have different markers (icons) for errors and fatal errors. Note that next to the marker there is an indication of the severity: F for fatal errors and E for errors.
Best Regards,
George
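
The revised steps, with @role on a rule acting as a default, might play out as in the following ISO Schematron sketch. The element names (section, title, para), the tests, the messages and the role value "style" are hypothetical; the levels in the comments are read off the four steps above rather than taken from a tested oXygen build.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <rule context="section" role="warning">
      <!-- No role of its own: takes "warning" from the parent rule. -->
      <assert test="title">Section has no title.</assert>
      <!-- Its own role takes precedence over the role on the rule. -->
      <assert test="para" role="fatal">Section has no paragraphs.</assert>
    </rule>

    <rule context="para" role="style">
      <!-- "style" is not a recognized value, so step 2 finds nothing and
           the "Warning:" prefix in step 3 sets the level to warning. -->
      <report test="string-length(.) &gt; 2000">Warning: paragraph is longer than 2000 characters.</report>
      <!-- No recognized role and no recognized prefix: step 4, error. -->
      <assert test="normalize-space(.)">Paragraph is empty.</assert>
    </rule>
  </pattern>
</schema>
```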

George,
I think that's a very, very nice implementation.
John
On Dec 9, 2009, at 8:30 AM, George Cristian Bina wrote:
participants (4)
- George Cristian Bina
- John Madden
- Syd Bauman
- Wendell Piez