Validating entities?

This is possibly a silly question, please forgive me if that is the case:

We have text files and RelaxNG schemas for them. In addition we point to DTD fragments for defining entities. It seems that validation in Oxygen (both 12.x and 13) out of the box does not check whether entities used in the text file are declared, so files are considered valid even if they use entity names that are not declared. A typical start of a text file may look like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE TEI [
<!ENTITY % HIS_entiteter SYSTEM 'http://www.edd.uio.no/ibsen/schema/ibsen-charent.dtd' >
%HIS_entiteter;
]>
<?xml-model href="http://www.edd.uio.no/ibsen/schema/tei_his.rnc" type="application/relax-ng-compact-syntax"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:HIS="http://www.example.org/ns/HIS">
  <teiHeader xml:lang="nob">
    <fileDesc>
      <titleStmt>
        <title level="s" type="main">Henrik Ibsens skrifter</title>
        <title level="s" type="sub">Diplomatarisk tekstarkiv</title>
        <title level="a" type="main">Peer Gynt</title>
        <title level="a" type="sub">NBO Ms.8° 894 (trykt eksemplar med rettelser)</title>
        <title level="a" type="origYear">[1874]</title>

What can we do to have an error flagged if [ is not declared?

Best regards,
Espen Ore
University of Oslo

Hi Espen,

The TEI P5 document is validated against the RelaxNG schema, but the Xerces parser is used to parse the XML file (and also the associated DTD infoset).

By default, the Xerces parser does not report references to undeclared entities when it parses an XML file. To activate this behavior, you can modify the XML declaration at the top of the file to set the "standalone" flag, like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
If you do not specify the flag, the parser treats the file as a possible module and therefore does not report unknown entity references as errors.

The specification for the "standalone" attribute is here:
Regards,
Radu

Radu Coravu
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 9/1/2011 9:48 AM, Espen S. Ore wrote:
What can we do to have an error flagged if [ is not declared?
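
The effect of the "standalone" flag described in the reply above can also be observed outside oXygen. The following is only a minimal sketch, assuming Python's built-in expat binding rather than the Xerces parser that oXygen uses, so it illustrates the general XML rule and not oXygen's exact behavior. The file name "charent.dtd", the parameter entity name "ents" and the entity name "undeclared" are hypothetical; the DTD is never fetched, because expat does not load external parameter entities by default.

```python
import xml.parsers.expat as expat


def undeclared_entity_error(document: bytes):
    """Parse with expat; return the error message, or None if parsing succeeds."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except expat.ExpatError as err:
        return str(err)
    return None


# Hypothetical document body: the internal subset pulls in an external
# DTD fragment through a parameter entity, and the text then uses an
# entity name that is declared nowhere in the document itself.
BODY = b"""<!DOCTYPE TEI [
<!ENTITY % ents SYSTEM "charent.dtd">
%ents;
]>
<TEI><p>&undeclared;</p></TEI>"""

WITHOUT_FLAG = b'<?xml version="1.0" encoding="UTF-8"?>\n' + BODY
WITH_FLAG = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n' + BODY

# Without the flag the parser cannot rule out that the unread external
# fragment declares the entity, so the reference is silently skipped.
print("no standalone flag:", undeclared_entity_error(WITHOUT_FLAG))

# With standalone="yes" the same reference is a well-formedness error.
print('standalone="yes":  ', undeclared_entity_error(WITH_FLAG))
```

Note that with standalone="yes" an entity has to be declared in the document itself for the reference to be accepted, so whether the flag suits a setup that keeps its entity declarations in an external DTD fragment should be checked against the real files.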

Thank you very much!
Espen

On 1 Sep 2011 10:00, Radu Coravu wrote:
To activate this behavior you can modify the XML declaration on top of the file to set the "standalone" flag like:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

I was about to report this as a bug before I realized it is connected with a preference setting.

Running oXygen 13.0, with expanded Java memory settings, the following query returns a complete result:

element out { for $i at $pos in 1 to 2000000 return <el>{$pos}</el> }

but this query truncates after 600K results:

for $i at $pos in 1 to 2000000 return <el>{$pos}</el>

The truncation depends on the Preference setting for XQuery - "Size limit of Sequence view (MB)". That is understandable. However, it would be a help to the user if there was something more than just a "Transformation successful" message. For example, an error diagnostic saying "Memory limit 2 MB - results truncated", maybe with a link to the preference setting where you can increase the limit.

David

--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 400314, Charlottesville, VA 22904-4314 USA
Email: dsewell@virginia.edu Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/

Hello David,

Thank you for letting us know about this. I've logged this to our issue tracking tool and we will implement this (a warning and tips when results are truncated) in a future version of Oxygen.

Regards,
Adrian

Adrian Buza
oXygen XML Editor and Author Support
support@oxygenxml.com
participants (4)
- Adrian Buza
- David Sewell
- Espen S. Ore
- Radu Coravu