How to include smaller files into a "master document" in Oxygen?

Greetings!
I am currently editing several XML files which are the chapters of a Tamil thesaurus, and each file is dangerously close to the size limit connected with the script specificity. ("Support for RTL languages" has to be activated for proper display, and I have already increased the default size beyond which RTL language support is automatically deactivated.)
I would appreciate pointers on the best methods for dealing with the larger entity, i.e. the sum of the chapters.
Can I create a MASTER document in Oxygen in which the chapters are INCLUDED (by mentioning their names) for the sake of processing the sum of the chapters?
(I currently run XSLT transformations on separate chapters for making indices. How can I do that for the sum of the chapters?)
Thanks for any pointers to the appropriate section inside the Oxygen documentation (provided that it is possible to do that in Oxygen ...).
If the "RTL language" limit were not there, I would have used a single file, but that does not seem to be possible.
-- Jean-Luc Chevillard (Paris-Pondicherry-Hamburg)
"https://univ-paris-diderot.academia.edu/JeanLucChevillard"

Hello,
What type of XML files are these (DITA, DocBook, TEI or custom)?
The "support for RTL languages" is problematic for large files in Text mode. But Author mode (Document > Edit Mode > Author) can handle RTL content a lot better, so that's a possible solution if you want to work with larger files. However, if you have a custom type of document that Oxygen doesn't support out of the box, you'll have to create a custom CSS so that the document is represented properly in Author mode: http://www.oxygenxml.com/doc/versions/17.1/ug-editor/index.html#concepts/dg-...
"Can I create a MASTER document in Oxygen in which the chapters are INCLUDED (by mentioning their names) for the sake of processing the sum of the chapters?"
Yes, you can use XInclude to bind all documents together in a single master file. This way you can transform the master that includes all other documents as if it were a single document. http://www.oxygenxml.com/doc/versions/17.1/ug-editor/index.html#topics/inclu...
The examples from this section are for DocBook, but XInclude is supported by Oxygen independently of the XML format. Check your XInclude options in Oxygen (Options > Preferences, XML > "XML Parser", "XInclude Options"); they should be enabled by default.
"(I currently run XSLT transformations on separate chapters for making indices. How to do that for the sum of the chapters?)"
It's simpler to use the master file and run the transformation just once on it.
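As a minimal sketch of such a master file: the root element name thesaurus and the file names chapter1.xml to chapter3.xml are assumptions made for illustration, not names taken from this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical master document: the element and file names are assumed. -->
<thesaurus xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- Each xi:include is replaced by the root element of the referenced file
       when XInclude processing is enabled in the parser options. -->
  <xi:include href="chapter1.xml"/>
  <xi:include href="chapter2.xml"/>
  <xi:include href="chapter3.xml"/>
</thesaurus>
```

Running the index transformation on this one file should then cover all chapters at once. An XInclude processor may add an xml:base attribute to each included root element, so a DTD used to validate the assembled result may need to allow that attribute.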
Regards, Adrian
Adrian Buza, oXygen XML Editor and Author Support
Tel: +1-650-352-1250 ext.2020 Fax: +40-251-461482 support@oxygenxml.com
(In reply to Jean-Luc Chevillard's message of 04.12.2015 15:40.)
_______________________________________________ oXygen-user mailing list oXygen-user@oxygenxml.com https://www.oxygenxml.com/mailman/listinfo/oxygen-user

Hello Adrian, (resent, with copy to the mailing list)
thanks for your explanations.
My XML files are "custom" and I have defined a DTD and a CSS myself, which I am progressively enriching as my understanding of the complexity of that particular thesaurus grows.
In my creation of the Thesaurus I alternate between "Text Mode" for certain (complex) tasks and "Author Mode" for (easy) tasks.
I had not realized until now that the RTL file size constraints are not that stringent in Author Mode. (BTW, Tamil is not a "RTL language" but rather a script with complex rendering ...)
Since I asked the question, I have discovered the following video on the Oxygen Web site, "http://oxygenxml.com/demo/Working_With_XML_Modules.html"
Using your answer and the video, I should (hopefully) have no difficulty trying to use that method.
I have, however, a remaining question:
-- I expect the linkage between the "master document" and the "included documents" to be described in some auxiliary file, which will tell Oxygen where to look for the DTD, etc. when it does "content completion".
-- WHERE will those auxiliary files be located? (Will they be in a hidden location?) How can I back them up?
-- WHAT HAPPENS when I upgrade to the next version of Oxygen? Will the auxiliary files be destroyed?
Thanks for clarifying those points.
Best wishes
-- Jean-Luc Chevillard (Paris)
"https://univ-paris-diderot.academia.edu/JeanLucChevillard"
"https://plus.google.com/u/0/113653379205101980081/posts/p/pub"
"https://twitter.com/JLC1956"
(In reply to Adrian Buza's message of 07/12/2015 15:32.)

Hi, The video demonstration you found about working with XML modules uses external (system) entities to link the files together. This is a different, older method for working with XML modules. I would not recommend using it now, if you have the possibility of using XInclude instead. The link I gave you regarding XInclude describes it as a replacement for external entities and explains the advantages:
XInclude is targeted as the replacement for External Entities. The advantage of using XInclude is that, unlike the entities method, each of the assembled documents is permitted to contain a Document Type Declaration (DOCTYPE). This means that each file is a valid XML instance and can be independently validated. It also means that the main document to which smaller instances are included can be validated without having to remove or comment out the DOCTYPE, as is the case with External Entities. This makes XInclude a more convenient and effective method for managing XML instances that need to be stand-alone documents and part of a much larger project.
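To illustrate the point about independent validation, here is a sketch of what a single chapter file could look like when it carries its own DOCTYPE. The DTD file name thesaurus.dtd, the element names and the attribute values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical chapter file: it names its own DTD, so it can be
     validated on its own, without the master document. -->
<!DOCTYPE chapter SYSTEM "thesaurus.dtd">
<chapter n="1">
  <entry id="e0001">
    <headword>sample headword</headword>
  </entry>
</chapter>
```

With the entities method, a file like this could not keep its DOCTYPE line while being pulled into a master document; with XInclude it can stay as it is.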
You can read more about the differences between external entities and XInclude here: http://www.xml.com/lpt/a/1009
Regarding your questions:
- For XInclude, the links between files are the XInclude statements found within the files, e.g. <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="introduction.xml"/>
- For external entities (the other method), there's an external (system) entity declaration in the DOCTYPE section of the XML document and an entity reference where the content is wanted, e.g. the declaration in the DOCTYPE: <!ENTITY chapter1 SYSTEM "my/file/chapter1.xml"> and the reference in the XML content: &chapter1;
So there are no "auxiliary files" for either of these two linking methods.
Regards, Adrian
(In reply to Jean-Luc Chevillard's message of 07.12.2015 17:57.)

Hi Jean-Luc,
I have a couple of comments regarding the XML part of your question (as opposed to the oXygen part). Some of this merely repeats what Adrian has said in more detail.
Both external parsed entities and XInclude are mechanisms designed, at least putatively (in the case of parsed entities), for your use case. However, they are very different. In understanding why we have two mechanisms and what accounts for their differences, it helps to know that external parsed entities predate the "Dawn of XML", being a part of the SGML standard (ISO 8879:1986) of which XML is a refinement. This means they (at least for your use case) are somewhat like cooking your dinner in the fireplace. It works, and some systems (perhaps even fine restaurants) still use and rely on the mechanism; but some professional chefs have never seen it done and wonder why you would do it this way when you have a stove.
The main difference between the mechanisms is that entity resolution takes place at "parse time", i.e. when a processor (a parser) reads XML markup (tags and text) and then does something with it.
XInclude resolution postpones the assembly of the composite document until a processing step after parsing. That is, the various components are parsed separately, yielding several "XML documents" (considered as tree structures in memory, no longer tags-and-text) which can then be assembled, typically as one step in a transformation or processing pipeline that does other stuff as well (such as generate formatted output).
This is an important and useful distinction -- XInclude, in other words, takes advantage of modern architectures in which parsing is generic -- we don't configure separate parsing logic for every document, but instead use a commodity parser that produces a standardized result, which we can then process using XPath, XSLT etc.
In particular, since you are validating using a DTD, the XInclude mechanism may work better for you, as (among other things) it means you can continue to validate the fragments, as fragments, against the DTD even before or without XInclude resolution that assembles the composite document. (Using oXygen's "master document" feature you could perhaps work around this limitation by always validating the composite document even when working with a fragment; but XInclude is nevertheless more flexible.)
You asked about where the information is stored regarding how the master document and the included documents are related.
This will typically be in the master file itself ... using entities, in declarations (indeed these are formally part of the DTD but they are commonly managed in an internal DTD subset in the document prolog as Adrian showed -- or search for "XML external parsed entities" for more examples). Using XInclude, however, you simply embed XInclude elements in your master document that present references to other files in your system. (I.e.: these are a form of hypertext link, to be resolved in processing.)
<xi:include href="file.xml"/>
This means, pretty simply, "Include the XML here from 'file.xml'." It can get more complex - there can be a fallback if file.xml is ever missing, plus you can actually include fragments from within file.xml, etc. etc.
Using either mechanism, however, you will probably find things are pretty straightforward once you know about the knobs and switches -- until you hit the problem of internal cross-references within your assembly, which becomes somewhat more complex to manage as soon as your document is in several pieces. The bottom line here is that using XInclude, you can no longer rely on ID/IDREF attributes (declared in your DTD) to help ensure the integrity of your cross-references, since these attributes must now be able to point to elements across file boundaries. Of course, there are ways of dealing with this too....
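The "more complex" options mentioned above (a fallback, and including only a fragment of a file) could look roughly like the following sketch. The file names, the thesaurus, chapter and note elements, and the ID value e0042 are all hypothetical, and support for the xpointer attribute varies between XInclude processors, so treat this as something to test rather than as guaranteed behaviour.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical master document showing optional XInclude features. -->
<thesaurus xmlns:xi="http://www.w3.org/2001/XInclude">

  <!-- Fallback: used only if chapter1.xml cannot be retrieved. -->
  <xi:include href="chapter1.xml">
    <xi:fallback>
      <chapter n="1">
        <note>chapter1.xml could not be found</note>
      </chapter>
    </xi:fallback>
  </xi:include>

  <!-- Fragment by position: element(/1/2) addresses the second child
       element of the root element of chapter2.xml. -->
  <xi:include href="chapter2.xml" xpointer="element(/1/2)"/>

  <!-- Fragment by ID (shorthand pointer): this relies on the id attribute
       being declared as type ID in the DTD that chapter3.xml references. -->
  <xi:include href="chapter3.xml" xpointer="e0042"/>

</thesaurus>
```

The plain form with only an href attribute is usually enough for assembling whole chapters; the fragment forms are mainly useful when only part of a file is wanted.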
Cheers, Wendell
(In reply to Jean-Luc Chevillard's message of Mon, Dec 7, 2015 at 10:57 AM.)
-- Wendell Piez | http://www.wendellpiez.com XML | XSLT | electronic publishing Eat Your Vegetables _____oo_________o_o___ooooo____ooooooo_^

Hello Wendell,
thanks for your message.
Yesterday I managed to use the method recommended by Adrian (to whom I also address my thanks).
This involved, among other things, transforming the internal DTDs for the chapters into an external DTD file, and verifying that they were all identical (which was in fact originally not the case, because I had been progressively enriching the structure while editing the chapters one by one :-)
I also had to diversify my XSLT strategy, because the original XSLT script, which worked fine with the chapters (which are subtrees), did not work with the global thesaurus tree: the XPath references were too short ...
But in the end it worked fine, and remained lightning fast, even when going through 6000 entries (instead of three times 2000 entries). [I have 3 chapters so far, and will have 10 chapters in the end.]
I have also been wondering whether I should write a DTD for the master document (but have so far not tried), in the same way as I had to write, yesterday evening, a specific CSS for the master document.
My question concerning the place where the "information" is "stored" was motivated by the fact that the series of videos (for the "older", SGML-like, method) showed buttons which the virtual user was supposed to press in order to ENABLE the master file support, and also because some menus indicated that the user had to choose between several "possible master files". I imagined that the result of that customizing would be stored somewhere in a file hidden in the cryptic place called "%APPDATA%\Roaming\com.oxygenxml" (or something of the sort ... I have not yet tried to SEE it ...).
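One way to avoid the "too short" XPath problem described above is to write the stylesheet so that it does not depend on which element is the document root. The following XSLT 1.0 sketch is only an illustration: the element names entry, headword and chapter and the attribute n are assumptions, not the actual structure of the thesaurus.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical index stylesheet: element and attribute names are assumed. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Start at the document node, so the same stylesheet works whether the
       root element is a single chapter or the assembled master document. -->
  <xsl:template match="/">
    <!-- //entry finds entries at any depth in the tree. -->
    <xsl:apply-templates select="//entry">
      <xsl:sort select="headword"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- One output line per entry: headword, a tab, then the chapter number. -->
  <xsl:template match="entry">
    <xsl:value-of select="headword"/>
    <xsl:text>&#9;</xsl:text>
    <xsl:value-of select="ancestor::chapter/@n"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For this to see all chapters, XInclude has to be resolved before the transformation runs. The default sort order shown here is not tuned for Tamil and would probably need adjusting.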
I am relieved to realize that many things are now much easier than they were for people who work with complex scripts such as Tamil (my first experience with this matter goes back to the time, in the eighties, of writing escape sequences for entering a user-defined "font" into the memory of a dot-matrix printer, in the days of DOS 2.0). We have come a long way :-)
Cheers
-- Jean-Luc (Paris-Pondicherry-Hamburg)
"https://univ-paris-diderot.academia.edu/JeanLucChevillard"
"https://plus.google.com/u/0/113653379205101980081/posts/p/pub"
"https://twitter.com/JLC1956"
(In reply to Wendell Piez's message of 08/12/2015 16:52.)
Hi Jean-Luc,
I have a couple of comments regarding the XML part of your question (as opposed to the oXygen part). Some of this merely repeats what Adrian has said in more detail.
Both external parsed entities and XInclude are mechanisms designed, at least putatively (in the case of parsed entities) for your use case. However, they are very different. In understanding why we have two mechanisms and what accounts for their differences, it helps to know that external parsed entities predate the "Dawn of XML", being a part of the SGML standard (ISO 8879:1986) of which XML is a refinement. This means they (at least for your use case) are somewhat like cooking your dinner in the fireplace. It works, and some systems (perhaps even fine restaurants) still use and rely on the mechanism; but some professional chefs have never seen it done and wonder why you would do it this way when you have a stove.
The main difference between the mechanisms is that entity resolution takes place at "parse time", i.e. when a processor (a parser) reads XML markup (tags and text) and then does something with it.
XInclude resolution postpones the assembly of the composite document until a processing step after parsing. That is, the various components are parsed separately, yielding several "XML documents" (considered as tree structures in memory, no longer tags-and-text) which can then be assembled typically as one step in a transformation or processing pipeline that does other stuff as well (such as generate formatted output).
This is an important and useful distinction -- XInclude, in other words, takes advantage of modern architectures in which parsing is generic -- we don't configure separate parsing logic for every document, but instead use a commodity parser that produces a standardized result, which we can then process using XPath, XSLT etc.
In particular, since you are validating using a DTD, the XInclude mechanism may work better for you, as (among other things) it means you can continue to validate the fragments, as fragments, against the DTD even before or without XInclude resolution that assembles the composite document. (Using oXygen's "master document" feature you could perhaps work around this limitation by always validating the composite document even when working with a fragment; but XInclude is nevertheless more flexible.)
You asked about where the information is stored regarding how the master document and the included documents are related.
This will typically be in the master file itself ... using entities, in declarations (indeed these are formally part of the DTD but they are commonly managed in an internal DTD subset in the document prolog as Adrian showed -- or search for "XML external parsed entities" for more examples). Using XInclude, however, you simply embed XInclude elements in your master document that present references to other files in your system. (I.e.: these are a form of hypertext link, to be resolved in processing.)
<xi:include href="file.xml"/>
This means, pretty simply, "Include the XML here from 'file.xml'." It can get more complex - there can be a fallback if file.xml is ever missing, plus you can actually include fragments from within file.xml, etc. etc.
Using either mechanism, however, you will probably find things are pretty straightforward once you know about the knobs and switches -- until you hit the problem of internal cross-references within your assembly, which becomes somewhat more complex to manage as soon as your document is in several pieces. The bottom line here is that using XInclude, you can no longer rely on ID/IDREF attributes (declared in your DTD) to help ensure the integrity of your cross-references, since these attributes must now be able to point to elements across file boundaries. Of course, there are ways of dealing with this too....
Cheers, Wendell
On Mon, Dec 7, 2015 at 10:57 AM, Jean-Luc Chevillard <jeanluc.chevillard@gmail.com> wrote:
Hello Adrian, (resent, with copy to the mailing list)
thanks for your explanations.
My XML files are "custom" and I have defined a DTD and a CSS myself, which I am progressively enriching, as my understanding of the complexity of that particular thesaurus grows.
In my creation of the Thesaurus I alternate between "Text Mode" for certain (complex) tasks and "Author Mode" for (easy) tasks.
I had not realized until now that the RLT file size constraints are not that stringent in Author Mode. (BTW, Tamil is not a "RTL language" but rather a script with complex rendering ...)
Since I asked the question, I have discovered the following video on the Oxygen Web site, "http://oxygenxml.com/demo/Working_With_XML_Modules.html"
Using your answer and the video, I should (hopefully) have no difficulty trying to use that method.
I have however a remaining question: -- I expect the linkage between the "master document" and the "included documents" to be described in some auxiliary file, which will tell Oxygen where to look for the DTD, etc. when it does "content completion" -- WHERE will those auxiliary files be located? (will they be in a hidden location) How can I do backup for them? -- WHAT HAPPENS when I upgrade to the next version of Oxygen? Will the auxiliary files be destroyed?
Thanks for clarifying those points
Best wishes
-- Jean-Luc Chevillard (Paris)
"https://univ-paris-diderot.academia.edu/JeanLucChevillard"
"https://plus.google.com/u/0/113653379205101980081/posts/p/pub"

Hi Jean-Luc,
We have indeed come a very long way!
Your intuition about oXygen's master files was not incorrect: there is indeed a place where oXygen stores the "Master File" configuration info it keeps. But as it happens (for historical reasons, since as you are seeing, your question is an old one indeed!), you don't have to worry about it -- this time -- since using XInclude you can proceed to build what you need without resorting to the oXygen mechanism. (Setting up a Master File may remain useful for other reasons.)
As far as DTDs go, your development process sounds good: the refactoring you are now doing is exactly the way things are best done, since by consolidating all your documents under one DTD, you'll be able to assert more regular tagging practices across them -- which is where you get leverage in processing. While this may be obvious :-) it's also worth restating since it's part of the fun.
In the kind of architecture you are building, it is common for both the composite document and each of the component documents to validate against the same DTD, although (since modular DTD architectures are also possible) another way to do it is with two separate DTDs, where one (CompositeDTD) is a formal superset of the other (ComponentDTD), including its declarations by reference.
Cheers, Wendell
On Tue, Dec 8, 2015 at 12:14 PM, Jean-Luc Chevillard <jeanluc.chevillard@gmail.com> wrote:
Hello Wendell,
thanks for your message.
I managed yesterday to use the method recommended by Adrian (to whom I also address my thanks).
This involved, among other things, transforming the internal DTDs of the chapters into an external DTD file, and verifying that they were all identical (which was in fact originally not the case, because I had been progressively enriching the structure while editing the chapters one by one :-).
I also had to diversify my XSLT strategy because the original XSLT script, which worked fine with the chapters (which are subtrees), did not work with the global thesaurus tree, because the XPath references were too short ...
But in the end it worked fine, and remained lightning fast, even when going through 6000 entries (instead of three times 2000 entries) [I have 3 chapters so far, and will have 10 chapters in the end]
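The XPath problem mentioned above can be illustrated with a small sketch. The element names (`thesaurus`, `chapter`, `entry`, `headword`) are hypothetical and do not come from the actual thesaurus files. An absolute path such as `/chapter/entry` only matches when a chapter is the document element; once the chapters sit one level deeper inside a master document, the path has to be lengthened, or replaced by a path that does not depend on depth:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Index sketch with hypothetical element names; not the original stylesheet -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/">
    <!-- "/chapter/entry" would find nothing in the master document,
         where entries live under /thesaurus/chapter.
         "//entry" matches entries at any depth, so the same stylesheet
         works on a single chapter and on the assembled thesaurus. -->
    <xsl:for-each select="//entry">
      <xsl:sort select="headword"/>
      <xsl:value-of select="headword"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Depth-independent paths are convenient here, but a fully spelled-out path such as `/thesaurus/chapter/entry` would do as well if the stylesheet is only ever run on the master document.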
I have also been wondering whether I should write a DTD for the master document (but have so far not tried), in the same way as I had to write, yesterday evening, a specific CSS for the master document.
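One possible shape for such a master-document DTD follows the "superset" arrangement Wendell describes in his reply. This is only a sketch with hypothetical names (`thesaurus`, `chapter`, `chapter.dtd`): the master DTD pulls in the chapter declarations through a parameter entity and adds only the declaration of the top-level element. Whether the DTD must also declare the `xi:include` elements depends on whether validation runs before or after XInclude resolution in the tool chain, which is worth checking rather than assuming.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical names throughout: thesaurus, chapter, chapter.dtd -->
<!DOCTYPE thesaurus [
  <!-- Pull in, by reference, the declarations shared by all chapters -->
  <!ENTITY % chapter-decls SYSTEM "chapter.dtd">
  %chapter-decls;
  <!-- The only declaration specific to the master document -->
  <!ELEMENT thesaurus (chapter+)>
]>
<thesaurus>
  <!-- the chapters (or the elements that include them) go here -->
</thesaurus>
```

The same declarations could equally live in an external file of their own instead of the internal subset shown here.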
My question concerning the place where the "information" is "stored" was motivated by the fact that the series of videos (for the "older", SGML-like, method) showed buttons which the virtual user was supposed to press in order to ENABLE the master file support. Some menus also indicated that the user had to choose between several "possible master files", and I imagined that the result of that customizing would be stored somewhere in a file hidden in the cryptic place called "%APPDATA%\Roaming\com.oxygenxml" (or something of the sort ... I have not yet tried to SEE it ...).
I am relieved to realize that many things are now much easier than they were for people who work with complex scripts such as Tamil. (My first experience with this matter goes back to the eighties, to the time of writing escape sequences for loading a user-defined "font" into the memory of a dot-matrix printer, in the days of DOS 2.0.)
We have gone a long way :-)
Cheers
-- Jean-Luc (Paris-Pondicherry-Hamburg)
"https://univ-paris-diderot.academia.edu/JeanLucChevillard"
"https://plus.google.com/u/0/113653379205101980081/posts/p/pub"
On 08/12/2015 16:52, Wendell Piez wrote:
Hi Jean-Luc,
I have a couple of comments regarding the XML part of your question (as opposed to the oXygen part). Some of this merely repeats what Adrian has said in more detail.
Both external parsed entities and XInclude are mechanisms designed, at least putatively (in the case of parsed entities), for your use case. However, they are very different. In understanding why we have two mechanisms and what accounts for their differences, it helps to know that external parsed entities predate the "Dawn of XML", being a part of the SGML standard (ISO 8879:1986) of which XML is a refinement. This means they (at least for your use case) are somewhat like cooking your dinner in the fireplace. It works, and some systems (perhaps even fine restaurants) still use and rely on the mechanism; but some professional chefs have never seen it done and wonder why you would do it this way when you have a stove.
The main difference between the mechanisms is that entity resolution takes place at "parse time", i.e. when a processor (a parser) reads XML markup (tags and text) and then does something with it.
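As a minimal sketch of the entity-based approach, assuming a hypothetical root element `thesaurus` and hypothetical files `thesaurus.dtd`, `chapter01.xml` and `chapter02.xml` (none of these names come from the thread), a master file might declare each chapter as an external parsed entity in its internal DTD subset and then reference it in the body:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- master.xml: hypothetical file and element names, for illustration only -->
<!DOCTYPE thesaurus SYSTEM "thesaurus.dtd" [
  <!-- Each chapter is declared as an external parsed entity -->
  <!ENTITY chapter01 SYSTEM "chapter01.xml">
  <!ENTITY chapter02 SYSTEM "chapter02.xml">
]>
<thesaurus>
  <!-- The parser replaces each reference with the file's content while parsing -->
  &chapter01;
  &chapter02;
</thesaurus>
```

A file pulled in this way may begin with a text declaration but cannot carry its own DOCTYPE declaration, so it is not a standalone document that can be validated by itself.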
XInclude resolution postpones the assembly of the composite document until a processing step after parsing. That is, the various components are parsed separately, yielding several "XML documents" (considered as tree structures in memory, no longer tags-and-text) which can then be assembled typically as one step in a transformation or processing pipeline that does other stuff as well (such as generate formatted output).
This is an important and useful distinction -- XInclude, in other words, takes advantage of modern architectures in which parsing is generic -- we don't configure separate parsing logic for every document, but instead use a commodity parser that produces a standardized result, which we can then process using XPath, XSLT etc.
In particular, since you are validating using a DTD, the XInclude mechanism may work better for you, as (among other things) it means you can continue to validate the fragments, as fragments, against the DTD even before or without XInclude resolution that assembles the composite document. (Using oXygen's "master document" feature you could perhaps work around this limitation by always validating the composite document even when working with a fragment; but XInclude is nevertheless more flexible.)
You asked about where the information is stored regarding how the master document and the included documents are related.
This will typically be in the master file itself ... using entities, in declarations (indeed these are formally part of the DTD but they are commonly managed in an internal DTD subset in the document prolog as Adrian showed -- or search for "XML external parsed entities" for more examples). Using XInclude, however, you simply embed XInclude elements in your master document that present references to other files in your system. (I.e.: these are a form of hypertext link, to be resolved in processing.)
<xi:include href="file.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
This means, pretty simply, "Include the XML here from 'file.xml'." It can get more complex - there can be a fallback if file.xml is ever missing, plus you can actually include fragments from within file.xml, etc. etc.
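A slightly fuller sketch of a master document, again with hypothetical file and element names (`thesaurus`, `chapter`, `title`, `chapter01.xml`, `chapter02.xml`), shows the namespace declaration that the `xi:` prefix needs and the fallback mentioned above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- master.xml: hypothetical file and element names, for illustration only -->
<thesaurus xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- Plain inclusion: the whole of chapter01.xml is inserted here -->
  <xi:include href="chapter01.xml"/>
  <!-- Inclusion with a fallback, used only if chapter02.xml cannot be retrieved -->
  <xi:include href="chapter02.xml">
    <xi:fallback>
      <chapter><title>Chapter 2 is not available yet</title></chapter>
    </xi:fallback>
  </xi:include>
</thesaurus>
```

Each chapter file remains a complete XML document and can keep its own DOCTYPE declaration. Including only a fragment of a file is done with the `xpointer` attribute; which XPointer schemes are supported depends on the processor.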
Using either mechanism, however, you will probably find things are pretty straightforward once you know about the knobs and switches -- until you hit the problem of internal cross-references within your assembly, which becomes somewhat more complex to manage as soon as your document is in several pieces. The bottom line here is that using XInclude, you can no longer rely on ID/IDREF attributes (declared in your DTD) to help ensure the integrity of your cross-references, since these attributes must now be able to point to elements across file boundaries. Of course, there are ways of dealing with this too....
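One of the possible ways of dealing with this is to check the cross-references in a separate step, run on the master document once the inclusions have been resolved. The sketch below assumes hypothetical attribute names, `id` on targets and `ref` on references; it is not taken from the thesaurus DTD. It uses an XSLT key to report every reference that points to no element in the assembled document:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Cross-reference check: a sketch with hypothetical attribute names -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Index every element that carries an id attribute -->
  <xsl:key name="by-id" match="*[@id]" use="@id"/>

  <xsl:template match="/">
    <!-- Select the references whose value matches no indexed id -->
    <xsl:for-each select="//*[@ref][not(key('by-id', @ref))]">
      <xsl:text>Unresolved reference: </xsl:text>
      <xsl:value-of select="@ref"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

An empty output means every reference found a target. Duplicate `id` values across chapters would need a separate check, since the key lookup succeeds as long as at least one match exists.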
Cheers, Wendell
-- Wendell Piez | http://www.wendellpiez.com XML | XSLT | electronic publishing Eat Your Vegetables _____oo_________o_o___ooooo____ooooooo_^
participants (3): Jean-Luc Chevillard, Oxygen XML Editor Support (Adrian Buza), Wendell Piez