
On 17/11/2015 4:30 pm, oXygen XML Editor Blog wrote:
oXygen XML Editor Blog
Possibilities to obtain PDF from DITA
Posted: 16 Nov 2015 01:07 PM PST http://feedproxy.google.com/~r/AboutOxygenXmlEditor/~3/sXZMCVP56Pc/possibilities-to-obtain-pdf-from-dita.html?utm_source=feedburner&utm_medium=email
[Note: this began as a comment on the blog, but Blogger hates me and may have eaten it, so here it is.]
Another reason is that Apache's FOP suffers all sorts of bugs (as we've discussed before) and clearly the timeframe for resolving them hasn't been good enough for the companies that need it, so the market responded. From the looks of things a lot of the solutions are built on .NET, so there's still room for more entrants if they can meet the needs of the non-Microsoft market.
Also, there are other methods to get what people want (and no one really cares what the method is if it works and can be entirely automated).
DITA to [X]HTML + CSS, then generate the PDF using wkhtmltopdf. The advantage is that it just works. The disadvantage is that the WebKit-based solution Google used is built on Qt 4.8 and, like many browsers, it does not render the text-justify style in CSS. If you want justified text in print, you'll need another solution.
Since LibreOffice can utilise XSL files and take instructions via the command line, it may be quite possible to use that in conjunction with any OpenDocument template (.ott files), though I haven't tried it yet. Its use of XSL files requires them to be loaded through the GUI first.
Alternatively, there may be an avenue to ignore PDF entirely by implementing Microsoft's Open XML Paper Specification (.oxps files), but it probably requires implementing vast swathes of the .NET framework in order to provide XAML support in a platform-independent way (a shame, since OpenXPS looked interesting and, having seen what PDF really is underneath, an alternative is always appealing).
Another option, once existing transformations to HTML or DOCX are included, is to tap pandoc again and use its PDF generation. That utilises LaTeX, however, and the output looks quite different to most other PDFs. Controlling that layout requires a detailed understanding of LaTeX's syntax.
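For anyone wanting to try the two HTML-based routes above, here is a minimal sketch. It assumes wkhtmltopdf and pandoc are installed and on the PATH, and that pandoc can find a LaTeX engine for its PDF output. The function names (run, html_to_pdf) and the DRY_RUN switch are made up for illustration; only the two basic command lines come from the tools themselves.

```shell
#!/bin/sh
# Sketch: turn one HTML file (e.g. single-page XHTML output) into a PDF.
# Set DRY_RUN=1 to print the command instead of running it.

# Hypothetical helper: run a command, or just print it in dry-run mode.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: html_to_pdf ENGINE INPUT.html OUTPUT.pdf
html_to_pdf() {
    engine=$1
    input=$2
    output=$3
    case "$engine" in
        # WebKit rendering: CSS is honoured, text-justify is not.
        wkhtmltopdf) run wkhtmltopdf "$input" "$output" ;;
        # pandoc goes through LaTeX, so the result looks like LaTeX.
        pandoc)      run pandoc "$input" -o "$output" ;;
        *)           echo "unknown engine: $engine" >&2; return 2 ;;
    esac
}
```

Calling html_to_pdf wkhtmltopdf index.html index.pdf runs the WebKit route; swapping the first argument for pandoc runs the LaTeX route on the same input, which makes it easy to compare the two outputs side by side.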
I have yet to see any solution which utilises Apple's Quartz Core framework on which OS X is built, which is kind of odd given how extensively XML is used in OS X configuration and that PDF is one of the true essential foundations of Quartz. It's quite possible that everything is already in place, but Apple just hasn't told anyone because that's their secret. They can be a bit weird sometimes.
I haven't been too worried about PDF or print so far, but if that changes and spending thousands of dollars isn't an option (and it isn't), I suspect I'd end up finding the best way to feed documents back to LibreOffice and let it print them or produce the PDF, depending on which path required more fiddling. It might even be possible to get LibreOffice to process the output of the com.oxygenxml.pdf.css plugin. If that works then you've got a free framework that runs on OS X, Linux, Windows, Solaris, BSD and other bits and pieces. Admittedly it's a 660MB framework, but that's still smaller than a full LaTeX installation.
... oh damn, now I'm curious ... [at this point some time passed] ...
Okay, the quickest and easiest method is to use LibreOffice with the built-in DITA Map to ODF plugin and then use this command on the generated file:
    [/path/to/]soffice --headless --convert-to pdf filename.odt
Change the path to the soffice binary to whatever is relevant for your platform. For OS X users that should be:
    /Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to pdf filename.odt
It may be preferable to adjust the OpenDocument template used with the generated file first. Assuming the DITA-OT plugin is already using an ODF template, it shouldn't be too difficult to make a duplicate scenario allowing alternative templates to be specified. If that is the case, that's everything.
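The conversion command above can be wrapped so the same script works on more than one platform. This is only a sketch: the OS X path is the one given above, while falling back to a plain soffice on the PATH everywhere else is an assumption, as are the function names (soffice_bin, odt_to_pdf) and the DRY_RUN switch.

```shell
#!/bin/sh
# Sketch: find soffice for this platform and convert ODT files to PDF.

# Print the soffice path for a given `uname -s` value (defaults to this host).
soffice_bin() {
    case "${1:-$(uname -s)}" in
        Darwin) echo "/Applications/LibreOffice.app/Contents/MacOS/soffice" ;;
        *)      echo "soffice" ;;   # assumption: soffice is on the PATH
    esac
}

# Hypothetical helper: odt_to_pdf FILE.odt [FILE.odt ...]
# Set DRY_RUN=1 to print the command instead of running it.
odt_to_pdf() {
    bin=$(soffice_bin)
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$bin --headless --convert-to pdf $*"
    else
        "$bin" --headless --convert-to pdf "$@"
    fi
}
```

Running odt_to_pdf filename.odt should leave filename.pdf in the current directory; soffice also accepts an --outdir option if the PDFs belong somewhere else.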
Some quick grepping, however, indicates that the default ODF output is entirely XSL driven, so chances are modifications will involve adjusting the ./xsl/xslodt/dita2odtstyles.xsl file in the plugin. This may be an issue if, for example, you think using Courier for footnotes is just annoying when all the rest of the text uses a variable-width sans serif font.
If XSL editing is to be avoided at all costs, then convert to DOCX, DocBook or a single-page HTML file (if you leave the pages separate you will have a lot of typing to do) and use that with pandoc, as pandoc can accept an ODF template on the command line with its --reference-odt option. Using this will completely override existing formatting, so if you just want to adjust the current template's chosen fonts in the DITA-OT plugin via the pandoc method, use LibreOffice to create a new template based on the style of a document generated with the DITA-OT plugin. (It's possible within LO to import styles from any file, whether it's a template or not, and use them to make a new template - which means if you want to nick the style used in the LibreOffice documentation then you can.)
Since LibreOffice can't use an ODF template via the headless command (very annoying, but can't be helped yet), the prize here goes to the first person or group who converts the existing DITA Map to ODF plugin to utilise an ODF template or, more likely, creates an ODF to DITA transformation which includes an option to generate XSL files to replace the current style XSL files (so ODF to XSL via an XSLT), since there's no real difference between an ODT and an OTT other than how the office suite(s) treat them. The pandoc variant will extend this to work with DOCX and its template files too, but I only use DOCX as a stepping stone between various formats (it appears to be the "gateway drug" for all file formats, at least until DITA can cover everything).
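The pandoc route above might look something like the following sketch. The reference option has been renamed over time: pandoc 1.x spells it --reference-odt (and --reference-docx for DOCX), while pandoc 2.0 and later use a single --reference-doc. The function names and the DRY_RUN switch are invented for illustration, and the version number is passed in by hand rather than detected.

```shell
#!/bin/sh
# Sketch: style a pandoc-generated ODT from a reference document.

# Pick the reference flag for a pandoc version string (1.x vs 2.0 and later).
reference_flag() {
    case "$1" in
        1|1.*) echo "--reference-odt" ;;
        *)     echo "--reference-doc" ;;
    esac
}

# Hypothetical helper:
#   html_to_odt PANDOC_VERSION INPUT.html REFERENCE.odt OUTPUT.odt
# Set DRY_RUN=1 to print the command instead of running it.
html_to_odt() {
    flag=$(reference_flag "$1")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "pandoc $2 $flag=$3 -o $4"
    else
        pandoc "$2" "$flag=$3" -o "$4"
    fi
}
```

The resulting ODT can then be handed to soffice --headless --convert-to pdf as before, which gets around the headless command not taking a template itself.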
Anyway, the important point is that it does work and will work on any system with either LibreOffice or OpenOffice.org and possibly with AbiWord (I haven't used that one, so can't confirm). For my part I'm trying to avoid PDF where possible (in part because I started reading the spec earlier this year for reasons better left to the mists of time), so I'd just go for the quick and easy conversion to ODT and then adjust the styles before printing or exporting to PDF. Either that or spend a day or two fiddling to get the built-in plugin to use better fonts and then leave it as is. It probably needs some tweaking anyway, since it didn't include things like a page break between the cover page and the copyright page, which was a little surprising. No doubt the more time spent on it the less suckage it will spout (to a limited extent with PDF as the end point).
Regards, Ben
P.S. You can cheat by generating an ePub and then converting that to PDF with the consumer's choice for file conversion: Calibre. Just don't rely on it because there are zero guarantees; it was never intended for anything other than managing personal ebook collections and shifting them between devices (which is why I turned up in the first place, well, that and Sigil breaking ePub 3.0 files). I'd use it to test drafts and the like, but never in production.
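For draft checking only, the Calibre cheat can be scripted with ebook-convert, the command-line converter that ships with Calibre. A minimal sketch, assuming ebook-convert is on the PATH; the function name and the DRY_RUN switch are invented for illustration.

```shell
#!/bin/sh
# Sketch: convert an ePub to PDF with Calibre's ebook-convert (drafts only).

# Hypothetical helper: epub_to_pdf INPUT.epub [OUTPUT.pdf]
# OUTPUT defaults to INPUT with its .epub suffix replaced by .pdf.
# Set DRY_RUN=1 to print the command instead of running it.
epub_to_pdf() {
    input=$1
    output=${2:-${input%.epub}.pdf}
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ebook-convert $input $output"
    else
        ebook-convert "$input" "$output"
    fi
}
```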