
I want to run a simple XSLT script against each of about 50,000 small RDF files, sending the output to a single file. I can do it at the command line, but I'd like it to complete within my lifetime, or preferably before lunch. Is there a better way than saying "for f in *.rdf; do saxon $f mystylesheet.xsl >> output.xml; done" ?

Hi Lou,

Each XSLT transformation scenario will output to its own file location. So you would still need to merge all those output files.

How about if your XSLT uses the "collection" function to load all XML documents in a certain folder and then use those nodes to produce the output? Something like James Cummings does here: https://blogs.it.ox.ac.uk/jamesc/2009/02/10/xslt2-collection-with-dynamic-co...

Regards,
Radu

Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com

On 3/12/2018 2:54 PM, Lou Burnard wrote:
I want to run a simple XSLT script against each of about 50,000 small RDF files, sending the output to a single file. I can do it at the command line, but I'd like it to complete within my lifetime, or preferably before lunch. Is there a better way than saying "for f in *.rdf; do saxon $f mystylesheet.xsl >> output.xml; done" ?
_______________________________________________ oXygen-user mailing list oXygen-user@oxygenxml.com https://www.oxygenxml.com/mailman/listinfo/oxygen-user
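
A minimal sketch of the collection() approach suggested above, not a tested recipe. It assumes Saxon 9.x-style command-line options, XSLT 2.0, Saxon's support for a directory URI with a ?select= pattern in collection(), and that mystylesheet.xsl does its work in templates that can be applied to each document. The wrapper name merge.xsl, the dir parameter, the <results> wrapper element and the /path/to/rdf location are all hypothetical.

```shell
#!/bin/sh
# Write a hypothetical wrapper stylesheet, merge.xsl, that imports the
# existing mystylesheet.xsl and applies it to every *.rdf file in one run.
cat > merge.xsl <<'EOF'
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="mystylesheet.xsl"/>
  <!-- Directory URI holding the RDF files (assumed location) -->
  <xsl:param name="dir" select="'file:///path/to/rdf'"/>
  <xsl:template name="main">
    <results>
      <!-- Saxon-specific: ?select= filters the files in the directory -->
      <xsl:apply-templates select="collection(concat($dir, '?select=*.rdf'))"/>
    </results>
  </xsl:template>
</xsl:stylesheet>
EOF

# One Saxon run, hence one JVM start-up; -it: names the initial template.
# 'saxon' is the same wrapper command as in the original loop.
if command -v saxon >/dev/null 2>&1 && [ -f mystylesheet.xsl ]; then
    saxon -it:main -xsl:merge.xsl -o:output.xml
fi
```

Because everything happens inside a single transformation, the results end up under one root element, so no separate merge step is needed.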

Hi Radu,

Well, not quite, since the >> appends each transformation's output to the same file, but that doesn't speed anything up.

Thanks for the tip about using the collection function though! Will give it a try. After lunch.

Lou

On 12/03/18 13:47, Oxygen XML Editor Support (Radu Coravu) wrote:
Hi Lou,
Each XSLT transformation scenario will output to its own file location. So you would still need to merge all those output files.

I think what’s expensive here is that you spin up a JVM for every file. You can prevent this by passing a directory to the Saxon CLI as the input parameter, which should be way faster. (You’d then need to concatenate the results in a second step.)

Best,
Peter
-- Peter Stadler Carl-Maria-von-Weber-Gesamtausgabe Arbeitsstelle Detmold Hornsche Str. 39 D-32756 Detmold Tel. +49 5231 975-676 Fax: +49 5231 975-668 stadler at weber-gesamtausgabe.de www.weber-gesamtausgabe.de
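
A rough sketch of the two steps described above, under stated assumptions: Saxon 9.x-style options, where -s: may name a directory and -o: must then name a directory as well; the directory names rdf and out are hypothetical; and find -print0, sort -z and xargs -0 are assumed to be available (GNU and BSD tools provide them).

```shell
#!/bin/sh
# Step 1: one JVM for the whole batch. With a directory as -s:, Saxon
# writes one result file per input file into the -o: directory.
run_batch() {
    mkdir -p out
    saxon -s:rdf -xsl:mystylesheet.xsl -o:out
}

# Step 2: concatenate the per-file results in sorted order.
# find/xargs avoids "argument list too long" with ~50,000 files.
# $1 = directory of results, $2 = merged output file (outside $1).
concat_results() {
    find "$1" -type f -print0 | sort -z | xargs -0 cat > "$2"
}

# Only run when the 'saxon' wrapper and the input directory exist.
if command -v saxon >/dev/null 2>&1 && [ -d rdf ]; then
    run_batch && concat_results out output.xml
fi
```

As with the original >> loop, plain concatenation does not produce a single well-formed XML document if each result has its own root element or XML declaration, so the merged file may need wrapping or post-processing.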

Now that *is* really cool. Many thanks (again), Peter.

On 12/03/18 13:54, Peter Stadler wrote:
I think what’s expensive here is that you spin up a JVM for every file. You can prevent this by passing a directory to the Saxon CLI as input parameter which should be way faster. (You’d need to concat the results in a second step, then)

Check this previous email; I think it is what you are looking for. You can use XProc to batch process each file in a directory.

https://www.oxygenxml.com/pipermail/oxygen-user/2009-October/002831.html

Loren Cahlander
participants (4)
- Loren Cahlander
- Lou Burnard
- Oxygen XML Editor Support (Radu Coravu)
- Peter Stadler