
Searching for data is also really slow. The kind of thing we want to do is to locate, say, a student with a particular HUSID to check data errors discovered for that student's entry.
Entering an XPath query like (with an, obviously, invalid HUSID):
(/Institution/Student/HUSID[.='999999999999'])
... took several minutes, with the interface freezing after the entry of each forward slash.
If the XML is basically a repeating structure, you will find splitting it into smaller chunks is more manageable that handling one massive file. Equally, if you want run XPaths against it then you should really be using an XML database (such as eXist), or investigate streaming options. -- Andrew Welch http://andrewjwelch.com Kernow: http://kernowforsaxon.sf.net/