Yes, I wouldn't expect the grammars to chew up gigabytes. I'll provide a test data set for you.
Cheers,
E.
-- Eliot Kimber http://contrext.com
On 5/14/18, 12:45 PM, "Christian Grün" christian.gruen@gmail.com wrote:
I would have expected some MBs to be sufficient for parsing even complex DTDs if nothing is cached (but caching could definitely speed up processing), so maybe there’s still something that we could have a look at. If you are interested, feel free to provide me with your files via a private message.
On Mon, May 14, 2018 at 7:40 PM, Eliot Kimber ekimber@contrext.com wrote:
> Yes, I would want caching on by default with the option to turn it off. I'm assuming it's currently not turned on (but to be honest I haven't taken the time to check the source code).
>
> Certainly for DITA content, grammar caching is the only practical way to parse a large number of topics in the same JVM without both using lots of memory and paying the avoidable cost of re-processing the grammar files for each document.
>
> DITA is probably somewhat unique in this regard because it takes such a different approach to grammar organization and use than pretty much any other XML application.
>
> Cheers,
>
> E.