Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

14 May 2018

Good to know; I’ll record this as positive news ;) Feel free to give
me an update if you encounter similar behavior again.

On Mon, May 14, 2018 at 8:40 PM, Eliot Kimber <ekimber@contrext.com> wrote:
...
Hmm.
While testing my test data set, I can't reproduce the earlier behavior.
In my current tests, using the same data and the same BaseX version, I get a maximum of maybe 1 GB for the largest file but just a few hundred MB once everything is loaded.
For 3800 topics of roughly 50K each (on average), it takes just a couple of seconds to load them with no DTDs and a minute or so with DTDs, which is consistent with the time cost of reparsing the (large) DITA grammars for each topic.
I'm not sure what was happening when I tried this before, but I have definitely rebooted and installed macOS updates since then, so it could have been some Java issue or who knows what else.
The good news is that even without grammar caching the DITA topics do load in a reasonable (if not ideal) amount of time and with appropriate memory usage.
Cheers,
E.
--
Eliot Kimber
http://contrext.com
On 5/14/18, 12:53 PM, "Eliot Kimber" <basex-talk-bounces@mailman.uni-konstanz.de on behalf of ekimber@contrext.com> wrote:
Yes, I wouldn't expect the grammars to chew up gigabytes. I'll provide a test data set for you.

Cheers,

E.

--
Eliot Kimber
http://contrext.com

On 5/14/18, 12:45 PM, "Christian Grün" <christian.gruen@gmail.com> wrote:

    I would have expected some MBs to be sufficient for parsing even
    complex DTDs if nothing is cached (but caching could definitely speed
    up processing), so maybe there’s still something that we could have a
    look at. If you are interested, feel free to provide me with your
    files via a private message.

    On Mon, May 14, 2018 at 7:40 PM, Eliot Kimber <ekimber@contrext.com> wrote:
    > Yes, I would want caching on by default with the option to turn it off. I'm assuming it's currently not turned on (but to be honest I haven't taken the time to check the source code).
    >
    > Certainly for DITA content, grammar caching is the only practical way to parse a large number of topics in the same JVM without both using lots of memory and paying the avoidable cost of re-processing the grammar files for each document.
    >
    > DITA is probably somewhat unique in this regard because it takes such a different approach to grammar organization and use than pretty much any other XML application.
    >
    > Cheers,
    >
    > E.
