Re: [basex-talk] Navigating a DOM

23 Oct 2012


Dear Rainer,
...
I have a really large XML file which does not fit into memory, and I
would like to navigate it as a DOM. My hope was that I could store it as
a BaseX database, retrieve the root element as an org.w3c.dom.Node, and
then start navigating down and up the DOM as needed without having to
hold the whole document in memory.
By accident, a previous version of BaseX did exactly what you
describe. In more recent versions, the DOM node is completely
materialized in memory, because lazy processing caused too many
unwanted side effects regarding concurrency and node caching.
While the resulting representation takes less space than the original
Java DOM representation, and is faster in many cases, it still takes
about 2-3 times the size of the textual representation.
What you can do, however, and what we regularly do, is to use our
internal node representation. A small example follows:
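    // Create a database context and a query that returns the root element
    Context context = new Context();
    QueryProcessor processor =
        new QueryProcessor("doc('catalog')/*", context);
    // Register the query so that no write operation modifies the data meanwhile
    context.register(processor);
    // Iterate over the result; only the first item is requested here
    Iter iter = processor.iter();
    Item item = iter.next();
    if(item instanceof ANode) {
      ANode node = (ANode) item;
      System.out.println("Name: " + node.qname());
      // Visit the direct children of the root element
      for(final ANode child : node.children()) {
        System.out.println("- Child: " + child);
      }
    }
    // Close the processor, unregister the query, and close the context
    processor.close();
    context.unregister(processor);
    context.close();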
Please remember to close the processor after having requested all
nodes; otherwise, the database will be kept open. Using
context.register(), you can be sure that no other write operation will
modify your data as long as you're requesting it. If concurrency is no
issue, feel free to remove the (un)register calls.
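If a query can fail halfway, the cleanup calls are safer inside finally blocks, so that the database is not kept open by accident. The following is only a minimal sketch of that ordering. FakeProcessor and FakeContext are hypothetical stand-ins written for this illustration, not BaseX classes; they merely record which call happened when, so the sketch runs with a plain JDK.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanupOrder {
    /** Hypothetical stand-in for a query processor; NOT a BaseX class. */
    static final class FakeProcessor {
        private final List<String> log;
        FakeProcessor(List<String> log) { this.log = log; }
        /** Simulates a query that fails while the first item is requested. */
        String next() { throw new IllegalStateException("query failed"); }
        void close() { log.add("processor.close"); }
    }

    /** Hypothetical stand-in for the database context; NOT a BaseX class. */
    static final class FakeContext {
        private final List<String> log;
        FakeContext(List<String> log) { this.log = log; }
        void register(FakeProcessor p) { log.add("register"); }
        void unregister(FakeProcessor p) { log.add("unregister"); }
        void close() { log.add("context.close"); }
    }

    /** Runs a failing query and returns the order in which the steps ran. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        FakeContext context = new FakeContext(log);
        try {
            FakeProcessor processor = new FakeProcessor(log);
            context.register(processor);
            try {
                processor.next();              // throws in this sketch
            } finally {
                // Same order as in the example above: close, then unregister
                processor.close();
                context.unregister(processor);
            }
        } catch (IllegalStateException ex) {
            log.add("caught: " + ex.getMessage());
        } finally {
            // The context is closed last, whether or not the query failed
            context.close();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With the real classes, the same two nested try/finally blocks would wrap the calls shown in the example; whether this fits your BaseX version is something to check against its JavaDoc.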
...
And I had quite a tough time fiddling around with the documentation and
with the JavaDoc. While the documentation puts a lot of effort into
XQuery, it remains unclear to some extent how to do some basic stuff
with BaseX programmatically. This is a hurdle for the BaseX beginner.
Absolutely true; our documentation is rather sparse when it comes to
our internal low-level API, and we are well aware that many of our
users would benefit from some more brain food regarding our architecture.
As a matter of fact, writing good documentation takes a lot of
resources, which is why we are always thankful for external
contributions.
Still, we are doing our best to document our source code as well as
possible. It may help a lot when you want to go beyond our high-level
APIs, such as the client APIs and XQJ.
Christian
