Hi Martin,
Yes, a full XDM tree will be created if an input is supplied via -i on the command line. The resulting XDM representation is definitely more lightweight than a standard DOM tree (it’s an array-based main-memory version of the database storage), but obviously there’ll always be a limit.
There is one way out, though, and it’s powerful: You can create a temporary database instance [1] and drop it after the evaluation of your query:
basex.bat -c "create db tmp input.xml" query.xq -c "drop db tmp"
Some query optimizations are only performed on databases, so your query might be executed even faster than before.
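If you run this regularly, the create / query / drop sequence can be wrapped in a small shell function. This is only a sketch, and it assumes a Unix-like shell: BASEX is an assumed variable that names the launcher script (basex.bat in the command above, basex elsewhere), and run_with_tmp_db is a hypothetical helper name, not something that ships with BaseX. It passes exactly the same arguments, in the same order, as the one-liner.

```shell
#!/bin/sh
# Sketch only: wraps the create / query / drop sequence shown above.
# BASEX is an assumed variable naming the BaseX launcher script;
# run_with_tmp_db is a hypothetical helper name.
BASEX="${BASEX:-basex}"

run_with_tmp_db() {
  input="$1"        # XML document to load into the temporary database
  query="$2"        # XQuery file to evaluate against it
  db="${3:-tmp}"    # database name, defaults to "tmp"

  # Same argument order as the one-liner: create, run the query, drop.
  "$BASEX" -c "create db $db $input" "$query" -c "drop db $db"
}

# Example call:
#   run_with_tmp_db input.xml query.xq
```

The input path is embedded in the CREATE DB command string as is, so a path containing spaces would need extra care.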
Hope this helps, Christian
[1] http://docs.basex.org/wiki/Commands#CREATE_DB
On Wed, Oct 2, 2019 at 3:59 PM Martin Honnen <martin.honnen@gmx.de> wrote:
If I use BaseX (9.2.4) simply as an XQuery 3.1 processor from the command line with e.g.
basex.bat -i input.xml query.xq
does BaseX then always first parse input.xml into an XDM tree, meaning that with huge input documents I can easily run out of memory?
Or does that depend on the type of query or some other settings?
I was wondering whether a "tumbling window" based split algorithm like
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method 'xml';
declare option output:indent 'yes';
declare variable $chunk-size as xs:integer external := 500;
for tumbling window $chunk in /*/*
  start at $sp when $sp mod $chunk-size = 1
  count $p
return put(
  document { element { node-name(head($chunk)/..) } { $chunk } },
  'xquery-split-result-' || $p || '.xml'
)
would run/work without memory problems for huge inputs.