The snapshot executes both versions of my recursive function (one with 'element()*' and one with 'item()*' for the sequence of elements) equally fast, which is to say in about 4s. I verified that 9.1.2 executes the 'item()*' one in about the same time, but the 'element()*' one drags on for a few minutes before I stop it.
Note that I exit & restart BaseX between tests.
Thanks Christian!
On Fri, Apr 12, 2019 at 1:12 PM Christian Grün christian.gruen@gmail.com wrote:
A new snapshot is online! Looking forward to feedback.
On Thu, Apr 11, 2019 at 12:10 AM Chuck Bearden cfbearden@gmail.com wrote:
BaseX is a great tool for analyzing & characterizing large amounts of XML data. I have used it both at work and on personal projects. I hope the following observation is useful.
When I define a function that recurs over a sequence of elements in order to build a map of element name counts, I find that when I specify the type of the element sequence as 'element()*', the function runs so slowly that I give up after 5 minutes or so. But when I specify the type as 'item()*', it finishes in 40 seconds or less. Here's an example:
-----begin code snippet-----
declare namespace local="w00fw00f";

declare function local:count($elems as element()*, $elem_counts as map(*)) as map(*) {
  let $elem := head($elems),
      $elem_name := $elem/name(),
      $elems_new := tail($elems),
      $elem_name_count :=
        if (map:contains($elem_counts, $elem_name))
        then map:get($elem_counts, $elem_name) + 1
        else 1,
      $elem_counts_new := map:put($elem_counts, $elem_name, $elem_name_count)
  return
    if (count($elems_new) = 0)
    then $elem_counts_new
    else local:count($elems_new, $elem_counts_new)
};

let $coll := collection('pure_20190402'),
    $elems := $coll/result/items/*,
    $elem_names_map := local:count($elems, map {})
return json:serialize($elem_names_map, map {'format' : 'xquery'})
-----end code snippet-----
In the function declaration, changing "$elems as element()*" to "$elems as item()*" makes the difference in performance. Replacing the JSON serialization with a standard XML one does not change the performance. I am running BaseX 9.1.2 under Ubuntu 16.04.6.
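To make the comparison concrete, here is a minimal sketch of the fast variant. The function body is the same as above; only the declared type of $elems differs. It assumes the predeclared 'local' prefix instead of my custom namespace, and the three inline elements are made-up test input so that it runs without my collection.
-----begin code snippet-----
(: Sketch only: same logic as local:count above, with $elems typed as item()*. :)
declare function local:count($elems as item()*, $elem_counts as map(*)) as map(*) {
  let $elem := head($elems),
      $elem_name := $elem/name(),
      $elems_new := tail($elems),
      $elem_name_count :=
        if (map:contains($elem_counts, $elem_name))
        then map:get($elem_counts, $elem_name) + 1
        else 1,
      $elem_counts_new := map:put($elem_counts, $elem_name, $elem_name_count)
  return
    if (count($elems_new) = 0)
    then $elem_counts_new
    else local:count($elems_new, $elem_counts_new)
};

(: Hypothetical stand-in for $coll/result/items/* :)
local:count((<a/>, <b/>, <a/>), map {})
-----end code snippet-----
For this input the resulting map should have a count of 2 for "a" and 1 for "b".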
All the best, Chuck Bearden