On Sat, Apr 16, 2022 at 09:10:16PM +0200, Markus Elfring scripsit:
> How do you think about to take another look at the development challenges according to growing numbers of binding sequences and corresponding join conditions?
I think there are usually three steps:
> Did you stumble on bigger numbers of binding sequences for which you would like to join some information?
Generally the pattern for that is:
1 process each XPath expression into a sequence of maps; ideally there's a common function you pass a sequence of nodes, but
declare function local:mapify($found as node()*) as map(*)* { ... stuff happens.... };
can have per-expression variants if necessary.
2 now you've got multiple sequences of maps; make it one sequence using the comma operator or by processing all your XPath expressions as inputs to something that gets the node sequence from the various XPath expressions for you.
For purposes of this example, call that combined sequence of all the sequences of maps $everything which has type map(*)*.
3 merge the combined sequence into a single map:
let $together as map(*) := map:merge(
  for $key in ($everything ! map:keys(.)) => distinct-values()
  (: map:contains rather than "eq": a map can have several keys, and "eq" needs a single value :)
  return map:entry($key, $everything[map:contains(., $key)] ! .($key))
)
(syntax warning; I typed that, I didn't run it.)
So you wind up with one map referencing everything your XPath expressions found.
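The numbered steps above can be sketched end to end as follows. Treat it as a minimal sketch and not as the one way to do it: the sample elements, the @id attribute used as the key, and the body of local:mapify are all made up for illustration. The map prefix is declared explicitly only so the query does not depend on a processor predeclaring it (BaseX does).

```xquery
xquery version "3.1";

declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

(: Hypothetical helper: one single-entry map per node, keyed on an assumed @id. :)
declare function local:mapify($found as node()*) as map(*)* {
  for $node in $found
  return map { string($node/@id) : $node }
};

(: Made-up inputs standing in for what two separate XPath expressions found. :)
let $a := <a><item id="k1">one</item><item id="k2">two</item></a>
let $b := <b><entry id="k2">deux</entry><entry id="k3">trois</entry></b>

(: Steps 1 and 2: mapify each result, then join the sequences with the comma operator. :)
let $everything as map(*)* := (local:mapify($a/item), local:mapify($b/entry))

(: Step 3: one entry per distinct key, holding every value found under that key. :)
let $together as map(*) := map:merge(
  for $key in distinct-values($everything ! map:keys(.))
  return map:entry($key, $everything[map:contains(., $key)] ! .($key))
)

(: Show each key with the string values of the nodes collected under it. :)
for $key in map:keys($together)
order by $key
return $key || ': ' || string-join($together($key) ! string(.), ', ')
```

With these sample inputs the key k2 should end up holding both the item from $a and the entry from $b, which is the "join" part of the exercise.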
This won't inherently keep information like where the node came from in the map, but it's still the original node by reference; something like base-uri() can tell you where the node originates if you need to know that.
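As a small illustration of the base-uri() point, here is a sketch with two constructed nodes that carry made-up xml:base values. With nodes from real documents or a database you would not set xml:base yourself; the processor supplies the base URI.

```xquery
xquery version "3.1";

(: Two hypothetical nodes sharing a key but "coming from" different places. :)
let $nodes := (
  <item xml:base="http://example.org/first.xml" id="k1"/>,
  <item xml:base="http://example.org/second.xml" id="k1"/>
)

(: A merged map like $together, reduced to a single key for brevity. :)
let $together := map { 'k1' : $nodes }

(: The map holds the nodes themselves, so base-uri() still works on them. :)
return $together?k1 ! base-uri(.)
```

This should report the two example.org URIs in order, one per node stored under k1.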
If all you want is the result from the mapify function, you've got all of them and can do any subsequent processing that's appropriate.
> Would you like to share any ideas for further extensions of the involved programming interfaces?
Have you got a specific problem?
> I am looking for another bit of clarification according to a general data processing task like joining information from several sources (when their size and number would become remarkable).
First question with BaseX is "do I already have the database, or am I creating it?"
It's hard to have so much data you need to get clever. Generally, creating multiple databases will solve most scale problems.
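If the data is already split across several databases, one way to feed all of them into the same pattern is to loop over the database names. This is only a sketch: the names part1 and part2 and the element name record are hypothetical, and it relies on BaseX resolving collection() against a database of that name.

```xquery
xquery version "3.1";

(: Hypothetical database names; replace with databases that actually exist. :)
for $name in ('part1', 'part2')
(: collection($name) returns the documents stored in that BaseX database. :)
return collection($name)//record
```

The resulting node sequence can be passed to something like local:mapify in the same way as the result of any other XPath expression.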