Thanks Arto for the interesting comparisons. One more rewriting for map2 could be:

  declare function local:map2($a, $b) {
    let $m2 := map:new($b ! map:entry(., true()))
    return $a[$m2(.)]
  };

...but my assumption is that all rewritings should yield similar performance, because the FLWOR expression, predicates and the map operator are internally evaluated in basically the same way.

@Andy:
  distinct-values(($arg1, $arg2))
  distinct-values($arg1[.=$arg2])
  distinct-values($arg1[not(.=$arg2)])
The first query should be fast enough, but it may be beneficial to optimize the (non-)equality test in the next two queries. I’m not sure, though, if this optimization can be generalized that easily.
___________________________
2013/9/26 Arve Gengelbach <ag@basex.org>:
I had a typo in my example. The return did not call the map2 version. With it, the results were (now on 7.7 and Windows 7)
And I posted the wrong version. Using ! within map:new() is, in general, not as fast as a FLWOR expression.
BUT neither of the “fastest two” functions yields valid results (e.g. take 2,2,3 for $a and the empty sequence for $b).

Arto, I hope replacing one FLWOR expression with the simple map operator can improve speed in your use cases. So hopefully this is the fastest for you:
  declare function local:difference-maps($a, $b) {
    let $m1 := map:new(for $i in $a return map:entry($i, true()))
    let $m2 := map:new(for $i in $b return map:entry($i, false()))
    let $m3 := map:new(($m1, $m2))
    return map:keys($m3) ! (if ($m3(.)) then . else ())
  };
As the speed gain depends on the distribution of the data, any rewriting of set operations needs proof of efficiency for the given data.
_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk