Thanks Arto for the interesting comparisons.
One more rewriting for map2 could be..
declare function local:map2($a, $b) {
  let $m2 := map:new($b ! map:entry(., true()))
  return $a[$m2(.)]
};
..but my assumption is that all rewritings should yield similar performance, because FLWOR expressions, predicates and the map operator are internally evaluated in basically the same way.
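To make the comparison concrete, here is a minimal sketch of the same filter written in the three forms mentioned above. The sample values for $a and $b are hypothetical, and the sketch assumes the BaseX 7.x map module (map:new, map:entry) used elsewhere in this thread.

```xquery
(: Hypothetical sample data, chosen only for illustration. :)
let $a := (2, 2, 3)
let $b := (2, 4)
(: Lookup map: every item of $b becomes a key with the value true(). :)
let $m2 := map:new($b ! map:entry(., true()))
return (
  (: FLWOR expression: keep the items of $a that are keys in $m2. :)
  <flwor>{ for $x in $a where $m2($x) return $x }</flwor>,
  (: Predicate: same filter; a missing key yields the empty sequence, i.e. false. :)
  <predicate>{ $a[$m2(.)] }</predicate>,
  (: Simple map operator: same filter, spelled out with if/else. :)
  <map-operator>{ $a ! (if ($m2(.)) then . else ()) }</map-operator>
)
```

All three forms should select the same items of $a (here both occurrences of 2, since nothing removes duplicates), which is why I would not expect large differences between them.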
@Andy:
distinct-values(($arg1, $arg2))
distinct-values($arg1[.=$arg2])
distinct-values($arg1[not(.=$arg2)])
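For reference, the three queries above compute the union, the intersection and the difference of the two sequences. A small sketch with hypothetical sample data (the wrapper element names are only there to label the results):

```xquery
(: Hypothetical sample data, chosen only for illustration. :)
let $arg1 := (1, 2, 2, 3)
let $arg2 := (2, 4)
return (
  (: Union: every value that occurs in either sequence. :)
  <union>{ distinct-values(($arg1, $arg2)) }</union>,
  (: Intersection: values of $arg1 that also occur in $arg2. :)
  <intersection>{ distinct-values($arg1[. = $arg2]) }</intersection>,
  (: Difference: values of $arg1 that do not occur in $arg2. :)
  <difference>{ distinct-values($arg1[not(. = $arg2)]) }</difference>
)
```

With this input the union contains 1, 2, 3 and 4, the intersection contains 2, and the difference contains 1 and 3. The order in which distinct-values() returns them is implementation-dependent. Note that `. = $arg2` is a general comparison of each item of $arg1 against the whole of $arg2, which is where the last two queries can get expensive for large inputs.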
The first query should be fast enough, but it may be beneficial to optimize the (non-)equality test in the other two queries. I’m not sure, though, whether this optimization can be generalized that easily.
___________________________
2013/9/26 Arve Gengelbach ag@basex.org:
I had a typo in my example. The return did not call the map2 version. With it, the results were (now on 7.7 and Windows 7)
And I posted the wrong version. Using ! within map:new() is not as fast as a FLWOR expression (in general).
BUT neither of the “fastest two” functions yields valid results (e.g. take 2,2,3 for $a and the empty sequence for $b).
Arto, I hope replacing one FLWOR expression with the simple map operator can improve speed in your use cases. So hopefully this is the fastest for you:
declare function local:difference-maps($a, $b) {
  let $m1 := map:new(for $i in $a return map:entry($i, true()))
  let $m2 := map:new(for $i in $b return map:entry($i, false()))
  let $m3 := map:new(($m1, $m2))
  return map:keys($m3) ! (if ($m3(.)) then . else ())
};
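A short usage sketch for the function above, using the edge case mentioned earlier (2,2,3 for $a and the empty sequence for $b) plus one hypothetical second call. It assumes the BaseX 7.x map module and that map:new() lets entries of later maps override entries of earlier ones, which the function relies on when it merges $m1 and $m2.

```xquery
declare function local:difference-maps($a, $b) {
  (: Mark every item of $a as "keep" ... :)
  let $m1 := map:new(for $i in $a return map:entry($i, true()))
  (: ... and every item of $b as "drop". :)
  let $m2 := map:new(for $i in $b return map:entry($i, false()))
  (: Merge both maps; assumption: entries of $m2 replace those of $m1. :)
  let $m3 := map:new(($m1, $m2))
  (: Return only the keys that are still marked "keep". :)
  return map:keys($m3) ! (if ($m3(.)) then . else ())
};

(
  (: Edge case from the thread: duplicates in $a, empty $b. :)
  <case1>{ local:difference-maps((2, 2, 3), ()) }</case1>,
  (: Hypothetical second input with overlapping values. :)
  <case2>{ local:difference-maps((1, 2, 3), (2, 4)) }</case2>
)
```

Because the items become map keys, duplicates collapse: the first call should yield 2 and 3 once each, and the second call 1 and 3 under the override assumption stated above. The order of map:keys() is not guaranteed.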
As the speed gain depends on the distribution of the data, any rewriting of set operations needs proof of efficiency for the given data.
_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk