I am using BaseX 8.6.4 and I am trying to do a group-by/order-by operation, and I see that two logically equivalent queries perform very differently: one cannot see the end, while the other can (and fast). I can provide further details if necessary, but these are the queries (look at the last line of both, where the difference is):
(This does not work:)
declare variable $p := db:open("db2")/text/line/text(); (: returns a list of values like ΠΕΡΙ#n-s---mv- :)
for $u in db:open("db1")/text/s/t let $a := $u/f/text() (: returns one value like ΠΕΡΙ :) let $b := $u/@o/data(.) (: returns one value like n-s---mv- :) group by $k := $a || "#" || $b (: builds a value like ΠΕΡΙ#n-s---mv- :) let $n := count($u) order by $n descending return if ($p = $k) then $k || " " || $n else ()
(This works:)
for $u in db:open("db1")/text/s/t let $a := $u/f/text() (: returns one value like ΠΕΡΙ :) let $b := $u/@o/data(.) (: returns one value like n-s---mv- :) group by $k := $a || "#" || $b (: builds a value like ΠΕΡΙ#n-s---mv- :) let $n := count($u) order by $n descending return if (db:text("db2", $k)) then $k || " " || $n else ()
The problem is with directly accessing the database with db:text() or indirectly using XPATH. I would tend to use XPATH and expect to get an underlying translation into db:text(), but this seems not to happen: why?
Best, Giuseppe
Bonjour Giuseppe,
My first guess is that count($p) is a very very huge number ?
In the first query you will scan the $p sequence at each iteration. In the second one you will directly access the text index.
Hoping it helps
Best regards, Fabrice
-----Message d'origine----- De : basex-talk-bounces@mailman.uni-konstanz.de [mailto:basex-talk-bounces@mailman.uni-konstanz.de] De la part de Giuseppe Celano Envoyé : mardi 19 septembre 2017 15:35 À : BaseX Objet : [basex-talk] db:text() vs XPath
I am using BaseX 8.6.4 and I am trying to do a group-by/order-by operation, and I see that two logically equivalent queries perform very differently: one cannot see the end, while the other can (and fast). I can provide further details if necessary, but these are the queries (look at the last line of both, where the difference is):
(This does not work:)
declare variable $p := db:open("db2")/text/line/text(); (: returns a list of values like ΠΕΡΙ#n-s---mv- :)
for $u in db:open("db1")/text/s/t let $a := $u/f/text() (: returns one value like ΠΕΡΙ :) let $b := $u/@o/data(.) (: returns one value like n-s---mv- :) group by $k := $a || "#" || $b (: builds a value like ΠΕΡΙ#n-s---mv- :) let $n := count($u) order by $n descending return if ($p = $k) then $k || " " || $n else ()
(This works:)
for $u in db:open("db1")/text/s/t let $a := $u/f/text() (: returns one value like ΠΕΡΙ :) let $b := $u/@o/data(.) (: returns one value like n-s---mv- :) group by $k := $a || "#" || $b (: builds a value like ΠΕΡΙ#n-s---mv- :) let $n := count($u) order by $n descending return if (db:text("db2", $k)) then $k || " " || $n else ()
The problem is with directly accessing the database with db:text() or indirectly using XPATH. I would tend to use XPATH and expect to get an underlying translation into db:text(), but this seems not to happen: why?
Best, Giuseppe
Hi!
My guess is: the same as the problems I described: The optimizer is unable to handle anything like such variables or functions. If you pass your db:open() around or get it from a declared variable the optimizer has no clue what to do and falls back to the slow default bahavoir. Workarount: use xquery:eval().
Best regards,
Omar
Am 19.09.2017 um 15:47 schrieb Fabrice ETANCHAUD:
Bonjour Giuseppe,
My first guess is that count($p) is a very very huge number ?
In the first query you will scan the $p sequence at each iteration. In the second one you will directly access the text index.
Hoping it helps
Best regards, Fabrice
-----Message d'origine----- De : basex-talk-bounces@mailman.uni-konstanz.de [mailto:basex-talk-bounces@mailman.uni-konstanz.de] De la part de Giuseppe Celano Envoyé : mardi 19 septembre 2017 15:35 À : BaseX Objet : [basex-talk] db:text() vs XPath
I am using BaseX 8.6.4 and I am trying to do a group-by/order-by operation, and I see that two logically equivalent queries perform very differently: one cannot see the end, while the other can (and fast). I can provide further details if necessary, but these are the queries (look at the last line of both, where the difference is):
(This does not work:)
declare variable $p := db:open("db2")/text/line/text(); (: returns a list of values like ΠΕΡΙ#n-s---mv- :)
for $u in db:open("db1")/text/s/t let $a := $u/f/text() (: returns one value like ΠΕΡΙ :) let $b := $u/@o/data(.) (: returns one value like n-s---mv- :) group by $k := $a || "#" || $b (: builds a value like ΠΕΡΙ#n-s---mv- :) let $n := count($u) order by $n descending return if ($p = $k) then $k || " " || $n else ()
(This works:)
for $u in db:open("db1")/text/s/t let $a := $u/f/text() (: returns one value like ΠΕΡΙ :) let $b := $u/@o/data(.) (: returns one value like n-s---mv- :) group by $k := $a || "#" || $b (: builds a value like ΠΕΡΙ#n-s---mv- :) let $n := count($u) order by $n descending return if (db:text("db2", $k)) then $k || " " || $n else ()
The problem is with directly accessing the database with db:text() or indirectly using XPATH. I would tend to use XPATH and expect to get an underlying translation into db:text(), but this seems not to happen: why?
Best, Giuseppe
Hi Giuseppe,
The following query may be faster:
for $u in db:open("db1")/text/s/t group by $k := $u/f || "#" || $u/@o where db:open("db2")/text/line[text() = $k] let $n := count($u) order by $n descending return $k || " " || $n
Cheers, Christian
On Tue, Sep 19, 2017 at 3:35 PM, Giuseppe Celano celano@informatik.uni-leipzig.de wrote:
I am using BaseX 8.6.4 and I am trying to do a group-by/order-by operation, and I see that two logically equivalent queries perform very differently: one cannot see the end, while the other can (and fast). I can provide further details if necessary, but these are the queries (look at the last line of both, where the difference is):
(This does not work:)
declare variable $p := db:open("db2")/text/line/text(); (: returns a list of values like ΠΕΡΙ#n-s---mv- :)
for $u in db:open("db1")/text/s/t let $a := $u/f/text() (: returns one value like ΠΕΡΙ :) let $b := $u/@o/data(.) (: returns one value like n-s---mv- :) group by $k := $a || "#" || $b (: builds a value like ΠΕΡΙ#n-s---mv- :) let $n := count($u) order by $n descending return if ($p = $k) then $k || " " || $n else ()
(This works:)
for $u in db:open("db1")/text/s/t let $a := $u/f/text() (: returns one value like ΠΕΡΙ :) let $b := $u/@o/data(.) (: returns one value like n-s---mv- :) group by $k := $a || "#" || $b (: builds a value like ΠΕΡΙ#n-s---mv- :) let $n := count($u) order by $n descending return if (db:text("db2", $k)) then $k || " " || $n else ()
The problem is with directly accessing the database with db:text() or indirectly using XPATH. I would tend to use XPATH and expect to get an underlying translation into db:text(), but this seems not to happen: why?
Best, Giuseppe
Hi Christian,
It works only if I substitute your where clause with
where db:text("db2", $k)
Ciao, Giuseppe
On Sep 19, 2017, at 4:15 PM, Christian Grün christian.gruen@gmail.com wrote:
where db:open("db2")/text/line[text() = $k]
Ciao Giuseppe,
Interesting indeed. I have improved the static typing of group by variables; it should work with the new snapshot [1]. If you want to stick with the current release, you can wrap the key with a string() function:
for $u in db:open("db1")/text/s/t group by $k := $u/f || "#" || $u/@o where db:open("db2")/text/line[text() = string($k)] let $n := count($u) order by $n descending return $k || " " || $n
Spero che questo aiuta, Christian
[1] http://files.basex.org/releases/latest/
On Tue, Sep 19, 2017 at 5:10 PM, Giuseppe Celano celano@informatik.uni-leipzig.de wrote:
Hi Christian,
It works only if I substitute your where clause with
where db:text("db2", $k)
Ciao, Giuseppe
On Sep 19, 2017, at 4:15 PM, Christian Grün christian.gruen@gmail.com wrote:
where db:open("db2")/text/line[text() = $k]
Yes, this works!
Thanks, Giuseppe
Universität Leipzig Institute of Computer Science, Digital Humanities Augustusplatz 10 04109 Leipzig Deutschland E-mail: celano@informatik.uni-leipzig.de E-mail: giuseppegacelano@gmail.com Web site 1: http://www.dh.uni-leipzig.de/wo/team/ Web site 2: https://sites.google.com/site/giuseppegacelano/
On Sep 19, 2017, at 5:48 PM, Christian Grün christian.gruen@gmail.com wrote:
for $u in db:open("db1")/text/s/t group by $k := $u/f || "#" || $u/@o where db:open("db2")/text/line[text() = string($k)] let $n := count($u) order by $n descending return $k || " " || $n
basex-talk@mailman.uni-konstanz.de