Performance issue due to evaluation during function composition?
Hello (and happy new year 2015!)

I am facing performance issues when opening "big" databases, caused by an argument evaluation during function composition that I did not expect. Is this normal behavior?

Cheers

Here is the code:

declare variable $db := db:open("MyBigDataBase");
declare function local:elts($db) { $db//element() };
declare function local:compose($f1 as function(*), $f2 as function(*)) as function(*) {
  function($a) { $f1($f2($a)) }
};
declare function local:count_elt($db) { count(local:elts($db)) };

prof:time(local:count_elt($db)),
prof:time(count(local:elts($db))),
prof:time(local:compose(count#1, local:elts#1)($db))

Output:

3.1 ms
0.02 ms
100047.53 ms
134666491 134666491 134666491
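The composition helper from the message above can be tried without a large database. The following is a minimal sketch, assuming a small in-memory document ($doc is a hypothetical stand-in for db:open("MyBigDataBase")); it only shows that the direct call and the composed call return the same count, not the timing difference.

```xquery
(: Hypothetical in-memory stand-in for the large database in the post. :)
declare variable $doc := document { <a><b/><c><d/></c></a> };

(: Returns all descendant elements of the given node. :)
declare function local:elts($n as node()) as element()* {
  $n//element()
};

(: Returns a function that applies $f2 first and then $f1. :)
declare function local:compose(
  $f1 as function(*),
  $f2 as function(*)
) as function(*) {
  function($a) { $f1($f2($a)) }
};

(: Both expressions count the same four elements: a, b, c and d. :)
count(local:elts($doc)),
local:compose(count#1, local:elts#1)($doc)
```

Both items of the result are 4. The composed version goes through a dynamic function call, which is the part an optimizer has to see through before it can simplify the expression.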
Hello Jean-Marc,

what version of BaseX did you use? I tried with the latest snapshot and the XMark dataset (with factor 80, which produces roughly the same number of elements as yours, 133653910 to be exact) and got the following output with an up-to-date index:

- 0.01 ms
- 0.01 ms
- 0.0 ms

If it isn't a bug in a previous version of BaseX, I would guess you reached the inline limit (http://docs.basex.org/wiki/Options#INLINELIMIT). You might want to try setting it to a higher value.

Cheers,
Dirk
--
Dirk Kirsten, BaseX GmbH, http://basex.org
|-- Firmensitz: Blarerstrasse 56, 78462 Konstanz
|-- Registergericht Freiburg, HRB: 708285, Geschäftsführer:
|   Dr. Christian Grün, Dr. Alexander Holupirek, Michael Seiferle
`-- Phone: 0049 7531 28 28 676, Fax: 0049 7531 20 05 22
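Regarding the inline limit mentioned in the reply above: BaseX allows global options to be assigned in the query prolog with the db: prefix. A minimal sketch, assuming this mechanism applies to INLINELIMIT in the version at hand; the value 1000 and the small in-memory document are illustrative assumptions, not settings recommended in the thread.

```xquery
(: BaseX-specific: raise the inlining limit for this query only.
   Assumption: the option is accepted in the prolog; 1000 is an arbitrary value. :)
declare option db:inlinelimit '1000';

(: Same helper as in the original post. :)
declare function local:elts($n) { $n//element() };

(: Hypothetical in-memory input; replace it with db:open("MyBigDataBase")
   to test against a real database. :)
count(local:elts(document { <a><b/></a> }))
```

Whether a higher limit changes the optimized query can be checked by comparing the query info output before and after the change.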
Hi Dirk,

You are right, I was using an old BaseX 8.0 version (apparently the one from 23/09/2014). The problem does not appear in the latest release. Thank you!
Dirk,

Hello, sorry, but very similar behavior also seems to appear in the latest version, BaseX80-20141225.174535. Here is new code to test:

declare variable $BigDb := AppParam:DBInputOpen($ConfigDoc);
declare function local:dummy($db) { local:call(local:count_elements#1, $db) };
declare function local:count_elements($db) { count($db/descendant-or-self::element()) };
declare function local:call($fun, $args) { $fun($args) };

prof:time(local:call(local:count_elements#1, $BigDb)),
prof:time(local:dummy($BigDb))

Output:

3.05 ms
104170.17 ms
134666491 134666491
Erratum: the variable declaration in my previous message should read:

declare variable $BigDb := db:open('MyBigDataBase');
Hello Jean-Marc,

yeah, I can reproduce that. However, I would not consider it a bug but rather a not-so-clever optimizer. The optimized query looks something like this:

declare function local:dummy($db_0) { let $fun_8 := local:count_elements#1 return $fun_8($db_0) };
(prof:time(22383), prof:time(local:dummy(db:open-pre("my-db",0))))

It shows quite nicely that in the second case the function isn't inlined; instead it is fully evaluated, and counting all elements takes some time compared with just looking up the value from the index. So yes, I agree the optimizer could (and probably should) do better, but right now the behavior is not strictly incorrect, just unexpected. I guess Christian will fix this as soon as he is back from Christmas vacation.

Cheers,
Dirk
Hi Jean-Marc,

Dirk has already outlined well what this is about. I have added a new GitHub request [1].

Cheers,
Christian

[1] https://github.com/BaseXdb/basex/issues/1052
Hi Christian,

Thanks a lot.

Jean-Marc
Hi Jean-Marc,

It's not that trivial to optimize your query, but I'd like to let you know that pre-evaluation of function literals works if function declarations are swapped [1].

Christian

[1] https://github.com/BaseXdb/basex/issues/1052
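The message does not spell out which declarations were swapped. As a hedged sketch only, one possible reading is to declare the helper functions before the function that passes the function literal. Both the ordering below and the small in-memory $doc (a hypothetical stand-in for db:open('MyBigDataBase')) are assumptions and are not taken from the linked issue.

```xquery
(: Hypothetical in-memory stand-in for the database: one root element
   with 1000 item children. :)
declare variable $doc := document { <root>{ (1 to 1000) ! <item/> }</root> };

(: Assumed ordering: the helpers are declared first ... :)
declare function local:count_elements($n) {
  count($n/descendant-or-self::element())
};
declare function local:call($fun, $args) { $fun($args) };

(: ... and the function that passes the function literal comes last. :)
declare function local:dummy($n) {
  local:call(local:count_elements#1, $n)
};

(: prof:time is BaseX-specific: it reports the timing and returns the value. :)
prof:time(local:call(local:count_elements#1, $doc)),
prof:time(local:dummy($doc))
```

Both expressions return 1001 (the root element plus 1000 items). An in-memory document has no database statistics to look up, so the large timing gap reported in the thread is not expected to show here; replacing $doc with the real database and inspecting the optimized query is the way to see whether the function literal was pre-evaluated.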
participants (3)
- Christian Grün
- Dirk Kirsten
- jean-marc Mercier