2011/2/27 Christian Grün <christian.gruen@gmail.com>

Hi Phil,

> declare default element namespace
> "http://www.mediawiki.org/xml/export-0.4/";
> //siteinfo

If you know that this node will occur only once, the most efficient
option will be to use a positional predicate:

( //*:siteinfo ) [1]

But you may be surprised that the following query is evaluated very quickly:

count(//*:siteinfo)

This means that the path index does indeed have enough information to
allow for a faster evaluation: we're not saving direct references to the
target nodes (as such an index would get very large for e.g. the
Wikipedia page element), but we're saving the number of occurrences of
each distinct node path. As a result, we could rewrite your query into

( //*:siteinfo ) [position() <= 1]

We haven't included this optimization yet, as the additional predicate
may slow down other queries; but in your case, it would clearly reduce
the evaluation time to a few milliseconds (if that). I have added
a GitHub issue to keep track of your suggestion:

https://github.com/BaseXdb/basex/issues#issue/29
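
A rough sketch of the idea in Python (this is not BaseX code; the
function names and the tiny sample document are made up for
illustration): a path summary stores only a counter per distinct path,
which is enough to answer count() directly and to tell a scan how many
hits to collect before it can stop.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import islice

def local_name(tag):
    """Strip a leading '{uri}' namespace part from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]

def build_path_summary(root):
    """Count how many nodes share each distinct root-to-node path."""
    summary = Counter()
    stack = [(root, "")]
    while stack:
        node, parent_path = stack.pop()
        path = parent_path + "/" + local_name(node.tag)
        summary[path] += 1
        stack.extend((child, path) for child in node)
    return summary

def count_by_name(summary, name):
    """Answer count(//*:name) from the summary alone, without visiting nodes."""
    return sum(n for path, n in summary.items()
               if path.rsplit("/", 1)[-1] == name)

def first_n(root, name, n):
    """Scan in document order and stop as soon as n matches have been found."""
    matches = (e for e in root.iter() if local_name(e.tag) == name)
    return list(islice(matches, n))

# Hypothetical miniature of a MediaWiki export.
doc = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <siteinfo><sitename>Wikipedia</sitename></siteinfo>
  <page><title>A</title></page>
  <page><title>B</title></page>
</mediawiki>"""

root = ET.fromstring(doc)
summary = build_path_summary(root)

# The summary knows how many siteinfo nodes exist, so the scan
# can use that number as its limit (like [position() <= 1] above).
limit = count_by_name(summary, "siteinfo")
hits = first_n(root, "siteinfo", limit)
```

The summary stays small because it grows with the number of distinct
paths, not with the number of nodes; the price is that the nodes
themselves still have to be located by a scan.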

> While personally I very much dislike namespaces, they are common,
> and they have to be efficiently handled.

Namespaces, a great topic... It's true that name tests with prefixes
are evaluated more slowly than name tests without prefixes (i.e., with
prefix wildcards). This is something most XQuery implementations suffer
from, as the complex nature of namespaces does not allow simple reference
checks. Indeed, most members of the W3C XML Query Working Group regret
that namespaces were not specified in a much simpler way; due to all the
legacy issues, history cannot be reversed in that respect.

Still, I was surprised to see that your query took nearly twice as
long as the one without namespaces; I'd have expected a
slowdown of maybe 10-15%. To conclude: if you want faster
queries, you should declare global namespaces, or simply use
wildcards.
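
To show the difference between the two kinds of name tests outside of
BaseX, here is a small sketch with Python's xml.etree.ElementTree (the
sample document is made up, and the "{*}" wildcard syntax needs
Python 3.8 or later). It only illustrates the semantics, not the
timings discussed above:

```python
import xml.etree.ElementTree as ET

NS = "http://www.mediawiki.org/xml/export-0.4/"
doc = f'<mediawiki xmlns="{NS}"><siteinfo/><page/></mediawiki>'
root = ET.fromstring(doc)

# Qualified name test: namespace URI and local name must both match.
qualified = root.findall(f".//{{{NS}}}siteinfo")

# Wildcard name test, comparable to *:siteinfo: only the local name counts.
wildcard = root.findall(".//{*}siteinfo")

# A plain name test without any namespace does not match namespaced elements.
unqualified = root.findall(".//siteinfo")
```

Note that a wildcard test also matches elements with the same local
name in any other namespace, so it is only equivalent if that name is
not reused elsewhere in the document.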

Hope this helps,
Christian

Ing. Jan Vlčinský
CAD programy
Slunečnicová 338/3, 734 01 Karviná Ráj, Czech Republic
tel: +420-597 602 024; mob: +420-608 979 040
skype: janvlcinsky; GoogleTalk: jan.vlcinsky@gmail.com
http://cz.linkedin.com/in/vlcinsky