Hi Gioele,
I can confirm that the "following" axis is fairly expensive in BaseX, as we do not store explicit sibling references. The "preceding" axis is cheaper because we can stop the search as soon as we reach the node we started from.
One way out is to first select the candidate nodes in your document (the entries that would follow your start node) and move the check for the preceding start node into a predicate.
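A rough sketch of what such a rewrite could look like, based on the query from your mail. It assumes that the tei prefix is bound to the standard TEI namespace and that the query runs against the opened document; whether it is actually faster on your data is something you would need to measure:

```xquery
(: Assumption: the tei prefix is bound to the standard TEI namespace. :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Select all candidate elements, keep those that have the start node
   somewhere before them, then take the first three in document order. :)
(
  //*[self::tei:entry or self::tei:re]
     [preceding::*[@xml:id = "lemma-aMSa"]]
)[position() <= 3]
```

The outer parentheses matter: without them, the position() <= 3 predicate would be evaluated per step context instead of on the complete result sequence.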
> I also get a warning about «'following::*[(self::tei:entry or self::tei:re)][(fn:position() <= 3)]' will never yield results.» but that is obviously false, as it yields exactly the 3 results I expect.
That's surprising indeed. Yes, feel free to send me your XML document in private.
Hope this helps, Christian
On Thu, Feb 5, 2015 at 2:12 PM, Gioele Barabucci gioele@svario.it wrote:
Hello,
I have noticed that this query using the "following" axis
//*[@xml:id = "lemma-aMSa"] /following::*[self::tei:entry or self::tei:re] [position() <= 3]
is much slower than the same query with the "preceding" axis
//*[@xml:id = "lemma-aMSa"] /preceding::*[self::tei:entry or self::tei:re] [position() <= 3]
The query that uses "preceding" takes about 2.5 ms to execute, while the one using "following" takes about 250 ms: it is 100 times slower.
Why is there such a discrepancy between these two queries?
I can provide the base XML file (19MB) on request.
I also get a warning about «'following::*[(self::tei:entry or self::tei:re)][(fn:position() <= 3)]' will never yield results.» but that is obviously false, as it yields exactly the 3 results I expect.
Regards,
-- Gioele Barabucci gioele@svario.it