Hi Christian,

I wrote a query for a customer who wants to analyze their legacy ISO 12083 math formulas, in this case for detecting multiple subsequent <roman> elements of length >= 3 with only whitespace in between.

This is a synthetic test document:

    <doc>
      <p>
        <formula><roman>tan</roman> <roman>tan</roman></formula>
      </p>
      <p>
        <formula><roman>sin</roman> <sup>2</sup> <roman>sin</roman></formula>
      </p>
      <p>
        <formula><roman>cos</roman><sup>3</sup> <roman>cos</roman></formula>
      </p>
    </doc>

And this is the query I wrote:

    let $rms := //(formula | dformula)//roman[string-length() gt 2]
                  [ following-sibling::node()[1]/self::text()[not(normalize-space())] ]
                  [ following-sibling::*[1]/self::roman[string-length() gt 2] ]/..,
        $docs := for $rm-context in $rms
                 let $path := db:path($rm-context)
                 group by $path
                 return <doc path="{$path}">{ $rm-context }</doc>
    return <result count="{count($rms)}" docs="{count($docs)}">{ $docs }</result>

BaseX (up to version 9.6.3) erroneously reports all three <formula> elements as a result, while only the first should be reported.

This can be remedied by using parentheses, as in (following-sibling::node())[1]/self::text() and (following-sibling::*)[1]/self::roman. But this is inefficient, and the original query should just work™.

In the optimized original query there is following-sibling::text()[fn:position() = 1] and following-sibling::roman[fn:position() = 1]. These are incorrect optimizations of following-sibling::node()[1]/self::text() and following-sibling::*[1]/self::roman.

Gerrit

-- 
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@le-tex.de, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer / Managing Directors:
Gerrit Imsieke, Svea Jelonek, Thomas Schmidt
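
P.S. For anyone who wants to check the expected semantics without setting up a database, the following is a minimal standalone sketch of the same predicates, not the customer query itself. It assumes the test document is built inline, so db:path and the grouping are left out, and it assumes "declare boundary-space preserve" is needed so that the whitespace-only text nodes between the elements survive in the constructed document. Whether the faulty rewrite also triggers on constructed nodes is not something I have verified.

```xquery
(: Assumption: keep whitespace-only text nodes in the direct constructors,
   otherwise there is no text node between the <roman> elements at all. :)
declare boundary-space preserve;

let $doc :=
  <doc>
    <p><formula><roman>tan</roman> <roman>tan</roman></formula></p>
    <p><formula><roman>sin</roman> <sup>2</sup> <roman>sin</roman></formula></p>
    <p><formula><roman>cos</roman><sup>3</sup> <roman>cos</roman></formula></p>
  </doc>
return
  $doc//(formula | dformula)//roman[string-length() gt 2]
    (: the very next sibling node must be a whitespace-only text node :)
    [ following-sibling::node()[1]/self::text()[not(normalize-space())] ]
    (: the very next sibling element must be a <roman> of length >= 3 :)
    [ following-sibling::*[1]/self::roman[string-length() gt 2] ]
    /..
(: Per XPath semantics, only the first <formula> (tan tan) qualifies:
   in the second, the next element is <sup>; in the third, the next node is <sup>. :)
```

The point of the sketch is that node()[1]/self::text() asks whether the first following sibling is a text node, which is different from asking for the first following text sibling, text()[1].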