Hi Dimitar,
handler/specification is a text node, not an attribute. That is why I used full-text search only for $hd/specification.
I don't understand why the full-text index is not used here.
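To narrow this down, the full-text condition can be run on its own. This is only a minimal sketch cut down from the query quoted below (the shortened path //handler and the returned value are my simplifications), so that its query info can be compared with that of the complete query:

```xquery
declare default element namespace "http://iso.org/OTX";

(: Only the full-text condition, without the two attribute checks.
   The path //handler is a simplification of the original path. :)
for $hd in collection()//handler
where $hd/specification contains text "Specification"
return string($hd/@id)
```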
Greetings, An
On Sun, Nov 27, 2011 at 8:49 PM, Dimitar Popov <Dimitar.Popov@uni-konstanz.de> wrote:
Hi An,
thank you for the provided data and sample query. Please see my comments below.
On Sunday, 27 November 2011, 17:33:00, Truong An Nguyen wrote:
declare default element namespace "http://iso.org/OTX";
for $pro in collection()/otx/procedures/procedure
return for $hd in $pro/realisation/flow//handler
  where exists($hd/@*[contains(data(.),"Variable1")])
     or exists($hd/realisation/catch/exception//@*[contains(data(.),"Variable1")])
     or $hd/specification contains text "Specification"
     (: or exists ($hd/specification[contains(data(.),"Specification")] ):)
  return concat(data($pro/../../@package),":",data($pro/../../@name),":",data($pro/@name),":","handler",":",$hd/@id)
The variant with "contains text" ran much slower than the variant with
"contains".
Hm, on my computer the difference is not huge (1307.42 ms for fn:contains() vs. 1446.64 ms for "contains text"), but, yes, "slow" is a relative term :)
Anyway, the difference arises because fn:contains() does a simple substring search, whereas "contains text" offers more advanced options such as case insensitivity, stemming, stop words, etc. Thus, when the full-text index is not used, both the query string and the matched string need some additional processing, which results in slower performance.
The following indexes are used: path, text index, attribute index, full-text index
(without any options)
With the provided query, the full-text index is not used. The reason is that the full-text index in BaseX does not cover the string values of attributes, i.e. only text nodes are indexed.
I don't know what the query should do, but please note the different behavior of fn:contains() and contains text. Just a quick example:
fn:contains('GlobalDocumentVariable1_String', 'Variable1') -> true
'GlobalDocumentVariable1_String' contains text 'Variable1' -> false
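The two expressions above can be tried as one small query. The third expression is my own addition and only a sketch: it assumes that the "using wildcards" match option of XQuery Full Text is acceptable for your use case, and it shows how a token that merely contains 'Variable1' could be targeted with a pattern.

```xquery
(: Substring search: 'Variable1' occurs inside the string, so this is true. :)
contains('GlobalDocumentVariable1_String', 'Variable1'),

(: Token-based search: no token of the string equals 'Variable1',
   so this is false. :)
'GlobalDocumentVariable1_String' contains text 'Variable1',

(: Assumption, not part of the original query: a wildcard pattern that
   allows arbitrary characters before and after 'Variable1' in a token. :)
'GlobalDocumentVariable1_String' contains text '.*Variable1.*' using wildcards
```

Whether the wildcard variant is an option depends on what the query should find. It is still a token-based search, so it does not behave exactly like fn:contains().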
Further, one small optimization would be to remove the data() function call in the predicates, i.e.
$hd/realisation/catch/exception//@*[contains(.,"Variable1")]
is enough.
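Applied to the whole query, this could look as follows. It is only a sketch of the same query with the data() calls dropped. It relies on the arguments of fn:contains() and fn:concat() being atomized automatically, and I flattened the nested FLWOR into two for clauses for readability.

```xquery
declare default element namespace "http://iso.org/OTX";

for $pro in collection()/otx/procedures/procedure
for $hd in $pro/realisation/flow//handler
(: contains() atomizes the context attribute itself, so data(.) is not needed :)
where exists($hd/@*[contains(., "Variable1")])
   or exists($hd/realisation/catch/exception//@*[contains(., "Variable1")])
   or $hd/specification contains text "Specification"
(: concat() atomizes its arguments as well :)
return concat($pro/../../@package, ":", $pro/../../@name, ":",
              $pro/@name, ":", "handler", ":", $hd/@id)
```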
I hope this helps.
Greetings,
Dimitar