Hi Christian,
Thanks for pointing me to "ft:search"; that's much easier for me to understand than using the enforceindex pragma (which yielded 0 matches, btw).
I ended up with something like this:

  declare variable $b := 'Konstanz';
  for $c in ('Korpus01', 'Korpus02')
  for $t in ft:search($c, $b)/parent::*
  return <p>{
    ft:extract($t[./text() contains text {$b}]/text(), 'b', 155)
  }</p>

Searching multiple databases in parallel - 19000 hits in 840ms - very nice!
Thanks again for your patient help,
Matthias
On Thursday, 10 September 2020, 08:30:37 CEST, Christian Grün wrote:
Hi Matthias,
since I "definitely should" build a BaseX database from millions of TEI-XML files, I did so!
Glad to hear!
I modified the XQuery: ... gives results, but takes orders of magnitude longer than for just one database:
If a query is run on a single database, this database will be opened at compile-time, and available indexes will be checked. If the full-text index exists, your query will be rewritten to take advantage of the index structure.
If multiple databases are accessed in an iteration, you can, for example, give the query optimizer a hint that all databases have up-to-date index structures. This can be done with the “enforceindex” pragma [1]:

  declare variable $b := 'Konstanz';
  for $c in ('Korpus01', 'Korpus02')
  for $t in (# db:enforceindex #) {
    db:open($c)//*[./text() contains text {$b}]
  }
  return <p>{
    ft:extract($t[./text() contains text {$b}]/text(), 'b', 155)
  }</p>

If you use the BaseX GUI, you can open the Info View and check the output. If it shows “apply full-text index”, you’ll know that the index is utilized. The Info View also shows the optimized query string, which gives you some hints as to which other optimizations were applied to your input query.

If full-text queries get more complex, it’s sometimes more convenient to use ft:search directly [2], as this function allows you to specify variable arguments, e.g. for wildcard or fuzzy searches.
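
As a rough sketch of what such variable arguments might look like: the query below passes an options map as the third argument of ft:search. The database names and the search term are taken from this thread; the variable names and the choice of the 'fuzzy' option are only illustrative assumptions, and each database is assumed to have a full-text index.

```xquery
(: Sketch only: the options map is an illustrative assumption, not part of the original query. :)
declare variable $term := 'Konstanz';

(: 'fuzzy' enables approximate matching; for wildcard searches, one could
   instead pass map { 'wildcards': true() } with a term such as 'Konstan.*'. :)
declare variable $options := map { 'fuzzy': true() };

for $db in ('Korpus01', 'Korpus02')
(: ft:search returns the matching text nodes; step up to their parent elements :)
for $hit in ft:search($db, $term, $options)/parent::*
return $hit
```

Because the options are an ordinary map, they can be assembled at runtime (for example from external variables), which is harder to do with the static "contains text" syntax.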
Hope this helps,
Christian
[1] https://docs.basex.org/wiki/Indexes#Enforce_Rewritings
[2] https://docs.basex.org/wiki/Full-Text_Module#ft:search