Hi Vladimir,
But if I search in "db:list...db:open..." - it takes about 12-15 seconds.
If the name of the database is not statically known, the query cannot be rewritten for index access (because one of the targeted databases may not have the required index). I guess you have the full-text index enabled?
However, since BaseX 9, you can take advantage of the ENFORCEINDEX option: all queries will then be optimized for index operations, based on your knowledge that an index will be available. See [1] for further details.
By the way, you can have a look at the compilation section of the Info View in the GUI to see if indexes will be applied in your query.
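For illustration, here is a minimal sketch of how the option could be applied to the query from this thread. It assumes BaseX 9 or later and that every database matching the prefix really has a full-text index; the pragma form follows the documentation in [1]. Whether the rewriting is actually applied to your data is best checked in the Info View.

```xquery
(: Sketch only: the pragma asks the optimizer to rewrite the path for index
   access even though the database names are not statically known. :)
(# db:enforceindex #) {
  for $name in db:list()[starts-with(., '000999~')]
  for $doc in db:open($name)//*[text() contains text { 'TEN-9258' } any]
  return $doc
}
```

If one of the opened databases has no full-text index, this query may fail or return incomplete results, so the option should only be used when the indexes are known to exist.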
Best, Christian
[1] http://docs.basex.org/wiki/Indexes#Enforce_Rewritings
Example takes ~12-15s:

  let $db :=
    for $i in db:list()[starts-with(., '000999~')]
    return try { db:open($i) } catch * {}
  for $doc in $db/.//*[text() contains text { 'TEN-9258' } any]
  return $doc

Example takes ~180ms (returns 2 rows):

  let $db :=
    for $i in db:list()[starts-with(., '000999~201807')]
    return db:open($i)
  for $doc in $db/.//*[text() contains text { 'TEN-9258' } any]
  return $doc

Example takes ~10ms (returns 2 rows):

  for $doc in db:open('000999~201807')/.//*[text() contains text { 'TEN-9258' } any]
  return $doc

Why do the last 2 examples take different times? How can I improve this?

Example takes ~2s (returns 0 rows):

  let $db :=
    for $i in db:list()[starts-with(., '000999~201806')]
    return db:open($i)
  for $doc in $db/.//*[text() contains text { 'TEN-9258' } any]
  return $doc

Example takes ~12ms (returns 0 rows):

  for $doc in db:open('000999~201806')/.//*[text() contains text { 'TEN-9258' } any]
  return $doc
25.06.2018, 13:07, "Alexander Shpack" shadowkin@gmail.com:
Hi, Vladimir,
If you give your database names a particular prefix, for example "db_", you can use the following code:
let $docs := for $i in db:list()[starts-with(., "db_")] return db:open($i) return $docs/*
On Mon, Jun 25, 2018 at 12:32 PM Ветошкин Владимир en-trance@yandex.ru wrote:
Hi, Alexander,
Some questions: After that, how can I perform a search in all of these databases? Can I search for a substring without full-text, using only the text index?
25.06.2018, 11:56, "Alexander Shpack" shadowkin@gmail.com:
Hey Vladimir,
You can use a sharding approach for your data import and split the data into separate DBs, even one per month.
On Mon, Jun 25, 2018 at 11:50 AM Ветошкин Владимир en-trance@yandex.ru wrote:
Hi, Alexander! Thank you!
In my previous letter I described the process only briefly. I'll think about a separate DB. But I'm afraid that this database will also become very big in the future. Although I can try to split the data into several databases, one per year... Hmm...
25.06.2018, 11:25, "Alexander Shpack" shadowkin@gmail.com:
Hey, Vladimir!
Just put these specific files into a separate DB and then index it. You can do this automatically: BaseX allows you to create and index a DB right from XQuery.
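As a rough sketch of what this could look like, the following call creates a database from a single file and builds the full-text index in the same step. The database name and the file path are hypothetical placeholders; db:create and the ftindex option are part of the BaseX Database Module.

```xquery
(: Hypothetical database name and file path; adjust to your setup. :)
db:create(
  'specific-files',              (: name of the new, separate database :)
  '/path/to/import/report.xml',  (: input file to be imported :)
  'report.xml',                  (: path of the document inside the database :)
  map { 'ftindex': true() }      (: build the full-text index on creation :)
)
```

After later imports into this database, its indexes could be rebuilt with db:optimize('specific-files', true(), map { 'ftindex': true() }), which should be much cheaper than reindexing one large database.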
I hope it helps you. Anyhow, you can provide more details about your task and we can figure out the best solution for you.
On Mon, Jun 25, 2018 at 10:42 AM Ветошкин Владимир en-trance@yandex.ru wrote:
Hi, Fabrice! Thank you.
All databases change constantly. That is why there is no way to single out "a big readonly collection" :( Maybe it is possible to use some other incremental indexes? I have to index specific XML files, not all files in the database.
21.06.2018, 17:16, "Fabrice ETANCHAUD" fetanchaud@pch.cerfrance.fr:
Hi Vladimir,
I don’t think there is anything like an incremental full-text index at the moment [1].
As the index is per collection, the recommended way would be to split your data into two collections:
- A big readonly collection of all the past updates, indexed once
- A small/medium-sized collection whose full-text index can be recreated in an acceptable time after each update.
At the end of a predefined time period, you have to add the live collection to the readonly one, reindex it, and truncate the live one.
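The roll-over step described above could be sketched as follows. The database names 'archive' and 'live' are hypothetical, and the sketch assumes both databases already exist. It copies every document from the live collection into the readonly one and then rebuilds the full-text index of the archive.

```xquery
(: Hypothetical database names; both are assumed to exist already. :)
let $archive := 'archive'
let $live    := 'live'
return (
  (: copy each live document into the archive under its original path :)
  for $doc in db:open($live)
  return db:add($archive, $doc, db:path($doc)),
  (: rebuild all index structures of the archive, including full-text :)
  db:optimize($archive, true(), map { 'ftindex': true() })
)
```

Emptying the live collection is left out here on purpose; it would be done in a separate query once the archive has been updated successfully.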
Best regards from France,
Fabrice Etanchaud
[1] http://docs.basex.org/wiki/Indexes#Updates
From: BaseX-Talk [mailto:basex-talk-bounces@mailman.uni-konstanz.de] On behalf of Ветошкин Владимир Sent: Thursday, 21 June 2018 16:02 To: BaseX Subject: [basex-talk] Full-Text
Hi, everyone!
Is there any way to index only imported xml-files?
Currently, when I import XML files, the full-text index is deleted.
After importing, I recreate the whole full-text index, and it takes too much time :(
--
Best regards,
Ветошкин Владимир Владимирович
-- s0rr0w