Hello Fabrice
Yes, I have a query that jumped from ~1500 ms to about a minute after a tiny change...
The DB is about 2GB and it is my test set before putting the query to work on the full dataset.
The change was to go from simply returning the nodes themselves, with a `return thisNode | thatNode | theOtherNode`, to a "formatted" document that has an outer <collection> with a number of `return <item>{thisNode | thatNode | theOtherNode}</item>` inside it.
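For concreteness, the two shapes look roughly like the sketch below. This is only an illustration: the inline `$doc` data and the element names `record`, `a`, `b` and `c` are placeholders, not the real query or dataset.

```xquery
(: Placeholder data; "record", "a", "b" and "c" are hypothetical names. :)
let $doc :=
  <root>
    <record><a>1</a><b>2</b><c>3</c></record>
    <record><a>4</a><b>5</b><c>6</c></record>
  </root>
return (
  (: Shape 1: return the selected nodes themselves. :)
  for $r in $doc/record
  return $r/a | $r/b | $r/c,

  (: Shape 2: wrap each group in <item>, all inside one <collection>. :)
  <collection>{
    for $r in $doc/record
    return <item>{ $r/a | $r/b | $r/c }</item>
  }</collection>
)
```

In XQuery, an element constructor copies the nodes placed in its content, so the second shape builds new nodes instead of returning the existing ones. Whether that accounts for the slowdown seen here is not established.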
I understand that the new query might be creating some new elements, but compared to the element content, these few extra characters are not THAT many more.
The query jumps from ~1500 ms when using plain XML, to ~55000 ms with the addition of the collection and item nodes, to ~57000 ms with the addition of CSV export via the CSV module. These are "informal average" values: I have not run the same query several times and computed an average, but that is the vicinity of the numbers I have seen in the runs so far.
The database itself is "static", there are no update/insert transactions at the moment, the only thing that I am trying to do is extract some data in a different format from it.
I have Text, Attribute and Token indexes on that database (optimised right after importing) but no further options enabled. I also have not experimented with the SPLITSIZE (?). I have 32GB of memory, which should be enough to handle this 2GB test dataset (?). I will have a go with DEBUG on.
Did you have to enable any additional options for indexes to work faster?
All the best
-----Original Message-----
From: basex-talk-bounces@mailman.uni-konstanz.de [mailto:basex-talk-bounces@mailman.uni-konstanz.de] On Behalf Of Fabrice ETANCHAUD
Sent: 15 September 2017 13:27
To: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Basex Inner Workings
Hi Athanasios,
Did you experience slow queries? Are you sure you are using all the index features? Are these operational queries (direct access on a key value) or analytical ones?
I have never experienced slow queries, even on huge XML corpora (patent registrations), but this comes at the cost of longer indexing times on updates.
Best regards,
-----Original Message-----
From: basex-talk-bounces@mailman.uni-konstanz.de [mailto:basex-talk-bounces@mailman.uni-konstanz.de] On Behalf Of Anastasiou A.
Sent: Friday, 15 September 2017 14:01
To: basex-talk@mailman.uni-konstanz.de
Subject: [basex-talk] Basex Inner Workings
Hello everyone
Quick question: Is there any document / URL where I could find out more about how BaseX accesses the disk during its operation?
For example, are there any reads to be expected while executing a query?
Through iotop, I can see 3-4 processes reading during startup, then another 2 firing very briefly when opening the database, and then during querying there are periodic writes (?) of very brief duration.
I was wondering if there is anything that could be done on the hardware side to speed up queries (?) (other than getting a more powerful machine, which is not an option at the moment).
All the best
Athanasios Anastasiou