Re: [basex-talk] Improving performance in a 40GB database

5 Jul 2016

Hi James,
...
Individual OCR'd words on pages maybe comprise around 85% of the data - and I don't actually care about this data. So maybe if I just don't load these OCR'd words it will help? I haven't tried that yet, but ideally I'd like not to have to do it.
Some (more or less obvious) questions back:

* How large is the resulting XML document (around 15% of the original document)?
* How do you measure the time?
* Do you store the result on disk?
* How long does your query take if you wrap it in a count(...) or prof:void(...) call?
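
To illustrate the last question, here is a minimal sketch of what the wrapped query could look like. The database name 'mydb' and the path //page are placeholders and not taken from the actual setup; substitute the real query. Wrapping it in count(...) still evaluates the query, but only a single number is returned instead of the full result:

```xquery
(: 'mydb' and //page are hypothetical placeholders for the real database and query :)
count(collection('mydb')//page)
```

The prof:void(...) variant named above discards the result entirely. Its availability may depend on the BaseX version in use:

```xquery
(: same hypothetical query; the result is evaluated but not returned :)
prof:void(collection('mydb')//page)
```

If either variant runs much faster than the original query, that would suggest the time is going into returning or serializing the result rather than into evaluating the query. For a very simple path, the optimizer may be able to shortcut a count, so the two variants are worth comparing with each other as well.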

Thanks in advance,
Christian

Christian Grün