Hello all,
for the first time I have encountered a scenario that follows an
"insertion-intensive" pattern: a lot of documents are stored in a
database at a rather fast pace.
With the computing resources I currently have, the process slows down
rather quickly, since the indexes (created with the "create index"
command) get invalidated immediately.
I moved to the UPDINDEX option, which alleviated the slowdown but
quickly ran my "small" system into disk space issues, since the data
files grow at a much higher rate (5.8 GB after one day of continuous work).
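
For context, the relevant part of my setup is roughly this (the
database name is a placeholder, and this is only a sketch of what I am
actually doing):

    SET UPDINDEX true
    CREATE DB mydb
    CREATE INDEX text
    CREATE INDEX attribute

i.e. the value indexes created with "create index", now on a database
with incremental index updates enabled.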
I can fix the disk space growth by running the "optimize all" command.
I was not very comfortable with this command, so I stopped everything
and did a cold run. The result was surprisingly good: disk usage
dropped to 1% of what it was before the optimize, and the run itself
took very little time compared to the size of the input data.
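
For the record, the cold run was essentially nothing more than this
(again, the database name is a placeholder):

    OPEN mydb
    OPTIMIZE ALL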
My first question is whether running the optimize command regularly
(maybe every hour or so), without stopping the flow of incoming
documents, is a safe way to go, or whether it will somehow impede the
workload or introduce any risk to data integrity.
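
Concretely, I am thinking of having something like this run every hour
(from cron or similar), through a client connection to the running
server while documents keep being added; the database name is again a
placeholder:

    db:optimize("mydb", true())

which, if I understand correctly, is the query-level equivalent of
OPTIMIZE ALL.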
Secondly, to reduce the load on a single database (and thus the
frequency of optimize calls), I was planning to split my data, moving
the less important documents to a different database on the same
server. Would this be an effective solution in this scenario?
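
To make the second idea concrete, here is a minimal sketch of the
routing I have in mind (the database names and the $important flag are
invented; the real decision would be based on my own metadata):

    declare variable $doc       as document-node() external;
    declare variable $path      as xs:string       external;
    declare variable $important as xs:boolean      external;

    (: less important documents go to a second database on the same server :)
    if ($important)
    then db:add("important-db", $doc, $path)
    else db:add("other-db", $doc, $path)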
Sorry for the possibly naive questions, but I'm really not an expert
in these quantitative aspects.
Thanks and regards,
Marco.