Hi Christian,
Thank you for this - looks very promising.
I was also having a think and wondered whether, assuming a full fix is difficult, a special optimising function would be fast and easy to add. Instead of rebuilding the index content by reading the database, it would just rewrite the index files without the free space - rather like a disk defragmenter. Users could then choose the best time to run the function (after every transaction if they so chose) and wouldn’t need to rebuild the index just to regain disk space.
The ‘current’ index could still be used for read operations during the defragmentation so I think you’d just need a database write lock for the period while the new file was created and written. What I don’t know is how long optimising the file would take versus the time to reindex using OPTIMIZE but I would think that for larger indexes it could be a good time saving. I also don’t know the interaction between memory and the copy of the file on disk - I guess we’d have to replace what’s in memory as well as the file.
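A minimal Python sketch of the defragmentation idea, under loudly stated assumptions: the "index file" is modelled as a byte string plus a directory mapping each key to an (offset, length) pair. This layout, and the names compact, heap and directory, are hypothetical and are not the actual BaseX index format. The point is only that live entries can be copied into a fresh file without reading the database, while the old bytes stay untouched and readable until the swap.

```python
from typing import Dict, Tuple

# Hypothetical directory: key -> (offset, length) of its entry in the heap
Directory = Dict[str, Tuple[int, int]]


def compact(heap: bytes, directory: Directory) -> Tuple[bytes, Directory]:
    """Copy only the live entries into a new buffer, closing all gaps.

    The input heap is never modified, so readers could keep using it
    until the caller swaps in the returned buffer and directory.
    """
    out = bytearray()
    new_directory: Directory = {}
    # Visit entries in file order so the old file is read sequentially
    for key, (offset, length) in sorted(directory.items(), key=lambda kv: kv[1][0]):
        new_directory[key] = (len(out), length)
        out += heap[offset:offset + length]
    return bytes(out), new_directory


# Three live entries separated by unused bytes (12 bytes in total)
heap = b"AAAA" + b"\x00\x00" + b"BBB" + b"\x00" + b"CC"
directory = {"a": (0, 4), "b": (6, 3), "c": (10, 2)}
new_heap, new_directory = compact(heap, directory)
print(len(heap), len(new_heap))
```

In this toy model the cost is proportional to the size of the live data, not to the size of the database. Whether that holds for a real index depends on how offsets are stored and cached in memory, which the sketch does not address.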
I was going to make up a proof of concept but I’m sorry I haven’t had time yet. I wonder if I could do it in XQuery.. :)
Do let me know if I can help testing any snapshots or similar.
Regards, James
On 30 Jul 2014, at 14:44, Christian Grün christian.gruen@gmail.com wrote:
Hi James,
I had some first thoughts on possible optimizations for the increasing file size problem, and I may have found a fairly easy solution that covers some of the current problems. It's not implemented yet, but I could at least fix the initial 4096-byte problem [1].
I'll keep you updated, Christian
[1] https://github.com/BaseXdb/basex/issues/970
On Sat, Jul 19, 2014 at 12:06 PM, Christian Grün christian.gruen@gmail.com wrote:
Hi James,
However, the behaviour is different when using db:replace. I think it's doing a db:delete() and then a db:add(). First, the ID list for that attribute value is rewritten in place in the index file: the count drops (from 2048 to 2047, for example) and only the IDs that remain after the replaced document is removed are written back. The now unused bytes at the end are left with their previous values. Then a completely new ID list is written to the end of the file (with the count back up to 2048, for example) as the replacement attribute is added.
That's a good hint, and (as you already guessed) it's due to the current semantics of our replace operation [1]. As a replaced document may contain a completely different structure and contents, it would probably be tricky to replace ID lists on a lower level (instead of deleting and adding them). One plan to solve the issues could be a data structure that remembers free slots in the heap file, which can later be filled up with new entries.
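As a rough illustration of a structure that remembers free slots, here is a minimal Python sketch. Everything in it is an assumption made for the example: the class name SlotHeap, the first-fit strategy, and the in-memory bytearray standing in for the heap file. It is not how BaseX stores its ID lists, and it leaves out merging of adjacent free slots and persistence of the free list.

```python
from typing import List, Tuple


class SlotHeap:
    """Toy in-memory heap file that remembers freed slots for reuse."""

    def __init__(self) -> None:
        self.data = bytearray()
        # (offset, length) of regions that no longer hold live data
        self.free: List[Tuple[int, int]] = []

    def write(self, payload: bytes) -> int:
        """Store payload and return its offset, reusing a free slot if one fits."""
        size = len(payload)
        for i, (offset, length) in enumerate(self.free):
            if length >= size:
                # First fit: use the front of the slot, keep any remainder free
                if length > size:
                    self.free[i] = (offset + size, length - size)
                else:
                    del self.free[i]
                self.data[offset:offset + size] = payload
                return offset
        # No free slot is large enough: append to the end of the file
        offset = len(self.data)
        self.data += payload
        return offset

    def release(self, offset: int, length: int) -> None:
        """Mark a region as reusable; its old bytes are left in place."""
        self.free.append((offset, length))


heap = SlotHeap()
first = heap.write(b"AAAA")
heap.write(b"BBBB")
heap.release(first, 4)       # e.g. the entry was deleted
reused = heap.write(b"CC")   # fits into the freed slot instead of growing the file
print(reused, len(heap.data), heap.free)
```

With a delete followed by an add, as in the replace case described above, the new entry can land in the slot that was just released, so the file only grows when no remembered slot is large enough.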
[As a note: there seems to be a small bug when UPDINDEX is true, in that an index file is always at least 4096 bytes. When an empty database is created, the index file will be 4096 zero bytes, with updates appended to the end. Even if you optimize, the file will be padded to 4096 bytes with zeros.]
Thanks, I will remember that. Maybe the minimum of 4096 bytes will stay, but it should definitely be overwritten from the very beginning when new data is inserted.
I'd love to be able to do everything with UPDINDEX set to true and just forget about it.
Me too ;) Let's see when it can be done.
How fixed is the index file format? I ask because I've spent some time understanding how it works so I can read the files and see exactly what's in them. If it would be useful then I'm happy to put the information into the wiki somewhere to make it quicker for anyone else who's interested. However if you want to keep the structure obscure for any reason then I won't publish anything. Let me know.
Thanks, contributions like that are always appreciated! The storage structure is supposed to be open to everyone. I guess you have already stumbled upon [3] and [4]; all edits are welcome, and may motivate others to think about better solutions.
Christian
[1] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/ba...
[2] https://github.com/BaseXdb/basex/issues/970
[3] http://docs.basex.org/wiki/Storage_Layout
[4] http://docs.basex.org/wiki/Node_Storage