Hello Andreas,

Was this database renaming and creation triggered by the client, or done autonomously by the server? The latter would require some scheduling feature, so that the database server could be configured to perform such actions according to a schedule. I think this would be a very desirable feature, especially as long as high-frequency storage and querying cannot be done well concurrently. In any case, do you see a way to implement this periodic rename-and-create behaviour in a multi-client environment?

I know very little myself about the querying tasks to expect, except for the need to group messages by transaction ID and then somehow evaluate the execution of the transactions. But at the moment I suppose that your "monster document" approach might be appropriate in our case, too.
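To make the grouping step concrete, here is a minimal sketch in plain Python, outside BaseX. The element name `msg`, the attributes `txid` and `status`, and the rule that a transaction is complete once a `commit` message appears are all assumptions made for illustration; the real message format is not known yet.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical log document: element and attribute names are assumptions.
LOG = """<log>
  <msg txid="t1" status="start"/>
  <msg txid="t2" status="start"/>
  <msg txid="t1" status="commit"/>
</log>"""


def group_by_transaction(xml_text):
    """Map each transaction ID to its messages, in document order."""
    groups = defaultdict(list)
    for msg in ET.fromstring(xml_text).iter("msg"):
        groups[msg.get("txid")].append(msg)
    return dict(groups)


def is_complete(messages):
    """Toy evaluation: complete if any message reports 'commit'."""
    return any(m.get("status") == "commit" for m in messages)


groups = group_by_transaction(LOG)
summary = {txid: is_complete(msgs) for txid, msgs in groups.items()}
```

With the sample above, `summary` marks `t1` as complete and `t2` as still open. Inside the database the same grouping would presumably be written as a query rather than in client code.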

Kind regards,
Hans-Juergen


From: Andreas Weiler <andreas.weiler@uni-konstanz.de>
To: Hans-Juergen Rennau <hrennau@yahoo.de>
CC: Base X <basex-talk@mailman.uni-konstanz.de>
Sent: Wednesday, 4 July 2012, 9:45
Subject: Re: [basex-talk] BaseX as a log msg store?

Hello Hans-Juergen,

So my understanding is that the messages are inserted as child elements into this root element - and the end result is one document with one root element and millions of child elements representing the individual messages, yes?

Yes, that is correct. I have one root element at the beginning and insert the incoming items as child nodes of the root.

Therefore you do not have to come up with URIs, as there is only one single document. A monster document, but I conclude from your approach that this is no problem, and not worse (or even better) than having a million individual, small documents. Is that correct? Would you recommend storing the messages in one single document?

In my use case, tweets have unique id attributes, so I don't need any URIs to identify them. It would probably help if you described your intended querying process, so that it is easier to understand what you want to do.

If the loading process cannot run concurrently with queries, would there be any way to periodically "shift" packages of messages into a "read only" database? Or perhaps better the other way around: let the server periodically interrupt its loading activity, close the database, rename it, open and initialize a new database, and then continue to load? Or is there presently simply no solution available?

That's exactly what I do every hour. I rename the current database to the current date_hour and create a new database for the next incoming items. Shifting is not really an alternative, because it would probably take too long to insert the items into a second database and delete them from the "main" database.
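The hourly rotation could be driven from the client roughly as sketched below. This is only an illustration under assumptions: the database name `tweets`, the `<root/>` initial document and the exact `date_hour` name format are made up, and the sketch only builds the BaseX command strings (`ALTER DB` to rename, `CREATE DB` to start a fresh database). Sending them to the server, and pausing the loader while they run, is left out.

```python
from datetime import datetime, timedelta


def archive_name(prefix, now):
    """Build a date_hour name such as 'tweets_2012-07-04_09' (format is an assumption)."""
    return f"{prefix}_{now:%Y-%m-%d_%H}"


def rotation_commands(active_db, now, root="<root/>"):
    """Return the two commands for one rotation: rename, then recreate."""
    return [
        f"ALTER DB {active_db} {archive_name(active_db, now)}",
        f"CREATE DB {active_db} {root}",
    ]


def seconds_until_next_hour(now):
    """How long a client-side scheduler would sleep before the next rotation."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return (next_hour - now).total_seconds()


now = datetime(2012, 7, 4, 9, 45)
commands = rotation_commands("tweets", now)
wait = seconds_until_next_hour(now)
```

In a multi-client setup, one designated client would have to issue these commands, since the rename will fail while other clients still hold the database open.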

Kind regards,
Andreas