Hello Hans-Juergen,
So my understanding is that the messages are inserted as child elements into this root element - and the end result is one document with one root element and millions of child elements representing the individual messages, yes?
Yes, that is correct. I have one root element at the beginning and insert the incoming items as child nodes of the root.
Therefore you do not have to come up with URIs, as there is only one single document. A monster document, but I conclude from your approach that this is no problem, and not worse (or even better) than having a million individual, small documents. Is that correct - would you recommend storing the messages in one single document?
In my use case, tweets have unique id attributes, so I don't need any URIs to identify them. It would probably help if you described your further querying process, so that it is easier to understand what you want to do.
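To illustrate the layout described above (one root element, incoming items appended as children, lookup by a unique id attribute), here is a minimal in-memory sketch using Python's standard xml.etree.ElementTree. The element names "tweets" and "tweet" and the helper functions are assumptions made for illustration only; this is not the actual database setup.

```python
import xml.etree.ElementTree as ET

# Hypothetical root element; all incoming items become its children.
root = ET.Element("tweets")

def add_tweet(root, tweet_id, text):
    """Append one item as a child of the root, keyed by its id attribute."""
    item = ET.SubElement(root, "tweet", {"id": tweet_id})
    item.text = text
    return item

def find_tweet(root, tweet_id):
    """Look up an item by its unique id attribute; returns None if absent."""
    return root.find(f"./tweet[@id='{tweet_id}']")

add_tweet(root, "1001", "first message")
add_tweet(root, "1002", "second message")
```

Because every item carries its own id, no per-document URI is needed to address it: find_tweet(root, "1002") returns the second element.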
If the loading process cannot run concurrently with queries - would there be any way to periodically "shift" packages of messages into a "read only" database? Or perhaps better the other way around: let the server periodically interrupt its loading activity, close the database, rename it, open and initialize a new one, and then continue to load? Or is there presently simply no solution available?
That's exactly what I do after each hour: I rename the current db with the current date_hour and create a new database for the next incoming items. Shifting is not really an alternative, because it would probably take too long to insert the items into a second database and delete them from the "main" database.
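The hourly rotation could be sketched roughly as follows. This is only an outline under stated assumptions: the name format, the active database name "current", and the rename_db / create_db callables are hypothetical placeholders for whatever commands the database client actually provides, and the sketch names the archive after the hour in which the database was opened.

```python
from datetime import datetime

def archive_name(ts: datetime) -> str:
    """date_hour label for the hour containing ts (format is an assumption)."""
    return ts.strftime("%Y-%m-%d_%H")

def hour_changed(opened_at: datetime, now: datetime) -> bool:
    """True once 'now' falls into a different clock hour than 'opened_at'."""
    return archive_name(opened_at) != archive_name(now)

def rotate_if_due(opened_at, now, rename_db, create_db, active="current"):
    """Rename the active database and start a fresh one when the hour is over.

    rename_db(old, new) and create_db(name) are caller-supplied callables
    (hypothetical stand-ins for the real database commands).
    Returns the timestamp at which the active database was opened.
    """
    if not hour_changed(opened_at, now):
        return opened_at  # still within the same hour, keep loading
    rename_db(active, archive_name(opened_at))  # freeze the finished hour
    create_db(active)  # empty database for the next incoming items
    return now
```

The loader would call rotate_if_due between inserts; the renamed databases are never written to again, so they can be queried without interfering with the loading process.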
Kind regards,
Andreas