I am a new BaseX user and I have a few questions regarding the db:optimize command.
Does 'optimize all' perform all available optimizations or are there other parameters (indexing or full-text as mentioned at http://docs.basex.org/wiki/Database_Module#db:optimize) to supply?
Is read/write possible to a database undergoing optimization? I’m guessing no because I get an error (“Database ‘foo’ is currently opened by another process”) whenever another connection to the database is open.
On what size or content of databases does optimization have a significant impact? Currently, on a 400 MB database, I don't see any performance benefits after optimization.
(Note: all of my performance testing is using a simple db:list command.)
Thanks in advance for any assistance you can provide.
Joe
Hi Joe,
Welcome to the list!
> Does 'optimize all' perform all available optimizations or are there other parameters (indexing or full-text as mentioned at http://docs.basex.org/wiki/Database_Module#db:optimize) to supply?
If optimize is called (with or without 'all'), the existing index structures will be updated. Via db:optimize, you can additionally enable or disable specific index structures. If you run 'optimize all', the database will be completely rebuilt.
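For illustration, the function form might look like the sketch below. It assumes the database is named 'foo' (as in the error message quoted in this thread) and that a full-text index is wanted. The three-argument call and the 'ftindex' option name follow the Database Module documentation linked above, so please check them against your BaseX version.

```xquery
(: Sketch, assuming a database named 'foo'.
   Second argument true(): rebuild the whole database, like 'optimize all'.
   Third argument: index options; 'ftindex' requests a full-text index. :)
db:optimize('foo', true(), map { 'ftindex': true() })
```

Calling db:optimize('foo') with a single argument should correspond to the plain optimize command, which only refreshes the existing index structures.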
> Is read/write possible to a database undergoing optimization? I’m guessing no because I get an error (“Database ‘foo’ is currently opened by another process”) whenever another connection to the database is open.
Right. This error indicates that you should always access a database within the same JVM context (unless you restrict yourself to read operations) [1].
> On what size or content of databases does optimization have a significant impact? Currently, on a 400 MB database, I don't see any performance benefits after optimization.
Optimizations are only required after update operations. A newly created database is fully optimized. If you believe that specific queries should run faster, feel free to provide us with the query strings.
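One way to tell whether an optimization is due might be to check the database's up-to-date flag before calling db:optimize. This is a minimal sketch, assuming the database is named 'foo' and that db:info reports the flag in an element named uptodate. The element name is an assumption; inspect the output of db:info('foo') on your version to confirm it.

```xquery
(: Sketch: optimize 'foo' only if it has been updated since it was last optimized.
   The element name 'uptodate' is an assumption about the db:info output. :)
if (db:info('foo')//uptodate = 'false')
then db:optimize('foo')
else ()
```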
Cheers,
Christian
[1] http://docs.basex.org/wiki/Startup#Concurrent_Operations
Thanks for the guidance, Christian. I do have two short follow-up questions:
Does an optimized db (via db:optimize) perform any differently than an un-optimized db for document retrieval, as opposed to XQuery execution over document(s)?
Also, I noticed that I would see the “Database ‘foo’ is currently opened by another process” error message even if there was only one *other* db connection, within the same JVM. Is this expected? Or would you think that there must be another process (separate JVM) connected to the db?
Thanks again,
Joe
On 7/30/16, 4:22 PM, "Christian Grün" christian.gruen@gmail.com wrote:
> [...]
Hi Joe,
> Does an optimized db (via db:optimize) perform any differently than an un-optimized db for document retrieval, as opposed to XQuery execution over document(s)?
If you perform updates on a database, your queries may get slower over time. db:optimize can be used to speed up queries again. Does that answer your query? I must confess I do not exactly understand which difference you see between "document retrieval" and "XQuery execution over document(s)"; could you please give me some more hints?
> Also, I noticed that I would see the “Database ‘foo’ is currently opened by another process” error message even if there was only one *other* db connection, within the same JVM. Is this expected? Or would you think that there must be another process (separate JVM) connected to the db?
There must be at least one other process locking the accessed database (see [1] for more info). Did you open the GUI at the same time?
Christian
[1] http://docs.basex.org/wiki/Transaction_Management#Database_Locks
Thanks, Christian. I’ll try to be more specific…
Does an optimized database perform better than an unoptimized database for db:open? How about for db:list? Or db:retrieve?
As for another process locking the accessed database: is it possible to have multiple open db connections within the same JVM, and use one of them to optimize and another to read from the db? I’m asking because my testing indicates that I can only have one open connection if I want to run db:optimize. If I have two open connections (again, within the same JVM), then my db:optimize will fail.
Thanks,
Joe
On 8/1/16, 4:51 PM, "Christian Grün" christian.gruen@gmail.com wrote:
> [...]
Hi Joseph,
> Does an optimized database perform better than an unoptimized database for db:open? How about for db:list? Or db:retrieve?
It strongly depends on the update operations you perform. The most tangible difference between optimized and updated databases is that, in an updated database, your (non-incremental) index structures get lost [1]. When it comes to pure retrieval of database resources, there should hardly be a difference. Our article on indexes [2] shows which index structures are affected by update operations.
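If losing index structures after updates is a concern, incremental indexing may be worth a look. The sketch below assumes the 'updindex' option described in the indexes article [2] and uses a hypothetical database name, document, and path. As far as I know, it keeps the text and attribute indexes current across updates but does not cover the full-text index.

```xquery
(: Sketch: create a new database with incremental indexing enabled.
   'foo_incremental', <root/> and 'root.xml' are hypothetical placeholders.
   Note that db:create replaces an existing database of the same name. :)
db:create('foo_incremental', <root/>, 'root.xml', map { 'updindex': true() })
```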
> As for another process locking the accessed database, is it possible to have multiple, open db connections, within the same JVM [...]
When talking about open connections, do you use the client/server of BaseX [3]?
Christian
[1] http://docs.basex.org/wiki/XQuery_Update#Indexes
[2] http://docs.basex.org/wiki/Indexes
[3] http://docs.basex.org/wiki/Startup
Great answer regarding resource retrieval and optimization. Thanks, Christian!
Regarding the possibility of multiple db connections (w/in same JVM) and using one of them to db:optimize, we are using the client/server configuration.
Joe
On 8/2/16, 11:51 AM, "Christian Grün" christian.gruen@gmail.com wrote:
> [...]
> Regarding the possibility of multiple db connections (w/in same JVM) and using one of them to db:optimize, we are using the client/server configuration.
If you use database clients, all conflicting transactions will be queued. Here’s yet another Wiki article that gives you insight into the details [1].
Christian
[1] http://docs.basex.org/wiki/Transaction_Management