Hello Eliot and Tamara,
I’ve observed what appear to be instances (though I haven’t fully tested to isolate and confirm this) where a write operation such as db:create() blocks BaseX from serving other HTTP requests, which use db:list() and db:get(), until the write operation has finished.
After reading the description of lock detection at https://docs.basex.org/main/BaseX_10#compilation, I’m now wondering whether it might help to apply a naming convention to database names, so that the name alone distinguishes databases currently used for reading from those used for writing, although renaming databases might add other complexities.
Thanks,
Vincent
_____________________________________________
Vincent M. Lizzi
Head of Information Standards | Taylor & Francis Group
vincent.lizzi@taylorandfrancis.com
From: BaseX-Talk <basex-talk-bounces@mailman.uni-konstanz.de> On Behalf Of Tamara Marnell
Sent: Thursday, December 12, 2024 12:56 PM
To: Eliot Kimber <eliot.kimber@servicenow.com>
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Deeper discussion of BaseX client/server and web app implementation?
Hello Eliot,
I have only one BaseX instance, but to avoid the locking issue during large updates/optimizations, I have multiple copies of the databases. Updates are performed on "working" databases, and then I use db:copy to duplicate them to "production" databases for users on the front end to query. I haven't seen or heard of any problems with concurrent users on the public side when they're just reading from the production databases.
-Tamara
On Thu, Dec 12, 2024 at 6:53 AM Eliot Kimber <eliot.kimber@servicenow.com> wrote:
I fully understand the issue of time.
The Database Server page (https://docs.basex.org/12/Database_Server) doesn’t really provide the details I’m looking for.
In particular, it’s not clear to me how a BaseX server would be used with an HTTP server in order to manage parallel query execution and ensure a responsive web site in the face of 100s of concurrent web users making 1000s of query requests. My current architecture handles this in terms of responsiveness and horizontal scaling, but as you say, it runs into issues with contention on locks for databases being updated.
I know other people have successfully implemented public-facing web sites with BaseX, so I’m curious how they’ve done it: is the life cycle of their content such that updates are not much of an issue, or are they doing something different? Am I missing some way to make a single BaseX server take advantage of all available cores? I understood a JVM to use a single core, but maybe my understanding is wrong?
It may be that BaseX as I’m using it is not the right way to do what I’m doing. For example, it might make more sense to implement the web site using a typical node.js and React system that then uses BaseX exclusively through a REST API. That still presents the problem of how to scale handling of queries but avoids any issues with the web site itself being responsive. My team is learning how to use node.js, next.js, and React for other projects so it’s something we could explore.
I could also explore using other database solutions for some or all of what I want to do. For example, maybe it makes more sense to put my where-used table into a key-value store (even Solr could work for this pretty easily) or a SQL database and reserve BaseX for doing the XML-aware data processing needed to construct the table and for other XML- and text-aware queries. But that could still run into performance issues, since I’m looking for 10 ms response times for lookups in the where-used table.
Or maybe I just need to do more caching of query results where the results are stable for a given content set.
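As a rough illustration of the caching idea above, here is a minimal Python sketch. It assumes results are stable for a given content set version, so a cached result can be reused until that version is replaced. The names `QueryCache`, `run_query`, and the version labels are hypothetical and not part of BaseX; `run_query` stands in for whatever actually sends the query to the server.

```python
from typing import Callable, Dict, Tuple


class QueryCache:
    """Cache query results keyed by (content set version, query text).

    Assumption: results never change for a given content set version,
    so entries only need to be dropped when that version is retired.
    """

    def __init__(self, run_query: Callable[[str], str]) -> None:
        # run_query is a hypothetical callable that executes the query
        # against the backend and returns the serialized result.
        self._run_query = run_query
        self._results: Dict[Tuple[str, str], str] = {}

    def get(self, content_version: str, query: str) -> str:
        """Return the cached result, running the query only on a miss."""
        key = (content_version, query)
        if key not in self._results:
            self._results[key] = self._run_query(query)
        return self._results[key]

    def drop_version(self, content_version: str) -> None:
        """Discard all cached results for a retired content set version."""
        stale = [key for key in self._results if key[0] == content_version]
        for key in stale:
            del self._results[key]
```

With a cache like this, repeated lookups for the same content set version are answered from memory and only the first request reaches the database. Calling `drop_version` after an update to a content set forces the next lookup for that version to run the query again.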
I started this project without any particular plan and got a long way just building it as I went, but now that I’m tasked with fixing a number of design and behavior issues with my initial approach, I need to make sure I really know what I’m doing and make the most appropriate implementation choices.
Thanks,
Eliot
_____________________________________________
Eliot Kimber
Sr. Staff Content Engineer
O: 512 554 9368
servicenow
--
Tamara Marnell
Program Manager, Systems
Orbis Cascade Alliance (orbiscascade.org)
Pronouns: she/her/hers