I now have my server loaded with lots of data, and when the server starts it takes 100% of a processor for tens of minutes or more.
I expect some initial startup cost, of course, as BaseX prepares whatever it needs to satisfy queries, but this feels like something is not right.
I don’t see anything obvious in the logs, but I suspect I’m not looking in the right place or don’t have the appropriate logging options turned on.
Any tips on what I can look for to see what might be causing this startup issue?
Thanks,
E.
_____________________________________________
Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368
servicenow.com (https://www.servicenow.com)
LinkedIn (https://www.linkedin.com/company/servicenow) | Twitter (https://twitter.com/servicenow) | YouTube (https://www.youtube.com/user/servicenowinc) | Facebook (https://www.facebook.com/servicenow)
Usually, BaseX should be available immediately after startup. How do you start BaseX? Have you enabled any services [1] that will run at startup time, or similar?
[1] https://docs.basex.org/wiki/Jobs_Module#Services
That’s what I figured: that it should start quickly and be immediately available, so if it doesn’t, something must be wrong.
At the moment I’m just firing up basexhttp using the built-in script, so no on-startup commands or job services (yet).
My guess is that I have a corrupt database or something, but I thought there might also be some “recover from a bad shutdown” process that happens. Our validation reports are in a single database and each doc is around 40 MB.
I dropped my validation report database and then BaseX was able to start as expected. Fortunately, we keep those reports in a git repo so there’s no loss if a database goes bad (and I’m not yet bothering to back up my databases).
I’m working now on setting up parallel servers so I can have one do the data processing while the other serves pages.
If I understand the docs around file locks and so on, it is safe to share databases between servers for reading, and as long as writes are done only from the data processing server, the page-serving server should be good.
Cheers,
E.
> I dropped my validation report database and then BaseX was able to start as expected.
That’s unusual. No databases should be opened or initialized if you start basexhttp. The only files that are parsed are .basex and (optional, if they exist) data/users.xml and data/jobs.xml. Feel free to share more information with us if you can.
> If I understand the docs around file locks and so on, it is safe to share databases between servers for reading, and as long as writes are done only from the data processing server, the page-serving server should be good.
Yes, if you know exactly what you’ll be doing, it’s feasible to let multiple BaseX instances run on the same database folder. As long as you don’t access databases that you create, update or delete via another instance, you should be fine.
On Wed, 2022-02-09 at 16:15 +0000, Eliot Kimber wrote:
> That’s what I figured: that it should start quickly and be immediately available, so if it doesn’t, something must be wrong.
> At the moment I’m just firing up basexhttp using the built-in script, so no on-startup commands or job services (yet).
For what it's worth i have a crontab entry on fromoldbooks.org,
*/5 * * * * cd /home/liam/f/Search/ && /home/liam/packages/basex/basex/bin/basexserver -z -n127.0.0.1 > /dev/null 2>&1
The redirect to /dev/null is there because any errors are likely to be Java exceptions, which could fill the disk :) There would also be an error message whenever the server is already running (which cron would email to me!). If something goes wrong, i'll diagnose it by running the server directly. You may want to use a logfile instead. The -z option suppresses BaseX's own logging.
The -n127.0.0.1 makes BaseX listen only on the local interface, because i connect to it with a separate front end predating RESTXQ :)
The */5 at the start says to run the command every 5 minutes; pick an interval that matches however much downtime is acceptable, or how long a delay after a reboot.
/home/liam/packages/basex/basex is a symbolic link to the current version, in this case BaseX95, so i can switch easily.
An alternative would be to write a systemd service file for BaseX, to let it run as a system service, but i prefer to keep as much in user space as possible and not modify system files or directories, to make it easier to migrate to a different computer later.
Liam
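Following up on the logfile suggestion in the message above, here is a minimal sketch of a wrapper script that cron could call in place of the bare command. It assumes the same basexserver options as the crontab entry (-z and -n127.0.0.1). The names BASEX_BIN, BASEX_LOG, MAX_LOG_BYTES, cap_log, start_basex and the --run argument are made up for illustration and are not part of BaseX, and the 1 MiB cap is an arbitrary choice.

```shell
#!/bin/sh
# Cron-friendly wrapper sketch: same command as the crontab entry above,
# but output goes to a size-capped logfile instead of /dev/null.
# All variable and function names here are illustrative, not part of BaseX.

BASEX_BIN="${BASEX_BIN:-/home/liam/packages/basex/basex/bin/basexserver}"
BASEX_LOG="${BASEX_LOG:-$HOME/basex-cron.log}"
MAX_LOG_BYTES="${MAX_LOG_BYTES:-1048576}"   # 1 MiB cap, an arbitrary choice

# cap_log FILE LIMIT: empty FILE once it is larger than LIMIT bytes, so
# repeated stack traces or "already running" messages cannot fill the disk.
cap_log() {
    [ -f "$1" ] || return 0
    size=$(( $(wc -c < "$1") ))
    if [ "$size" -gt "$2" ]; then
        : > "$1"
    fi
}

# start_basex BIN LOG LIMIT: cap the log, add a timestamp, then try to start
# the server. If it is already running, the attempt only logs an error.
start_basex() {
    cap_log "$2" "$3"
    printf '%s start attempt\n' "$(date '+%Y-%m-%d %H:%M:%S')" >> "$2"
    "$1" -z -n127.0.0.1 >> "$2" 2>&1
}

# Only act when asked to, so the file can also be sourced for testing.
if [ "${1:-}" = "--run" ]; then
    start_basex "$BASEX_BIN" "$BASEX_LOG" "$MAX_LOG_BYTES"
fi
```

With the script saved under a hypothetical path such as /home/liam/bin/basex-cron.sh, the crontab entry would become `*/5 * * * * cd /home/liam/f/Search/ && /home/liam/bin/basex-cron.sh --run`. Nothing is printed, so cron still sends no mail, but the most recent start attempts stay on disk for diagnosis.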
Liam,
Thanks for that tip—I was looking this morning at making BaseX a service.
So if I understand your cron job, it just tries to start BaseX, which will have no effect if it’s already started (other than emitting the messages you send to /dev/null).
Cheers,
E.
On Wed, 2022-02-09 at 17:17 +0000, Eliot Kimber wrote:
> So if I understand your cron job, it just tries to start BaseX, which will have no effect if it’s already started (other than emitting the messages you send to /dev/null).
Right.
I can report that I now have what appears to be a stable BaseX system of multiple instances, one to serve pages and one or more to do data processing.
I’ve set up XQuery code that builds temporary databases to work in and then swaps those into production. It seems to work, although it’s still too new to know whether I’ve missed some details around locking. I also made my databases more granular so that I can have multiple worker instances handling different parts of our data. We have four active streams of development for the ServiceNow product documentation (“families” is the SN term for our product versions), so it makes sense to have one instance per family. It’s also clear that if I had the ability to use Docker containers, one could scale horizontally really well with a little more effort.
I also put Liam’s cron trick in place and my server was running when I got up this a.m., so thanks for that.
Cheers,
E.