Graydon,
That is what I was afraid you were going to say. It is a low limit. (It is a pity, too, since it prevents us from using this parser feature to determine how deep entity nesting actually goes.) On the other hand, as you say, flat entities can usually be normalized away early.
Thanks! Wendell
On Thu, Jan 8, 2026 at 9:20 AM Graydon Saunders via BaseX-Talk <basex-talk@mailman.uni-konstanz.de> wrote:
Hi Wendell,
It's the total number of entities.
The document where I first had this experience has about five thousand instances of the non-breaking-space HTML entity in it; converting those to actual non-breaking spaces made the problem go away. No nested entities whatsoever.
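A minimal sketch of that kind of conversion, assuming every reference is spelled literally as `&nbsp;` and that `sed` is the tool at hand (the thread does not say which tool was actually used). Instead of inserting raw non-breaking spaces, it rewrites each reference as the numeric character reference `&#160;`, which is a character reference rather than an entity reference and so should not count toward the entity expansion limit. The function name and file names are placeholders.

```shell
# Hypothetical helper: rewrite every &nbsp; entity reference on stdin as the
# numeric character reference &#160; (U+00A0, NO-BREAK SPACE).
# The backslash stops sed from treating "&" in the replacement as "the match".
normalize_nbsp() {
  sed 's/&nbsp;/\&#160;/g'
}

# Usage with placeholder file names:
#   normalize_nbsp < input.xml > normalized.xml
printf '%s\n' '<p>5&nbsp;000&nbsp;references</p>' | normalize_nbsp
```

This is a purely textual substitution, so it also rewrites matches inside comments and CDATA sections; that is worth checking before running it over real data.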
-- Graydon
On Thu, Jan 8, 2026, at 09:14, Wendell Piez via BaseX-Talk wrote:
Hello Christian,
Please forgive a slightly OT question, for background: does the parser limit entity expansion by total count, or only by nesting (one entity invoking another), i.e. invocation depth? (In other words, could I have more than 2500 entity references, as long as they are not nested 2500 deep?)
I had assumed it was the invocation depth. Am I wrong? Or do parsers have settings for both?
(With thanks to you, Graydon and the list for the public conversation.)
Regards, Wendell
On Thu, Jan 8, 2026 at 4:13 AM Christian Grün via BaseX-Talk <basex-talk@mailman.uni-konstanz.de> wrote:
Hi Graydon,
You are right, Java imposes various limits on the XML parser, and they get stricter and more fine-grained with every Java release [1].
Currently, there are two ways to tackle this:
• The properties can be overridden when starting BaseX on the command line, for example:
-Djdk.xml.maxGeneralEntitySizeLimit=0 -Djdk.xml.totalEntitySizeLimit=0
The properties can be added to the BaseX start scripts or assigned to the BASEX_JVM environment variable before starting BaseX.
• You can use our internal BaseX XML parser, either by enabling the INTPARSE option, or by switching to the »Parsing« tab in the »Create Database« dialog of the GUI and activating the corresponding checkbox.
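The two options above might look roughly like this in a shell session. This is a sketch under stated assumptions, not a tested recipe: it assumes the stock BaseX start scripts, which read `BASEX_JVM`, and it sets `jdk.xml.entityExpansionLimit`, the JDK property behind the JAXP00010001 count limit, rather than the two size limits used in the example above. According to the JDK documentation, a value of 0 or less means no limit. The database name and file path are placeholders.

```shell
# Option 1: lift the JDK's entity-expansion count limit for BaseX only.
# Assumption: jdk.xml.entityExpansionLimit is the limit being hit (JAXP00010001).
export BASEX_JVM="-Djdk.xml.entityExpansionLimit=0 ${BASEX_JVM:-}"
# ...then start BaseX from the same shell, for example:
#   basexgui

# Option 2: use the internal BaseX parser instead (placeholder names):
#   basex -c "SET INTPARSE true; CREATE DB mydb /path/to/file.xml"
```

Setting the variable in the shell keeps the change local to BaseX; other Java applications on the machine keep the JDK defaults.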
In a future version of BaseX, we may introduce a global option to disable the limits. As BaseX is a tool for XML experts, we could also disable the Java limits by default. Feedback from everyone is welcome.
Best, Christian
[1] https://docs.oracle.com/en/java/javase/25/docs/api/java.xml/module-summary.h...
------------------------------
*From:* Graydon Saunders via BaseX-Talk <basex-talk@mailman.uni-konstanz.de>
*Sent:* Tuesday, January 6, 2026 18:11
*To:* BaseX <basex-talk@mailman.uni-konstanz.de>
*Subject:* [basex-talk] the right way to respond to JAXP00010001
Hello (and Happy New Year!)
I'm on Linux (Fedora) using BaseX 12.1 and OpenJDK Runtime Environment (Red_Hat-25.0.1.0.8-3) (build 25.0.1+8)
I've got some data that I want BaseX to load one document at a time using the doc() function.
On one of these documents I get a parsing failure that reports JAXP00010001; if I look that up, I find https://www.oracle.com/java/technologies/javase/24-relnote-issues.html which says this limit changed (from a traditional larger number) to 2500, so now the error text is
JAXP00010001: The parser has encountered more than "2500" entity expansions in this document; this is the limit imposed by the JDK
It's a large file and I can't do anything about that part, nor can I do anything about the number of entity references these files happen to have when I get them. (In this particular case, a bit more than five thousand.) The Oracle page lists a bunch of options for how to set a different entity expansion limit.
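To see how far a given file is over the limit, the references can be counted outside the parser. A rough sketch, assuming `grep -o` is available; the function name is hypothetical. The count is textual, so it ignores nesting, includes matches inside comments and CDATA sections, and also includes predefined references such as `&amp;`, which the parser may not count as expansions. Treat the result as an upper estimate.

```shell
# Hypothetical helper: count named entity references such as &nbsp; on stdin.
# Numeric character references (&#160;, &#xA0;) are deliberately not matched.
count_entity_refs() {
  grep -o -E '&[A-Za-z_][A-Za-z0-9._-]*;' | wc -l | tr -d ' '
}

# Usage with a placeholder file name:
#   count_entity_refs < input.xml
printf '%s\n' '<p>a&nbsp;b&nbsp;c&#160;d</p>' | count_entity_refs
```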
In the context of BaseX (in my case, generally the BaseX GUI), what's the right way to adjust the entity expansion limit so these files will parse?
Thanks! Graydon
-- ...Wendell Piez... ...wendellpiez.com... ...pellucidliterature.org... ...pausepress.org... ... github.com/wendellpiez...