the right way to respond to JAXP00010001
Hello (and Happy New Year!)
I'm on Linux (Fedora) using BaseX 12.1 and OpenJDK Runtime Environment (Red_Hat-25.0.1.0.8-3) (build 25.0.1+8)
I've got some data, which I want BaseX to load by the individual document using the doc() function.
On one of these documents I get a parsing failure that reports JAXP00010001; if I look that up, I find https://www.oracle.com/java/technologies/javase/24-relnote-issues.html which says this limit changed (from a traditional larger number) to 2500, so now the error text is
`JAXP00010001: The parser has encountered more than "2500" entity expansions in this document; this is the limit imposed by the JDK`
It's a large file and I can't do anything about that part, nor can I do anything about the number of entity references these files happen to have when I get them. (In this particular case, a bit more than five thousand.) The Oracle page lists a bunch of options for how to set a different entity expansion limit.
In context of BaseX, what's the right way to adjust the entity expansion limit (in my case, generally the BaseX GUI) so these files will parse?
Thanks! Graydon
Hi Graydon,
You are right: Java imposes various limits on the XML parser, and they get stricter and more fine-grained with every version of the language [1].
Currently, there are two ways to tackle this:
• The properties can be overridden when starting BaseX on the command line, for example:
-Djdk.xml.maxGeneralEntitySizeLimit=0 -Djdk.xml.totalEntitySizeLimit=0
The properties can be added to the BaseX start scripts or assigned to the BASEX_JVM environment variable before starting BaseX.
• You can use our internal BaseX XML parser, either by enabling the INTPARSE option, or by switching to the »Parsing« tab in the »Create Database« dialog of the GUI and activating the corresponding checkbox.
In a future version of BaseX, we may introduce a global option to disable the limits. As BaseX is a tool for XML experts, we could also disable the Java limits by default. Feedback from everyone is welcome.
Best, Christian
[1] https://docs.oracle.com/en/java/javase/25/docs/api/java.xml/module-summary.h...
(In reply to Graydon Saunders via BaseX-Talk <basex-talk@mailman.uni-konstanz.de>, Tuesday, 6 January 2026, 18:11; subject: [basex-talk] the right way to respond to JAXP00010001)
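As a minimal sketch of the first option, the flags can be placed in BASEX_JVM before the GUI is launched. The first two -D flags are the ones given in the reply above. The third, jdk.xml.entityExpansionLimit, is an addition that does not appear in the thread: it is the JAXP property that the JDK documentation associates with the entity-expansion count, and a value of 0 is documented there as "no limit". The launcher name basexgui is assumed to be on the PATH, so the call is left commented out.

```shell
#!/bin/sh
# Sketch only: lift the JDK XML entity limits for one BaseX session.
# The first two flags come from the thread; the third is an assumption
# (the JAXP property behind the "entity expansions" count in the JDK docs).
BASEX_JVM="-Djdk.xml.maxGeneralEntitySizeLimit=0 -Djdk.xml.totalEntitySizeLimit=0 -Djdk.xml.entityExpansionLimit=0"
export BASEX_JVM

# Show what the start script will receive.
printf 'BASEX_JVM=%s\n' "$BASEX_JVM"

# Launch the GUI with these settings (uncomment where BaseX is installed):
# basexgui
```

Setting the variable in the calling shell keeps the change local to that session, whereas editing the start script makes it permanent for that installation.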
Thank you, Christian! I can add those values to the start script and it works. I wind up with
BASEX_JVM="${BASEX_JVM} -Djdk.xml.maxGeneralEntitySizeLimit=0 -Djdk.xml.totalEntitySizeLimit=0"
because I am not clear where the default value of BASEX_JVM originates these days. It does seem to be set before basexgui uses it.
As a note, I think I was already using the internal XML parser: if I look at the »Parsing« tab in the »Create Database« dialog of the GUI, "Use internal XML parser" is both checked and greyed out, and "Use XML catalog file" has a value and is likewise checked and greyed out. It's been like that for a while, and I've been avoiding messing with it because I do have to use a catalog sometimes and this configuration has been working.
(In reply to Christian Grün, Thu, Jan 8, 2026, 04:13)
> I wind up with
> BASEX_JVM="${BASEX_JVM} -Djdk.xml.maxGeneralEntitySizeLimit=0 -Djdk.xml.totalEntitySizeLimit=0"
> because I am not clear where the default value of BASEX_JVM originates these days. It does seem to be set before basexgui uses it.
By default, it is unassigned, but it is possible to assign a value, which will then be adopted by the start script.
> As a note, I think I was already using the internal XML parser, though if I go look at the »Parsing« tab in the »Create Database« GUI, "Use internal XML parser" is both checked and greyed out, and "Use XML catalog file" has that value and is both checked and greyed out.
Interesting, this combination is not intended to exist (the internal parser does not support catalog resolution). With the next release, internal parsing will automatically be deselected in the GUI if a catalog is chosen.
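Because BASEX_JVM may be unassigned or may already carry flags, a start-script fragment can append each flag only when it is missing. The following is a sketch under that assumption; the helper name add_jvm_flag is hypothetical and not part of BaseX.

```shell
#!/bin/sh
# Hypothetical helper (not part of BaseX): append a JVM flag to BASEX_JVM
# unless the exact flag is already present. Works when BASEX_JVM is unset.
add_jvm_flag() {
  case " ${BASEX_JVM:-} " in
    *" $1 "*) ;;                                  # already present: do nothing
    *) BASEX_JVM="${BASEX_JVM:+$BASEX_JVM }$1" ;; # append, space-separated
  esac
}

add_jvm_flag "-Djdk.xml.maxGeneralEntitySizeLimit=0"
add_jvm_flag "-Djdk.xml.totalEntitySizeLimit=0"
add_jvm_flag "-Djdk.xml.totalEntitySizeLimit=0"   # repeated call has no effect
export BASEX_JVM

printf '%s\n' "$BASEX_JVM"
```

Compared with a plain `BASEX_JVM="${BASEX_JVM} ..."` assignment, this avoids a leading space when the variable is empty and avoids duplicated flags when the fragment is sourced more than once.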
Hello Christian,
Please forgive a slightly OT question, for background: is the parser limiting the expansion of entities with respect to their total count, or only with respect to their nesting (one entity invoking another), i.e. invocation depth? (If the latter, I could have more than 2500 entities, just not 2500 deep.)
I kind of assumed it was the invocation depth. Am I wrong? Or do parsers have settings for both?
(With thanks to you, Graydon and the list for the public conversation.)
Regards, Wendell
-- ...Wendell Piez... ...wendellpiez.com... ...pellucidliterature.org... ...pausepress.org... ...github.com/wendellpiez...
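One way to check the count-versus-depth question empirically is to generate a document that contains many references to a single, non-nested entity and then try to load it. The sketch below only builds such a file. The file location, the entity name e and the count of 3000 are arbitrary choices, and the final basex call is an assumption about the local installation, so it is left commented out.

```shell
#!/bin/sh
# Build an XML file with N flat (non-nested) references to one internal entity.
N=3000
FILE="${TMPDIR:-/tmp}/flat-entities.xml"

{
  printf '<?xml version="1.0"?>\n'
  printf '<!DOCTYPE doc [ <!ENTITY e "x"> ]>\n'
  printf '<doc>'
  i=0
  while [ "$i" -lt "$N" ]; do
    printf '&e;'
    i=$((i + 1))
  done
  printf '</doc>\n'
} > "$FILE"

# Count the references that were written.
REFS=$(grep -o '&e;' "$FILE" | wc -l | tr -d ' ')
printf '%s entity references written to %s\n' "$REFS" "$FILE"

# If loading fails with JAXP00010001 although nothing is nested, the limit
# counts expansions rather than depth (assumes basex is on the PATH):
# basex "doc('$FILE')"
```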
Hi Wendell,
It's the total number of entities.
The document where I first had this experience has about five thousand instances of the non-breaking-space HTML entity in it; converting those to actual non-breaking spaces made the problem go away. No nested entities whatsoever.
-- Graydon
(In reply to Wendell Piez via BaseX-Talk, Thu, Jan 8, 2026, 09:14)
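The normalization described above can be sketched with standard tools. The example counts the &nbsp; references in a file and rewrites them as the numeric character reference &#160;, which, being a character reference rather than an entity reference, should not count toward the expansion limit. The sample file and its contents are made up for illustration.

```shell
#!/bin/sh
# Sample input (made up): three &nbsp; references declared in an internal subset.
IN="${TMPDIR:-/tmp}/nbsp-sample.xml"
OUT="${TMPDIR:-/tmp}/nbsp-normalized.xml"
cat > "$IN" <<'EOF'
<!DOCTYPE p [ <!ENTITY nbsp "&#160;"> ]>
<p>a&nbsp;b&nbsp;c&nbsp;d</p>
EOF

# Count entity references before the rewrite.
BEFORE=$(grep -o '&nbsp;' "$IN" | wc -l | tr -d ' ')

# Replace each reference with a numeric character reference.
# The backslash keeps sed from treating & as "the matched text".
sed 's/&nbsp;/\&#160;/g' "$IN" > "$OUT"

# Count what is left afterwards.
AFTER=$(grep -o '&nbsp;' "$OUT" | wc -l | tr -d ' ')
printf 'before: %s, after: %s\n' "$BEFORE" "$AFTER"
```

Writing to a separate output file keeps the delivered original untouched, which matters when the files come from a third party and may need to be compared later.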
Graydon,
That is what I was afraid you were going to say. It is a low limit. (It is too bad, too, since it prevents us from using this parser feature to determine how deep entity nesting actually goes.)
On the other hand, as you are saying, flat entities can usually also be normalized away early.
Thanks! Wendell
(In reply to Graydon Saunders via BaseX-Talk, Thu, Jan 8, 2026, 9:20 AM)
Participants (3): Christian Grün, Graydon Saunders, Wendell Piez