Hi Vincent,
Can you show us your catalog?
Since you mention that it chokes on finding the DTD, it might be that you need rewriteSystem instead of rewriteURI for the DTD locations.
Also, if you don't resolve by public ID and instead refer to the DTD by a relative path, that path is made absolute before catalog resolution, so instead of <system> you can use <systemSuffix> to match only the tail of the path.
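To make the two catalog options above concrete, here is a minimal sketch that writes a small catalog from the shell. The file name catalog-sketch.xml, the DTD name article.dtd, the example.org prefix and the dtd/ directory are all placeholders, not values from Vincent's setup. Whether <systemSuffix> is honoured depends on the resolver in use, so treat this as something to try rather than a confirmed fix.

```shell
# Sketch only: every name below (catalog-sketch.xml, article.dtd,
# example.org, dtd/) is a hypothetical placeholder.
cat > catalog-sketch.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Match only the tail of the (already absolutized) system identifier -->
  <systemSuffix systemIdSuffix="/article.dtd" uri="dtd/article.dtd"/>
  <!-- Rewrite every system identifier that starts with a given prefix -->
  <rewriteSystem systemIdStartString="http://example.org/dtd/" rewritePrefix="dtd/"/>
</catalog>
EOF
```

Relative values of uri and rewritePrefix are resolved against the location of the catalog file itself, so the dtd/ directory here would sit next to the catalog.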
On the other hand, since you say it’s running with standalone Saxon, the same DTD resolution issues should be expected there.
It may well be the case that our efforts to make DTD resolution available to xslt:transform() only focused on supporting xsl:import and xsl:include while not passing the resolver to the doc() function.
Maybe Liam can investigate this in more depth. In that case I suggest asking for a budget at Taylor & Francis. We paid Liam to explore and enable the use of catalogs for xsl:import and xsl:include, and he successfully dug through the mess of different interfaces etc. for this limited but important use case. So he is *the* expert in this field, and I'd warmly recommend paying him so that he can explore and fix the issue.
Gerrit
On 10.07.2020 07:55, Lizzi, Vincent wrote:
Hi Liam,
Thanks for the helpful suggestions. After trying everything you suggested and then also trying a few of Saxon’s configuration options, unfortunately I’m still having the same problem. I am testing with a shell script that contains the following:
MAIN="$( cd -P "$(dirname "$FILE")/../basex" && pwd )"
CP=$MAIN/BaseX.jar:$MAIN/lib/custom/*:$MAIN/lib/*:$CLASSPATH
echo 1 Saxon
java -cp "$CP" net.sf.saxon.Transform -s:input1.xml -xsl:transform.xsl -catalog:schemas/catalog.xml
echo 2 BaseX transform
java -cp "$CP" org.basex.BaseX -q"(# db:catfile schemas/catalog.xml #) (# db:intparse false #) (# db:dtd true #) (# db:chop false #) { xslt:transform('input1.xml', 'transform.xsl') }"
echo 3 BaseX transform with Saxon features configured
java -Dhttp://saxon.sf.net/feature/entityResolverClass=org.apache.xml.resolver.tool... -Dhttp://saxon.sf.net/feature/uriResolverClass=org.apache.xml.resolver.tools.C... -cp "$CP" org.basex.BaseX -q"(# db:catfile schemas/catalog.xml #) (# db:intparse false #) (# db:dtd true #) (# db:chop false #) { xslt:transform('input1.xml', 'transform.xsl') }"
echo 4 BaseX doc to show XML Catalog is configured correctly to parse XML
java -cp "$CP" org.basex.BaseX -q"(# db:catfile schemas/catalog.xml #) (# db:intparse false #) (# db:dtd true #) (# db:chop false #) { doc('input1.xml') }"
The classpath includes BaseX 9.3.3, Saxon HE 9.9, xml-resolver-1.2.jar, and CatalogManager.properties.
1. The transformation works in Saxon and uses the catalog file to locate the DTD when parsing the XML input1.xml.
2. The BaseX xslt:transform should work the same as #1, but fails because the DTD cannot be read.
3. Adding Saxon configuration for the Entity Resolver Class and URI Resolver Class did not help.
4. Simply parsing the XML using doc() in BaseX with the same configuration shows that the XML catalog is configured correctly within BaseX.
Using strace -f, the log shows that BaseX xslt:transform reads the catalog.xml file from disk and then tries (and fails) to read the DTD from the non-working URL.
This might be a bug in xslt:transform, so the workaround of using a regular expression replace on the DOCTYPE system URI is probably the practical solution.
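As a rough illustration of that workaround, the sketch below rewrites the DOCTYPE system identifier before the file is handed to xslt:transform. sed is used here purely for illustration; the same replace could be done inside the query. The file names, the example.org URL and the local DTD path are hypothetical, and the pattern only covers a simple SYSTEM declaration with a double-quoted identifier.

```shell
# Hypothetical sample input; the DOCTYPE below is illustrative only.
printf '%s\n' \
  '<!DOCTYPE article SYSTEM "http://example.org/dtd/article.dtd">' \
  '<article/>' > input-sketch.xml

# Swap the system identifier for a local path (assumed: schemas/dtd/article.dtd).
# \1 keeps everything up to the opening quote, \2 keeps the closing quote.
sed -E 's|(<!DOCTYPE[^>]* SYSTEM ")[^"]*(")|\1schemas/dtd/article.dtd\2|' \
  input-sketch.xml > input-local.xml
```

A DOCTYPE that uses PUBLIC with both a public and a system identifier, or single quotes, would need a different pattern.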
Many thanks,
Vincent
*Vincent M. Lizzi*
Head of Information Standards | Taylor & Francis Group
vincent.lizzi@taylorandfrancis.com
*From:* Liam R. E. Quin liam@fromoldbooks.org
*Sent:* Thursday, July 9, 2020 12:55 PM
*To:* Lizzi, Vincent Vincent.Lizzi@taylorandfrancis.com; BaseX basex-talk@mailman.uni-konstanz.de
*Subject:* Re: [basex-talk] xslt:transform function not working with XML Catalog
On Thu, 2020-07-09 at 04:32 +0000, Lizzi, Vincent wrote:
Hi Liam,
Thanks for the reply and suggestions. Based on your suggestion I tried pragmas and strace, and had another go at CatalogManager.properties, but they've not had any effect.
Use strace -f java ... >& hugelogfile.txt and afterwards grep -i catalogmanager.properties hugelogfile.txt, and you should see where it's looking. If it doesn't look for that file, check whether it opened the jar file containing the resolver.
If you're running BaseX from Oxygen, Oxygen needs to have it in its classpath too, I think.
Also, of course, see if the catalog file is actually being opened!
I actually wrote some of the code in BaseX that makes XML catalogs work with transform(), or provided a rough draft that Christian improved :), and debugging it was... interesting.
I'd also try an absolute path for the catalog file - if you are using the BaseX server, relative paths will be relative to the directory (folder) where the server itself is running. (and of course the server needs the resolver in its classpath).
Messages from the catalog manager seem to go (oddly) to standard output interleaved with any XML output.
The command line I used for testing this (well, one of the tests) was:
R=$HOME/lib/xmlcatalog/xml-commons-resolver-1.2/resolver.jar MAIN=$HOME/packages/basex/basex
java -Dxml.catalog.files=saxlog.xml -D'http://saxon.sf.net/feature/uriResolverClass=org.apache.xml.resolver.tools.C...' -cp $R/resolver.jar:/home/lee/packages/basex/basex/BaseX.jar:$MAIN/lib/custom/*:$MAIN/lib/*: org.basex.BaseX try.xq
(Saxon was in $MAIN)
-- Liam Quin, https://www.delightfulcomputing.com/ Available for XML/Document/Information Architecture/XSLT/XSL/XQuery/Web/Text Processing/A11Y training, work & consulting. Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org