Hi!
I'm trying to move an XSLT transformation into XQuery. The first step is to create a database and run the XSLT from XQuery. I have a small input file (50 MB) and a large input file (2 GB) for testing. I set up the database with the small input file, ran a parameterized query calling the XSLT transformation, and everything went well. To be sure everything would work on the bigger database too, I tried the same there, which fails, even after I allowed for more memory. (To be sure the stylesheet works with the big input file, I checked it separately: it takes 20 minutes to produce a ~1 GB file.)
This is the error I get:
Error: Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 8.6.7
Java: Oracle Corporation, 1.8.0_151
OS: Windows 10, amd64

Stack Trace:
java.lang.NegativeArraySizeException
  at java.util.Arrays.copyOf(Unknown Source)
  at org.basex.io.out.ArrayOutput.write(ArrayOutput.java:25)
  at org.basex.io.out.PrintOutput.print(PrintOutput.java:76)
  at org.basex.io.out.NewlineOutput.print(NewlineOutput.java:33)
  at org.basex.io.serial.MarkupSerializer.printChar(MarkupSerializer.java:254)
  at org.basex.io.serial.MarkupSerializer.attribute(MarkupSerializer.java:131)
  at org.basex.io.serial.Serializer.node(Serializer.java:416)
  at org.basex.io.serial.Serializer.node(Serializer.java:158)
  at org.basex.io.serial.StandardSerializer.node(StandardSerializer.java:105)
  at org.basex.io.serial.AdaptiveSerializer.node(AdaptiveSerializer.java:75)
  at org.basex.io.serial.Serializer.serialize(Serializer.java:109)
  at org.basex.io.serial.AdaptiveSerializer.serialize(AdaptiveSerializer.java:66)
  at org.basex.query.value.Value.serialize(Value.java:222)
  at org.basex.query.value.Value.serialize(Value.java:205)
  at org.basex.query.func.xslt.XsltTransform.read(XsltTransform.java:75)
  at org.basex.query.func.xslt.XsltTransform.transform(XsltTransform.java:47)
  at org.basex.query.func.xslt.XsltTransform.item(XsltTransform.java:33)
  at org.basex.query.expr.ParseExpr.value(ParseExpr.java:65)
  at org.basex.query.QueryContext.value(QueryContext.java:406)
  at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:185)
  at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:93)
  at org.basex.query.scope.MainModule$1.next(MainModule.java:125)
  at org.basex.query.QueryContext.cache(QueryContext.java:618)
  at org.basex.query.QueryProcessor.cache(QueryProcessor.java:112)
  at org.basex.core.cmd.AQuery.query(AQuery.java:86)
  at org.basex.core.cmd.XQuery.run(XQuery.java:22)
  at org.basex.core.Command.run(Command.java:257)
  at org.basex.core.Command.execute(Command.java:93)
  at org.basex.gui.GUI.exec(GUI.java:474)
  at org.basex.gui.GUI.access$4(GUI.java:428)
  at org.basex.gui.GUI$6.run(GUI.java:416)

Compiling:
- pre-evaluate root() to document-node()
- pre-evaluate map { "param1":"default", "param2":"products" } to map
- pre-evaluate map { "method":"xml", "indent":"no", "omit-xml-declaration":"no" } to map
- inline $path_2
- pre-evaluate concat("c:/work/xslt_to_xquery/", "xslt_post_processing_bmwp-2526.xsl") to xs:string
- inline $style_3

Optimized Query:
let $document_4 := xslt:transform(db:open-pre("big-xml",0)/*, "c:/work/xslt_to_xquery/i-know-which.xsl", map { "param1": "default", "param2": "products", ... })
return file:write(concat("c:/work/xslt_to_xquery/", db:name(.), "_out_", "default", "products", ".xml"), $document_4, map { "omit-xml-declaration": "no", "method": "xml", ... })

Query:
declare variable $param1 external := "default";
declare variable $param2 external := "products";
let $path := 'c:/work/xslt_to_xquery/'
let $style := concat($path, 'i-know-which.xsl')
let $document := xslt:transform(/*, $style, map { "param1": $param1, "param2": $param2 })
return file:write(concat($path, db:name(.), '_out_', $param1, $param2, '.xml'), $document, map { "method": "xml", "indent": "no", "omit-xml-declaration":"no" })

Query plan:
<QueryPlan compiled="true">
  <GFLWOR>
    <Let>
      <Var name="$document" id="4"/>
      <XsltTransform name="transform(input,stylesheet[,params[,options]])">
        <IterPath>
          <DBNode name="big-xml" pre="0"/>
          <IterStep axis="child" test="*"/>
        </IterPath>
        <Str value="c:/work/xslt_to_xquery/i-know-which.xsl" type="xs:string"/>
        <Map size="2">
          <Str value="param1" type="xs:string"/>
          <Str value="default" type="xs:string"/>
          <Str value="param2" type="xs:string"/>
          <Str value="products" type="xs:string"/>
        </Map>
      </XsltTransform>
    </Let>
    <FileWrite name="write(path,data[,params])">
      <FnConcat name="concat(atom1,atom2[,...])">
        <Str value="c:/work/xslt_to_xquery/" type="xs:string"/>
        <DbName name="name(node)">
          <ContextValue/>
        </DbName>
        <Str value="_out_" type="xs:string"/>
        <Str value="default" type="xs:string"/>
        <Str value="products" type="xs:string"/>
        <Str value=".xml" type="xs:string"/>
      </FnConcat>
      <VarRef>
        <Var name="$document" id="4"/>
      </VarRef>
      <Map size="3">
        <Str value="omit-xml-declaration" type="xs:string"/>
        <Str value="no" type="xs:string"/>
        <Str value="method" type="xs:string"/>
        <Str value="xml" type="xs:string"/>
        <Str value="indent" type="xs:string"/>
        <Str value="no" type="xs:string"/>
      </Map>
    </FileWrite>
  </GFLWOR>
</QueryPlan>
Please let me know which information you need. Many thanks
Steffie
Dear Steffie,
Sorry for keeping you waiting; I’m back again after a longer winter break.
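As BaseX is not a native XSLT processor, the memory limits for transforming data are much tighter than for XQuery expressions. The XSLT input is serialized and cached in memory before it is passed on to the XSLT processor. A Java array cannot hold more than about 2^31 elements, so a 2 GB input does not fit into a single byte array, which would explain the NegativeArraySizeException in your stack trace; assigning more memory to the JVM does not lift this limit.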
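To make the array limit concrete, here is a small Java sketch of how growing an in-memory byte buffer can end in a NegativeArraySizeException. It is only an illustration: the class name, the helper methods and the doubling growth policy are assumptions made for this example, not the actual BaseX code, whose growth factor may differ.

```java
import java.util.Arrays;

public class BufferOverflowSketch {
    // Hypothetical growth policy (doubling); not taken from BaseX.
    static int nextCapacity(int size) {
        // int arithmetic wraps around silently once the result exceeds 2^31 - 1
        return size * 2;
    }

    // Returns true if growing a buffer of the given logical size fails
    // with the same exception type as in the stack trace.
    static boolean growthFails(int size) {
        try {
            // A tiny source array is enough: copyOf rejects a negative
            // target length before anything is copied.
            Arrays.copyOf(new byte[8], nextCapacity(size));
            return false;
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nextCapacity(1 << 30)); // prints -2147483648
        System.out.println(growthFails(1 << 30));  // prints true
        System.out.println(growthFails(16));       // prints false
    }
}
```

The point of the sketch is that the failure depends on the size of the serialized input, not on the heap size: once the buffer would have to grow past the maximum array length, the computed capacity becomes negative regardless of how much memory the JVM was given.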
However, I tried to tweak the code a bit, and I have improved the error message. I invite you to check out the latest stable snapshot and see if it allows you to process your input [1]. If not, you could try to:
• Reduce the input to maybe 50% of its size to see whether your XSLT processor (Java, Saxon) can handle that amount of data.
A general note: If there is some chance for you to do some more pre-processing in XQuery, you will definitely get better performance and require less memory.
Hope this helps, Christian
[1] http://files.basex.org/releases/latest/
On Thu, Dec 28, 2017 at 4:05 PM, st.haupt@gmail.com wrote:
[...]