Hi George,
As you can imagine, it’s pretty hard to give general advice on this, because (as you know) XQuery can be used for creating all kinds of complex code.
When building complex applications, we tend to start with small problems, and proceed further once we have understood all their implications. Applying some divide-and-conquer principles to your code may help you find out which parts cause the problems. Some more hints:
• prof:time can be used to do some basic performance profiling. Please note that the inclusion of such a non-deterministic call may influence the compilation of your query, but in most cases, it should help you to exclude code that is always fast.
• You have already mentioned options such as TAILCALLS and INLINELIMIT. Assigning very small and very large values will tell you whether function inlining or tail-call optimization (TCO) matters.
• Sometimes it helps to use Java debugging (e.g. with the JVM flag -Xrunhprof:cpu=samples,depth=50). If you are unsure how to interpret the resulting text file, feel free to pass on the beginning and the end of the text file to the list.
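To illustrate the first hint, here is a minimal sketch of how prof:time might be wrapped around individual stages of a larger query. The functions local:validate and local:summarize and the generated test document are hypothetical stand-ins, not taken from your script. Only the one-argument form of prof:time is used here.

```xquery
(: Hypothetical stand-ins for two stages of a larger validation script. :)
declare function local:validate($doc as node()) as xs:boolean* {
  for $e in $doc//record
  return exists($e/@id)
};

declare function local:summarize($doc as node()) as xs:integer {
  count($doc//record)
};

(: Small generated input, used only to keep the sketch self-contained. :)
let $doc := <root>{ (1 to 10000) ! <record id="{ . }"/> }</root>
return (
  (: prof:time evaluates its argument, reports the elapsed time on
     standard error (or in the Info View of the GUI), and returns the
     result of the expression unchanged. :)
  count(prof:time(local:validate($doc))),
  prof:time(local:summarize($doc))
)
```

Timing one stage at a time like this should make it easier to rule out the parts that are always fast and to narrow down the slow ones.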
p.s I realized BaseX could have a very neat feature for the GUI, to have the option to only compile or compile + run an XQuery file. (if it doesn't already exist)
The RUNQUERY option might help you out [1].
Cheers, Christian
[1] http://docs.basex.org/wiki/Options#RUNQUERY
On Wed, Mar 8, 2017 at 3:45 PM, George Sofianos gsf.greece@gmail.com wrote:
I'm trying to resolve a very difficult issue. I have an XQuery file with 6325 lines that does very complex calculations, validations, etc. When I run this script on an XML file of about 60 MB, it takes about 2 hours to finish. I'm trying to find ways to debug this and to change the code where necessary so that it runs faster and on larger files. I noticed that with TAILCALLS = -1 I need more than 4 MB of stack size. (I've increased it to 100 MB, since it's just one thread anyway.)
I'm trying to find out what I can improve in the code, but I don't understand how yet. I've used a profiler (YourKit) to see if I can get more information, but I'm not very experienced with profilers, and I don't think any information from the profiler can help me fix the script. Compilation takes about 16 seconds with INLINELIMIT = 0 and MAINMEM = true, so the issue is in execution.
I'm wondering whether multiple function calls that are not tail calls can cause this issue, or whether I need to change tail calls to tail-recursive calls and keep TAILCALLS = 256. Some hints on how to improve performance here would be very welcome.
p.s I realized BaseX could have a very neat feature for the GUI, to have the option to only compile or compile + run an XQuery file. (if it doesn't already exist)
On 12/21/2016 03:56 PM, Christian Grün wrote:
I'm curious, what value do you recommend here? I've been using it for BaseX with -Xss4m for a long time, but I'm sure that is too much.
Hm, good question ;) I think that the Java default setting (…which also depends on your system configuration) is usually the best tradeoff. In our own apps, we usually rewrite our XQuery code such that there is no need for this flag (mostly because of convenience, to ensure that it runs out-of-the-box when changing the system). If you don’t experience any bottlenecks with 4m that you don’t encounter with a smaller value, it’s probably a good choice.
Thanks, Christian