Hi Christian,
Hi Kristian,
- each used data set in my query as a separate BaseX database instance
…as long as the databases are on different disks/drives.
so that the computations would thus run separately (in parallel) in each of the instances?
What kind of computations do you want to perform?
Mainly finding subsets of the stored XML data: retrieve a list of "something" that matches some criteria, then use the items in this list to retrieve further subsets of related data from other data sets. In my case, finding the related data for each of the items in the first list could be done in parallel.
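A minimal sketch of this access pattern in plain Java, assuming in-memory maps in place of the actual BaseX databases. It does not use any BaseX API; the class name, method names, and sample data are all hypothetical and only illustrate that the per-item lookups are independent of each other.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only: maps stand in for the XML data sets.
final class ParallelLookup {

    // Step 1: select the keys of the first data set whose value meets a criterion.
    static List<String> selectKeys(Map<String, Integer> first, int minValue) {
        return first.entrySet().stream()
                .filter(e -> e.getValue() >= minValue)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    // Step 2: look up the related items for each selected key in a second data set.
    // Each lookup is independent, so a parallel stream may spread them over several cores.
    // Assumes the keys are unique (toConcurrentMap rejects duplicate keys).
    static Map<String, List<String>> findRelated(List<String> keys,
                                                 Map<String, List<String>> second) {
        return keys.parallelStream()
                .collect(Collectors.toConcurrentMap(
                        k -> k,
                        k -> second.getOrDefault(k, Collections.<String>emptyList())));
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new HashMap<>();
        first.put("a", 5);
        first.put("b", 1);
        first.put("c", 7);

        Map<String, List<String>> second = new HashMap<>();
        second.put("a", Arrays.asList("a1", "a2"));
        second.put("b", Arrays.asList("b1"));

        List<String> keys = selectKeys(first, 5);
        Map<String, List<String>> related = findRelated(keys, second);

        // Sort by key for stable printing; the concurrent map has no defined order.
        System.out.println(new TreeMap<>(related));
    }
}
```

Whether this kind of parallelism pays off in practice depends on where the data lives; if all lookups hit the same disk, the parallel version may not be faster.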
What exactly do you mean by "Java"? Is it Java proper, or something else that runs on the Java Virtual Machine, like Scala?
I must admit I don’t know very much about the internals of the JVM. I just noticed that the evaluation of some XQuery expressions leads to the utilization of multiple CPU cores with Java 7 or 8, and I didn’t encounter this behaviour with Java 6. It may be that this is due to the JIT compiler, but it could also be that some (very safe) computations are done in parallel.
This is interesting, but I don't know much about it either. To be clear: BaseX makes no effort to parallelize the XQuery execution plans? And is the main reason the random access patterns of (single) disk access?
Best regards,
Kristian