Hi all,
I’m running some scripts to solve an NP problem in order to check tree compatibility, and I’m using BaseX as the database for my hierarchy structure: for each path and each node I run a query against the BaseX server.
For a tree with 7 paths this gives 7! = 5,040 permutations and therefore at least 7! * 4 = 20,160 queries (the query below) within 5 minutes.
If I use the DB on my localhost, I see that basexserver goes above 150% CPU usage. I was wondering how to increase performance without crashing the server.
BaseX is running on a 2.6 GHz Intel Core i5 with 16 GB of RAM, OS X 10.10.5.
The BaseX version is 8.2.
Here is the code I use:
import java.io.IOException;
import java.util.ArrayList;

public class BaseXOntologyManager {

    private final int PORT;
    private final String PASSWORD;
    private final String USERNAME;
    private final String ADDRESS;
    private final String DBNAME;

    /**
     * Constructor of the BaseX ontology manager
     * @param address The address of the DB
     * @param port The port to use to connect to the DB
     * @param username The username to use to connect to the DB
     * @param password The password to use to connect to the DB
     * @param dbName The DB name
     */
    public BaseXOntologyManager(String address, int port, String username, String password, String dbName) { //,boolean connected) {
        this.ADDRESS = address;
        this.PORT = port;
        this.USERNAME = username;
        this.PASSWORD = password;
        this.DBNAME = dbName;
    }

    // Opens a new connection on every call; returns null if the server cannot be reached
    private BaseXClient getOpenConnection() {
        try {
            BaseXClient session = new BaseXClient(this.ADDRESS, this.PORT, this.USERNAME, this.PASSWORD);
            return session;
        } catch (IOException e) {
            return null;
        }
    }

    /**
     * Returns the subclasses of the element based on the ontology in the DB.
     * If direct is true, only the direct children are returned, otherwise all descendants.
     * @param keyName The name of the ontology to use
     * @param element The element to search for
     * @param direct true if you want only the direct children
     * @return A list of subclasses of the ontology
     * @throws Exception when element doesn't belong to the ontology
     */
    public ArrayList<String> getSubClasses(String keyName, String element, boolean direct) throws Exception {
        BaseXClient client = this.getOpenConnection();
        if (client != null) {
            if (!checkExistence(element))
                throw new Exception("notFound" + element);
            ArrayList<String> result = new ArrayList<String>();
            String input = null;
            if (direct == true) {
                input = "for $n in db:open('" + this.DBNAME + "', '/" + keyName + "')//*[parent::" + element + "] return $n/name()";
            } else {
                input = "for $n in db:open('" + this.DBNAME + "', '/" + keyName + "')//*[ancestor::" + element + "] return $n/name()";
            }
            try {
                final BaseXClient.Query query = client.query(input);
                while (query.more()) {
                    result.add(query.next());
                }
                query.close();
                result.add(element);
                return result;
            } catch (IOException e) {
                return null;
            } finally {
                try {
                    client.close();
                } catch (Exception e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
        } else {
            System.out.println("CLIENT NULL");
        }
        return null;
    }

    // ... (checkExistence and other methods omitted)
}
Thanks, Filippo
Hi Filippo,
Your queries may run faster if you rewrite them to use forward axes:

if (direct) {
    input = "db:open('" + DBNAME + "', '/" + keyName + "')/descendant::" + element + "/child::*/name()";
} else {
    input = "db:open('" + DBNAME + "', '/" + keyName + "')/descendant::" + element + "/descendant::*/name()";
}
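To see why the rewrite is safe, here is a minimal sketch that compares the reverse-axis and forward-axis forms on a tiny in-memory tree. It uses the XPath 1.0 engine bundled with the JDK rather than BaseX, so it only illustrates that both forms select the same nodes and says nothing about speed. The class name `AxisEquivalenceDemo` and the sample hierarchy are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AxisEquivalenceDemo {

    // Hypothetical sample hierarchy, not taken from the original database
    static final String SAMPLE =
        "<Thing><Animal><Dog><Puppy/></Dog><Cat/></Animal><Plant/></Thing>";

    /** Evaluates an XPath expression against SAMPLE and returns the names of the selected elements. */
    static List<String> names(String xpathExpr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                xpathExpr, new InputSource(new StringReader(SAMPLE)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeName());
            }
            return result;
        } catch (XPathExpressionException e) {
            // Wrapped so callers do not have to deal with a checked exception
            throw new IllegalStateException("Invalid XPath: " + xpathExpr, e);
        }
    }
}
```

Calling `AxisEquivalenceDemo.names("//*[parent::Animal]")` and `AxisEquivalenceDemo.names("/descendant::Animal/child::*")` should select the same two elements (Dog and Cat). The forward form lets the processor start from the matching `Animal` nodes instead of testing the parent of every element in the document.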
> For a tree with 7 paths this gives 7! = 5,040 permutations and therefore at least 7! * 4 = 20,160 queries (the query below) within 5 minutes.
XQuery is a very powerful language, so I’m more than sure that the problem could be solved much faster if all queries were combined in a single XQuery expression. However, it obviously takes some time to get to know the language.
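As one possible shape for such a combined expression, the sketch below builds a single XQuery string that looks up the children or descendants of many elements in one round trip. It is an illustration under assumptions, not a tested solution for the original problem: the class `BatchSubClassQuery`, its methods, and the one-string-per-element result format are all hypothetical, and element names are assumed to be plain names without `&` characters.

```java
import java.util.List;
import java.util.stream.Collectors;

final class BatchSubClassQuery {

    /** Quotes a value as an XQuery string literal, doubling embedded apostrophes. */
    static String literal(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /**
     * Builds one XQuery that iterates over all given element names.
     * Each result item is a space-separated string: the element name
     * followed by the names of its children (direct) or descendants.
     */
    static String build(String dbName, String keyName, List<String> elements, boolean direct) {
        String axis = direct ? "child" : "descendant";
        String names = elements.stream()
            .map(BatchSubClassQuery::literal)
            .collect(Collectors.joining(", "));
        return "let $root := db:open(" + literal(dbName) + ", " + literal("/" + keyName) + ")\n"
             + "for $name in (" + names + ")\n"
             + "let $hits := $root/descendant::*[name() = $name]/" + axis + "::*\n"
             + "return string-join(($name, $hits/name()), ' ')";
    }
}
```

The resulting string could be passed to `client.query(...)` exactly like the single-element queries above. The server then sees one query instead of thousands, at the cost of splitting each returned string on the client side.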
Hope this helps, Christian
Hi Christian, thanks for your help. Using a singleton structure for the client together with your query, performance improved a lot (from 120 min to 40 s for the whole process).
The suggestion for anyone else having this issue is, as always, to keep the session open until all the requests are done, and not to create a session for every query.
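A minimal sketch of that idea, written generically so it runs without the BaseX client on the classpath: a holder that creates the session on first use and hands the same instance to every caller until it is closed. `SharedSession` is a hypothetical name, and the session type is only assumed to be `AutoCloseable`. Because the `BaseXClient` constructor throws `IOException`, the factory passed in would need to wrap that in an unchecked exception.

```java
import java.util.function.Supplier;

final class SharedSession<T extends AutoCloseable> implements AutoCloseable {

    private final Supplier<T> factory;
    private T session;

    SharedSession(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the shared session, creating it on the first call only. */
    synchronized T get() {
        if (session == null) {
            session = factory.get();
        }
        return session;
    }

    /** Closes the underlying session once all requests are done. */
    @Override
    public synchronized void close() {
        if (session != null) {
            try {
                session.close();
            } catch (Exception e) {
                throw new IllegalStateException("Could not close session", e);
            } finally {
                session = null;
            }
        }
    }
}
```

With this in place, every call to `getSubClasses` would use `shared.get()` instead of opening a new connection, and `shared.close()` would be called once after the last permutation has been checked. The holder only synchronizes creation and closing; it does not make concurrent use of one session safe.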
Cheers, F.
On 23 Nov 2015, at 12:06, Christian Grün christian.gruen@gmail.com wrote:
basex-talk@mailman.uni-konstanz.de