Hi Martin,
I must confess I didn't check all the details in your mail, and I haven't come across anything like SIGPIPE errors before, but I would be interested to hear if the problem also occurs...
a) with a single thread, or b) if you don't reuse existing sessions?
Thanks in advance, Christian
On Wed, Aug 5, 2015 at 10:08 AM, Martin <mar@centrum.cz> wrote:
Hi,
I am having difficulties populating a BASEX database. I have a large number of XML files (about half a million, ranging in size from several kilobytes up to hundreds of kilobytes).
I use the BASEX Java API and, for each file, ultimately call org.basex.core.cmd.Add.
The files conform to 22 different types (22 XSD definitions), so I am importing them into 22 different databases in a single BASEX server.
I have plenty of RAM and CPU power, and I monitor both processes (the BASEX server and my client program) from within JVisualVM. The JVM reaches the CPU limits, but RAM is never exhausted.
Before importing, I need to enhance the XML data with some additional information taken from an SQL database.
I have written a multithreaded Groovy program that uses the BASEX Java API and makes heavy use of the GPars library. Simply put, the program:
- has several producer threads -- each producer reads a given portion of the database and provides the additional information
- has several consumer threads -- each consumer takes the original file, wraps it with the additional information and finally calls the org.basex.core.cmd.Add command.
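The producer/consumer layout described above can be sketched roughly as follows. This is a minimal sketch in plain Java using only the standard library, not the actual Groovy/GPars program, which is not shown in this mail. The names ImportPipeline, enrich and sink are hypothetical: enrich stands in for the SQL lookup and wrapping step, and sink stands in for the call to org.basex.core.cmd.Add. A failing sink call is logged and not counted, so the return value shows how many documents actually got through.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class ImportPipeline {
    // Hypothetical stand-in for the SQL lookup and the wrapping of the original file.
    static String enrich(String id) {
        return "<doc id=\"" + id + "\"/>";
    }

    // Runs the given number of producer and consumer threads and returns
    // how many documents the sink accepted without throwing.
    static int run(List<String> ids, int producers, int consumers, Consumer<String> sink) {
        // Bounded queue: producers block when the consumers fall behind.
        BlockingQueue<Optional<String>> queue = new LinkedBlockingQueue<>(100);
        ExecutorService pool = Executors.newFixedThreadPool(producers + consumers);
        AtomicInteger consumed = new AtomicInteger();
        CountDownLatch producersDone = new CountDownLatch(producers);
        try {
            for (int p = 0; p < producers; p++) {
                final int offset = p;
                pool.execute(() -> {
                    try {
                        // Each producer handles every producers-th id, starting at its offset.
                        for (int i = offset; i < ids.size(); i += producers) {
                            queue.put(Optional.of(enrich(ids.get(i))));
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        producersDone.countDown();
                    }
                });
            }
            for (int c = 0; c < consumers; c++) {
                pool.execute(() -> {
                    try {
                        // An empty Optional is the stop marker for a consumer.
                        for (Optional<String> item = queue.take(); item.isPresent(); item = queue.take()) {
                            try {
                                sink.accept(item.get());
                                consumed.incrementAndGet();
                            } catch (RuntimeException e) {
                                System.err.println("unable to consume: " + e);
                            }
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            producersDone.await();
            // One stop marker per consumer, sent only after all producers have finished.
            for (int c = 0; c < consumers; c++) {
                queue.put(Optional.empty());
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return consumed.get();
    }

    public static void main(String[] args) {
        int n = run(List.of("ISPOP_1", "ISPOP_2", "ISPOP_3"), 2, 2,
                doc -> System.out.println("add " + doc));
        System.out.println("consumed " + n);
    }
}
```

Comparing the returned count with the number of input ids is one simple way to notice that a document went missing during a long run.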
Tests with less data (up to several thousand files) give good results -- no data is lost, and both the BASEX server and my client program behave as they should.
Unfortunately, when I try to import all of the files, the program starts fine, but once it gets "warm" I get SIGPIPE errors in the log from time to time (as I said, there is plenty of RAM and CPU available); please see the attachment.
Comments on the picture:
- I am adding the document with ID ISPOP_166007 -- this ID is indeed missing in the final database
- the add itself is just a simple call to Add:
    Closure add = { session ->
        def cmd = new org.basex.core.cmd.Add(dsn, enhancedXml)
        session.execute(cmd)
    }
- I am reusing the session; the session is bound to the current thread and is never closed until the thread (consumer) finishes
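Binding one session to each thread, as described above, might be sketched like this. It is a minimal sketch in plain Java under stated assumptions, not the actual BasexSessionRegistry code: only the method name withThreadBoundSession is taken from the stack trace in this mail, while the class ThreadBoundSessions, the methods closeCurrent and openSessions, and the factory parameter are hypothetical. In the real program the factory would open a BASEX client session; here any AutoCloseable works.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

class ThreadBoundSessions<S extends AutoCloseable> {
    private final Supplier<S> factory;                          // opens a new session (assumption)
    private final ThreadLocal<S> current = new ThreadLocal<>(); // one session per thread
    private final Set<S> open = ConcurrentHashMap.newKeySet();  // all sessions not yet closed

    ThreadBoundSessions(Supplier<S> factory) {
        this.factory = factory;
    }

    // Runs the action with the calling thread's session, opening one on first use.
    <R> R withThreadBoundSession(Function<S, R> action) {
        S session = current.get();
        if (session == null) {
            session = factory.get();
            current.set(session);
            open.add(session);
        }
        return action.apply(session);
    }

    // Closes the calling thread's session so that the next call opens a fresh one.
    void closeCurrent() {
        S session = current.get();
        if (session != null) {
            current.remove();
            open.remove(session);
            try {
                session.close();
            } catch (Exception e) {
                System.err.println("close failed: " + e);
            }
        }
    }

    int openSessions() {
        return open.size();
    }

    public static void main(String[] args) {
        ThreadBoundSessions<AutoCloseable> registry =
                new ThreadBoundSessions<>(() -> new AutoCloseable() {
                    @Override public void close() { }
                });
        AutoCloseable first = registry.withThreadBoundSession(s -> s);
        AutoCloseable second = registry.withThreadBoundSession(s -> s);
        System.out.println("reused on same thread: " + (first == second));
    }
}
```

Calling closeCurrent after every add would give a run without session reuse, and a single consumer thread gives the single-threaded case, which makes it easy to compare those variants against the current setup.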
There is nothing wrong in the BASEX server log; other documents are added just fine, and there is no trace of document ISPOP_166007.
Just for reference, the complete stack trace follows:
ERROR basex.support.AddResourcesSupport - unable to consume ISPOP_166007
java.net.SocketException: Broken pipe (SIGPIPE)
    at java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.8.0_45]
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) ~[na:1.8.0_45]
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153) ~[na:1.8.0_45]
    at org.basex.io.out.BufferOutput.flush(BufferOutput.java:60) ~[basex-8.2.jar!/:8.2]
    at org.basex.io.out.BufferOutput.write(BufferOutput.java:54) ~[basex-8.2.jar!/:8.2]
    at org.basex.io.out.PrintOutput.write(PrintOutput.java:66) ~[basex-8.2.jar!/:8.2]
    at java.io.OutputStream.write(OutputStream.java:116) ~[na:1.8.0_45]
    at java.io.OutputStream.write(OutputStream.java:75) ~[na:1.8.0_45]
    at org.basex.api.client.ClientSession.send(ClientSession.java:238) ~[basex-8.2.jar!/:8.2]
    at org.basex.api.client.ClientSession.execute(ClientSession.java:160) ~[basex-8.2.jar!/:8.2]
    at org.basex.api.client.ClientSession.execute(ClientSession.java:167) ~[basex-8.2.jar!/:8.2]
    at org.basex.api.client.Session.execute(Session.java:36) ~[basex-8.2.jar!/:8.2]
    at org.basex.api.client.Session$execute.call(Unknown Source) ~[na:na]
    at basex.support.AddResourcesSupport$_consume_closure9$_closure17.doCall(AddResourcesSupport.groovy:255) ~[basex-1.0.jar!/:na]
    at sun.reflect.GeneratedMethodAccessor368.invoke(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_45]
    at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_45]
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93) [groovy-2.4.4.jar!/:2.4.4]
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325) [groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294) [groovy-2.4.4.jar!/:2.4.4]
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1019) [groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42) [groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125) [groovy-2.4.4.jar!/:2.4.4]
    at basex.BasexSessionRegistry.withThreadBoundSession(BasexSessionRegistry.groovy:79) ~[basex-1.0.jar!/:na]
    at basex.BasexSessionRegistry$withThreadBoundSession$0.call(Unknown Source) ~[na:na]
    at basex.support.AddResourcesSupport$_consume_closure9.doCall(AddResourcesSupport.groovy:257) ~[basex-1.0.jar!/:na]
    at sun.reflect.GeneratedMethodAccessor327.invoke(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_45]
    at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_45]
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93) [groovy-2.4.4.jar!/:2.4.4]
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325) [groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294) [groovy-2.4.4.jar!/:2.4.4]
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1019) [groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:42) [groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.callsite.BooleanReturningMethodInvoker.invoke(BooleanReturningMethodInvoker.java:51) ~[groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.callsite.BooleanClosureWrapper.call(BooleanClosureWrapper.java:53) ~[groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.DefaultGroovyMethods.find(DefaultGroovyMethods.java:3908) ~[groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.dgm$191.invoke(Unknown Source) ~[groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274) ~[groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56) ~[groovy-2.4.4.jar!/:2.4.4]
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125) [groovy-2.4.4.jar!/:2.4.4]
    at basex.support.AddResourcesSupport.consume(AddResourcesSupport.groovy:251) ~[basex-1.0.jar!/:na]
Best Regards, Martin