I'm deeply impressed... When do you do all this stuff? :)
--
Rupert Jung
<pagina> GmbH
Complete production of scholarly works
Herrenberger Str. 51
D-72070 Tübingen
Commercial Register Stuttgart HRB 380249
Managing Director: Tobias Ott
E-Mail: rupert.jung@pagina-tuebingen.de
Phone: (0 70 71) 98 76-37
Fax: (0 70 71) 98 76-22
http://www.pagina-online.de
-----Original Message-----
From: Christian Grün [mailto:christian.gruen@gmail.com]
Sent: Tuesday, 29 March 2011 19:49
To: rupert.jung@pagina-tuebingen.de
Cc: Andreas Weiler; bjoern.duenckel@pagina-tuebingen.de;
basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Out of memory
Hi Rupert,
nice to hear that! Btw, I've added a small optimization for mixed
location paths (such as the one with the name() function after the
location steps), which should perform the discussed rewritings
on the fly:
http://files.basex.org/releases/latest/
I recommend switching to the latest snapshot anyway, as it fixes
some important index rewritings on collections, which had been
temporarily removed in Version 6.6.
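
As a minimal sketch (not taken from the BaseX sources), the following
query runs the three variants discussed in this thread against a tiny
inline document. The sample document and all element names except
randnr are made up for illustration; a "for" clause is used where
name() has to be applied to several nodes, because fn:name accepts at
most one node:

```xquery
(: Hypothetical sample data: only the element name randnr comes from
   the thread, everything else is an assumption for illustration. :)
let $doc := document {
  <book>
    <chapter><randnr>1</randnr><para/></chapter>
    <chapter><randnr>2</randnr></chapter>
    <section><note><randnr>3</randnr></note></section>
  </book>
}
return (
  (: 1. explicit descendant step, name() as the last path step :)
  distinct-values($doc/descendant::*[randnr]/name()),

  (: 2. evaluate the location path first, then map name() over it :)
  distinct-values(for $e in $doc//*[randnr] return name($e)),

  (: 3. address the randnr nodes directly and step up to the parent :)
  distinct-values(for $p in $doc/descendant::randnr/.. return name($p))
)
```

Each variant should yield the two names chapter and note; the order of
the values returned by distinct-values is implementation-dependent.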
More feedback is welcome,
Christian
____________________________
> Hello Christian!
>
> You're my personal hero! The query
> "distinct-values(/descendant::*[randnr]/name())" worked perfectly! 2 GB of
> data analysed in 3 seconds, WOW!
>
>
> Greetings and thanks again!
> Rupert Jung
>
> -----Original Message-----
> From: Christian Grün [mailto:christian.gruen@gmail.com]
> Sent: Tuesday, 29 March 2011 18:51
> To: rupert.jung@pagina-tuebingen.de
> Cc: Andreas Weiler; bjoern.duenckel@pagina-tuebingen.de;
> basex-talk@mailman.uni-konstanz.de
> Subject: Re: [basex-talk] Out of memory
>
> Rupert,
>
> thanks for your observation. My assumption is that the (hidden)
> descendant-or-self step in your query causes a huge number of
> intermediary nodes, which are then reduced to a small result set. In
> other words, your query..
>
> distinct-values(//*[randnr]/name())
>
> ..equals the following query:
>
> distinct-values(/descendant-or-self::node()/child::*[child::randnr]/name())
>
> There are several ways to speed up your query; please try, for
> example, to:
>
> 1. explicitly use the descendant step:
> distinct-values(/descendant::*[randnr]/name())
> 2. wrap the name function around the location path (one node at a
> time, as name() accepts at most one node):
> distinct-values( for $n in //*[randnr] return name($n) )
> 3. directly address the randnr nodes and use a parent step:
> distinct-values( for $n in /descendant::randnr/.. return name($n) )
>
> We might add some optimizations to BaseX to automate some of the
> proposed steps.
>
> If this doesn't help, feel free to give us more feedback.
>
> Best,
> Christian
> ___________________________
>
> On Tue, Mar 29, 2011 at 5:31 PM, Rupert Jung
> <rupert.jung@pagina-tuebingen.de> wrote:
>> Hi Andreas, and thanks for your answer,
>>
>> unfortunately that didn't work (java.exe consumed 1.2 GB, then stopped).
>> The expected result should not be longer than about 10 element names or
>> so...
>>
>> Maybe this could be a bug in BaseX itself?
>>
>> From: Andreas Weiler [mailto:andreas.weiler@uni-konstanz.de]
>> Sent: Tuesday, 29 March 2011 16:25
>> To: rupert.jung@pagina-tuebingen.de
>> Cc: basex-talk@mailman.uni-konstanz.de;
>> bjoern.duenckel@pagina-tuebingen.de
>> Subject: Re: [basex-talk] Out of memory
>>
>>
>>
>> Hi,
>>
>>
>>
>> as a first hint, you could start BaseX with Java's -Xmx flag, which
>> raises the maximum heap size:
>>
>>
>>
>> java -cp BaseX.jar -Xmx1G org.basex.BaseXGUI
>>
>>
>>
>> Probably that will solve this issue.
>>
>>
>>
>> Kind regards,
>>
>> Andreas
>>
>>
>>
>> On 29.03.2011 at 16:14, Rupert Jung wrote:
>>
>> Hi there,
>>
>>
>>
>> I'm currently running some tests with BaseX and a mid-sized database
>> (around 2 GB).
>>
>> I wonder why I'm not able to process this XQuery statement:
>>
>> distinct-values(//*[randnr]/name())
>>
>> (Give me a list of all elements which have a child element <randnr> and
>> remove all duplicate entries)
>>
>> After about 10 seconds I got an out of main memory error. What's really
>> strange about this: processing the nodes themselves with //*[randnr]
>> works like a charm (but gives me a HUGE amount of text and is not really
>> useful for me at all).
>>
>>
>>
>> My system: win7-x64, 4 GB RAM, Java 1.6.0_21
>>
>>
>>
>> Thank you in advance,
>>
>> Rupert Jung
>>
>> _______________________________________________
>> BaseX-Talk mailing list
>> BaseX-Talk@mailman.uni-konstanz.de
>> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk