Hi Xavier,
The memory usage may be caused by an internal cache that is created to speed up your query. Out of interest:
1. Did the problem persist with Graydon’s alternative (and the standard of 2 GB RAM)?
let $a as xs:string+ := /thesaurus/entry/synonym/term/string()
let $b as xs:string+ := /thesaurus/entry/term/string()
return $a[not(. = $b)]
2. If yes, can you successfully run the following query?
let $b := /thesaurus/entry/term
return /thesaurus/entry/synonym/term[not(. = $b)]
If yes, how many thesaurus terms are stored in your database?
count(/thesaurus/entry/term)
Best, Christian
On Fri, Jan 25, 2019 at 9:01 AM Xavier-Laurent SALVADOR xavierlaurent.salvador@gmail.com wrote:
Hi,
I checked it overnight: same symptom. Then I realized I had just downloaded a fresh BaseX 9.1.2 onto my good old, completely full hard drive while using this parameter: BASEX_JVM="-Xmx2g $BASEX_JVM"... I moved it to an external drive, and ... shazam! So it was a hardware problem. Thanks a lot!
Xavier-Laurent Salvador
On Thu, Jan 24, 2019 at 10:10 PM Xavier-Laurent SALVADOR xavierlaurent.salvador@gmail.com wrote:
Awesome, thanks!
On Thu, Jan 24, 2019 at 9:52 PM Graydon graydonish@gmail.com wrote:
On Thu, Jan 24, 2019 at 09:41:18PM +0100, Xavier-Laurent SALVADOR scripsit:
Hi List,
I'm seeing a little problem I can't understand with a small 27 MB thesaurus database. I created all indexes. When using the '!=' operator to compare two lists, I get a quick and wrong result:
*Case 1:*
let $a := /thesaurus/entry/synonym/term
let $b := /thesaurus/entry/term
return $a[. = $b]
---> First result: "raccourcir"
The = and != operators compare the sequences.
= returns true if _any_ member of the left-hand sequence can be found in the right-hand sequence.
('1','2','3') = ('1','asparagus','guillotine')
returns true().
Similarly, != returns true if _any_ member of the left-hand sequence is unequal to _any_ member of the right-hand sequence. It does NOT test whether a member is missing from the other sequence.
('1','2','3') != ('1','asparagus','guillotine')
returns true().
!= is an exceedingly tricksy operator;
(1,2,3) != (1,2,3)
is true.
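To make the pairwise behaviour concrete, here is a minimal sketch that can be pasted into any XQuery processor. The literal values are illustrative only; the point is that both operators succeed as soon as a single pair of items satisfies the comparison, which is why not(. = $b) rather than . != $b expresses "is not in the list".

```xquery
(: General comparisons test every pair of items, one from each side. :)
(
  (: true: the pair ('1', '1') is equal :)
  ('1', '2', '3') = ('1', 'asparagus', 'guillotine'),

  (: true: the pair ('1', 'asparagus') is unequal :)
  ('1', '2', '3') != ('1', 'asparagus', 'guillotine'),

  (: true: the pair (1, 2) is unequal, even though the sequences are identical :)
  (1, 2, 3) != (1, 2, 3),

  (: false: every possible pair is (1, 1) :)
  (1, 1) != (1, 1),

  (: false: an equal pair exists, so the negation fails :)
  not((1, 2, 3) = (1, 2, 3))
)
```

The query returns the sequence true, true, true, false, false.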
*Case 4:*
let $a := /thesaurus/entry/synonym/term
let $b := /thesaurus/entry/term
return $a[not(. = $b)]
It dies out of memory.
That's odd. That pattern generally works pretty well for me on much larger datasets than 27 MB.
If you're trying to check that all your synonyms are terms, I'd try
let $a as xs:string+ := /thesaurus/entry/synonym/term/string() let $b as xs:string+ := /thesaurus/entry/term/string()
return $a[not(. = $b)]
so you do the "I want the string value of this element" only once, or if for some reason you need the elements later, maybe create maps and then compare the map keys?
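The map idea could look roughly like the sketch below. It is only one possible reading of the suggestion: the inline sample document is hypothetical (in the thread the data lives in a database), and keying the map by each term's string value is an assumption. It relies on the standard XQuery 3.1 map functions map:merge, map:entry and map:contains.

```xquery
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

(: Hypothetical sample data standing in for the thesaurus database. :)
let $doc := document {
  <thesaurus>
    <entry>
      <term>couper</term>
      <synonym><term>raccourcir</term><term>trancher</term></synonym>
    </entry>
    <entry><term>raccourcir</term></entry>
  </thesaurus>
}
(: Key every main term by its string value; keep the first on duplicates. :)
let $terms := map:merge(
  for $t in $doc/thesaurus/entry/term
  return map:entry(string($t), $t),
  map { 'duplicates': 'use-first' }
)
(: Keep the synonym elements whose string value is not a key of the map. :)
return $doc/thesaurus/entry/synonym/term[not(map:contains($terms, string(.)))]
```

With this sample data the query returns the single element <term>trancher</term>, and because the result is the element itself rather than its string, it can still be navigated afterwards.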
-- Graydon
-- Xavier-Laurent Salvador Professeur Agrégé, Maître de Conférence HDR ECC TTN 2018 22868H - équipe "Humanités Numériques" Coordinateur du réseau international HiBHidEM http://ttn.univ-paris13.fr Université Paris 13 Sorbonne Paris Cité 99 avenue Jean-Baptiste Clément 93430 Villetaneuse tél. : (+33) 06 51.65.84.38 email : xavier-laurent.salvador@univ-paris13.fr site web: http://www.biblehistoriale.fr site web: http://www.humanitesnumeriques.fr