Awesome, thanks!

On Thu, Jan 24, 2019 at 21:52, Graydon <graydonish@gmail.com> wrote:
On Thu, Jan 24, 2019 at 09:41:18PM +0100, Xavier-Laurent SALVADOR scripsit:
> Hi List,
>
> I'm seeing a little problem I can't understand with a small 27M thesaurus
> database. I created all indexes.
> When using the '!=' operator to compare two lists, I get a quick and wrong
> result:
> -------
> *Case 1:*
> let $a := /thesaurus/entry/synonym/term
> let $b := /thesaurus/entry/term
> return $a[. = $b]
> ---> First result: "raccourcir"
The = and != operators compare the sequences.
= returns true if _any_ member of the left-hand sequence can be found in
the right-hand sequence.
('1','2','3') = ('1','asparagus','guillotine')
returns true().
Similarly, != returns true if _any_ member of the left-hand sequence
differs from _any_ member of the right-hand sequence.
('1','2','3') != ('1','asparagus','guillotine')
returns true().
!= is an exceedingly tricksy operator;
(1,2,3) != (1,2,3)
is true, since at least one pair of items (1 and 2, for instance) is unequal.
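To make the difference concrete, here is a small self-contained sketch using literal sequences only (nothing here depends on the thesaurus or on BaseX). It contrasts != with not(... = ...), which is usually what people mean by "not in the other list":

```xquery
(: General comparisons are existential: the result is true as soon as
   ANY pair of items, one from each side, satisfies the operator. :)
let $left  := (1, 2, 3)
let $right := (1, 2, 3)
return (
  $left = $right,          (: true: for example 1 = 1 :)
  $left != $right,         (: true: for example 1 != 2 :)
  not($left = $right),     (: false: at least one pair is equal :)
  $left[not(. = $right)]   (: empty: every item of $left occurs in $right :)
)
```

The last line is the filter form: inside the predicate, "." is a single item, so not(. = $right) really does mean "this item is not found anywhere in $right".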
> *Case 4:*
> let $a := /thesaurus/entry/synonym/term
> let $b := /thesaurus/entry/term
> return $a[not(.=$b)]
> -------
>
> It dies out of memory.
That's odd. That pattern generally works pretty well for me on much
larger datasets than 27 MB.
If you're trying to check that all your synonyms are terms, I'd try
let $a as xs:string+ := /thesaurus/entry/synonym/term/string()
let $b as xs:string+ := /thesaurus/entry/term/string()
return $a[not(. = $b)]
so you do the "I want the string value of this element" only once, or if
for some reason you need the elements later, maybe create maps and then
compare the map keys?
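A rough sketch of the map idea, assuming XQuery 3.1 maps are available. The inline $doc is made-up sample data standing in for the real thesaurus, and $terms is just an illustrative name:

```xquery
(: Made-up sample data with the same shape as the paths in the question. :)
let $doc := document {
  <thesaurus>
    <entry>
      <term>abréger</term>
      <synonym><term>raccourcir</term></synonym>
      <synonym><term>écourter</term></synonym>
    </entry>
    <entry>
      <term>raccourcir</term>
      <synonym><term>abréger</term></synonym>
    </entry>
  </thesaurus>
}
(: Build the lookup once: string value of each term -> the term element(s).
   'combine' keeps all elements when the same string occurs more than once. :)
let $terms := map:merge(
  $doc/thesaurus/entry/term ! map:entry(string(.), .),
  map { 'duplicates': 'combine' }
)
(: Keep the synonym elements whose string value is not a key in the map. :)
return $doc/thesaurus/entry/synonym/term[not(map:contains($terms, string(.)))]
```

With this sample data only the écourter synonym should come back, and since the map values are the term elements themselves, they stay available if you need them later.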
-- Graydon
Xavier-Laurent Salvador
Professeur Agrégé, Maître de Conférence HDR
ECC TTN 2018 22868H - équipe "Humanités Numériques"
Coordinateur du réseau international HiBHidEM
Université Paris 13 Sorbonne Paris Cité
99 avenue Jean-Baptiste Clément
93430 Villetaneuse
tel.: (+33) 06 51.65.84.38
website: http://www.biblehistoriale.fr
website: http://www.humanitesnumeriques.fr