Re: [basex-talk] how to formulate an excluding xquery?

27 Jun 2012


Hi Christian,
Quoting Christian Grün christian.gruen@gmail.com:
...
//*[text() contains text "A" ftand ftnot 'C']
Thanks, this seems to work. However, I encountered strange behavior,  
which is probably related to mixed content.
Given this document:
<doc>
<p>1 Ich fresse Dich mit Haut und Haar <pb/> und allem drum und dran.</p>
<p>2 Ich fresse Dich mit Haut und <pb/> Haar und allem drum und dran.</p>
<p>3 Ich fresse Dich mit Haut und Fell und allem drum und dran.</p>
<p>4 Ich fresse Dich mit Haut und Pelz und allem drum und dran.</p>
<p>5 Ich werde Dich mit Haut und Haar <pb/> und allem drum und dran  
fressen.</p>
<p>6 Du kannst mich mit Haut und Haar und allem drum und dran fressen.</p>
</doc>
I created a collection from this document with whitespace chopping OFF
and stemming for German ON, and then ran these queries:
(1) //*[text() contains text ("Haut" ftand "fressen") using stemming using language "de"]
(2) //*[text() contains text ("Haut" ftand "fressen" ftand ftnot "Haar") using stemming using language "de"]
(1) should return all <p> nodes, but does not return 5.
(2) should return 1, 3, and 4, but returns 2, 3, and 4 instead.
Is it correct that, when looking into a node, only the text _before_
any other node is handled, i.e. for the first <p> node only up to
"Haar", for the second one only up to "und", and for the fifth one
only up to "Haar"?
So is everything that follows a child node inside a particular node
ignored? TEI documents contain many nodes such as page breaks or line
breaks (which carry no relevant text, only rendering information), so
this is rather irritating: there seems to be no way to search the
whole text of a paragraph or line node.
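To make the suspicion concrete, here is a minimal sketch that inlines
paragraph 5 instead of using the collection. It rests on the assumption
that text() yields one item per text node, so a paragraph interrupted
by <pb/> is matched in separate pieces. The last expression, which
matches against the <p> node itself via ".", is only a guess at what
would cover the complete paragraph; I have not verified its result:

```xquery
(: Paragraph 5 from the sample document, inlined so the sketch is self-contained. :)
let $p :=
  <p>5 Ich werde Dich mit Haut und Haar <pb/> und allem drum und dran fressen.</p>
return (
  (: How many text nodes does text() select? The <pb/> splits the content. :)
  count($p/text()),
  (: Query (1) as posted: each text node is tested on its own. :)
  $p[text() contains text ("Haut" ftand "fressen") using stemming using language "de"],
  (: Assumption: testing the string value of <p> spans both text nodes. :)
  $p[. contains text ("Haut" ftand "fressen") using stemming using language "de"]
)
```

If the count is greater than 1, that alone might explain why "Haut"
and "fressen" are never found together in paragraph 5.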
Best regards
Cerstin
-- 
Dr. phil. Cerstin Mahlow

Universität Basel
Departement Sprach- und Literaturwissenschaften
Fachbereich Deutsche Sprach- und Literaturwissenschaft
Nadelberg 4
4051 Basel
Schweiz

Tel:  +41 61 267 07 65
Fax: +41 61 267 34 40
Mail: cerstin.mahlow@unibas.ch
Web: http://www.oldphras.net

