I've just installed BaseX and started using it, and I wonder whether the
following kind of query is possible with it.
I have a corpus in XML format, where <s> marks a sentence and <w> marks a word:
<s n="6"><w c5="PNP" hw="she" pos="PRON">She </w><w c5="VVD" hw="say" pos="VERB">said
</w><w c5="PNP" hw="she" pos="PRON">she </w><w c5="VVD" hw="go" pos="VERB">went </w>
<w c5="TO0" hw="to" pos="PREP">to </w><w c5="VVI" hw="buy" pos="VERB">buy </w>
<w c5="PNI" hw="something" pos="PRON">something </w><unclear/>
<w c5="PNX" hw="herself" pos="PRON">herself</w><c c5="PUN">, </c>...</s>
If you create a BaseX database with the full-text option enabled, you can
extract sentences in which word A and word B appear within a designated
number of words of each other. For example, the following query
will extract the sentence above:
//s[. contains text 'buy' ftand 'herself' window 5 words]
So this is my question: is there a way to extract all the sentences in
which a word of a particular part of speech (for example, any verb)
and another specific word appear within a designated number of words
of each other, such as any verb and "herself" within a window of 5 words?
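To make the intended matching concrete, here is a minimal Python sketch of the result I am after. It is not a BaseX query and uses only the standard library. It rests on several assumptions of mine: the window counts <w> elements only (punctuation in <c> is ignored), the specific word is matched on the hw attribute, the part of speech on the pos attribute, and "within a window of 5 words" means both words fit inside 5 consecutive <w> elements. The function name has_pos_near_word is hypothetical.

```python
import xml.etree.ElementTree as ET

# The sample sentence from this post, joined onto one logical string.
SAMPLE = (
    '<s n="6"><w c5="PNP" hw="she" pos="PRON">She </w>'
    '<w c5="VVD" hw="say" pos="VERB">said </w>'
    '<w c5="PNP" hw="she" pos="PRON">she </w>'
    '<w c5="VVD" hw="go" pos="VERB">went </w>'
    '<w c5="TO0" hw="to" pos="PREP">to </w>'
    '<w c5="VVI" hw="buy" pos="VERB">buy </w>'
    '<w c5="PNI" hw="something" pos="PRON">something </w><unclear/>'
    '<w c5="PNX" hw="herself" pos="PRON">herself</w>'
    '<c c5="PUN">, </c>...</s>'
)


def has_pos_near_word(sentence, pos, headword, window):
    """Return True if some <w> with the given pos attribute and some <w>
    with the given hw attribute both fall inside `window` consecutive words.

    Assumption: only <w> elements count as words; <c> and <unclear/> do not.
    """
    words = list(sentence.iter("w"))  # all <w> descendants, document order
    targets = [i for i, w in enumerate(words) if w.get("hw") == headword]
    tagged = [i for i, w in enumerate(words) if w.get("pos") == pos]
    # Two distinct positions fit in a window of N words when they are
    # fewer than N positions apart.
    return any(i != j and abs(i - j) < window for i in targets for j in tagged)


sentence = ET.fromstring(SAMPLE)
print(has_pos_near_word(sentence, "VERB", "herself", 5))  # prints True
```

In the sample, "buy" is two words before "herself", so the sentence matches for a verb within a window of 5 but would not match within a window of 2.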
Thank you in advance.
Best,
Sam, A.