Hi Javier,
Thanks for your mail.
It's currently not possible to directly access the position information that is internally used for computing the results. The reasons are manifold:
* The positions do not reflect the actual substring anymore. Instead, we enumerate all tokens that remain after normalizing the input (i.e., after the removal of stopwords, stemming, etc.). So, in practice, it is difficult to map that positional information back to the original input.
* The positions can stretch over several elements (for example, the following query yields true: <x>X<y/>Z</x> contains text "XZ").
* The data structures containing the positions can potentially consume lots of space, so they are usually discarded after the result is returned.
What would you like to do with the information? Maybe you have seen the ft:mark and ft:extract functions [1]; do they help at all?
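To illustrate, here is a minimal sketch of how ft:mark might be applied to your example. The sample sentence is invented, and the copy/modify wrapper is only there because ft:mark expects a database node, so a main-memory fragment has to be converted first. Note that the full-text expression is passed directly as the argument, since the position information is lost in later processing steps:

```xquery
(: Sketch only: the <sentence> content is made-up sample data. :)
(: copy/modify () turns the fragment into a main-memory database node. :)
copy $s := <sentence>DNA oxidation damages cells</sentence>
modify ()
(: The "contains text" expression must appear inside the ft:mark call. :)
return ft:mark($s[text() contains text { 'DNA', 'oxidation' }])
```

This should return the sentence with the matched tokens wrapped in <mark> elements. ft:extract takes the same kind of argument, but is meant to return shortened text extracts around the matches instead of the full node.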
Christian
[1] http://docs.basex.org/wiki/Full-Text_Module#ft:mark
On Wed, Nov 26, 2014 at 12:35 PM, Javier Couto javier.couto.fr@gmail.com wrote:
Hi,
Sorry if this is too basic, but I’m trying to get the positions of the matched tokens in a full-text query, and I can’t find the way to do it. I imagine something like:
for $sentence in //sentence
where $sentence[text() contains text { 'DNA', 'oxidation' }]
return <positions>{ ft:SOME-FUNCTION-FOR-TOKENS-POSITIONS($sentence[text() contains text { 'DNA', 'oxidation' }]) }</positions>
Is this possible?
Thank you in advance,
Javier