Hi Christian,
Yes, it helps, thank you! I will try this approach. Two last questions:
1. Does the ft:tokenize function tokenize on the fly, or are the tokens stored in the full-text index? It seems that they are stored for the whole document, but are they also stored for each text element? I'm wondering if I can speed up performance by pre-computing, for each sentence, its tokenized version and storing it in the database (see the first sketch below).
2. I guess that if I search for something like { "DNA", "oxidation" }, I need to compute the distance for each term using index-of, don't I? (See the second sketch below.)
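To make the first question concrete, this is roughly what I mean by pre-computing: an untested sketch that stores each sentence's tokens next to the sentence using XQuery Update. The database name 'articles' and the <tokens> element are just placeholders I made up for illustration:

(: Pre-compute and store the tokenized form of every sentence.
   Assumes a database 'articles' containing <sentence> elements. :)
for $sentence in db:open('articles')//sentence
let $tokens := ft:tokenize($sentence)
return insert node
  <tokens>{ string-join($tokens, ' ') }</tokens>
  into $sentence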
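And for the second question, this is the index-of approach I have in mind for two terms (again an untested sketch, reusing the hypothetical 'articles' database): the distance is the difference between the first occurrences of the two tokens.

(: Rank sentences by the token distance between two search terms. :)
let $terms := (ft:tokenize('DNA'), ft:tokenize('oxidation'))
for $sentence in db:open('articles')//sentence
let $tokens := ft:tokenize($sentence)
(: first token position of each term; empty if a term is missing :)
let $positions :=
  for $term in $terms
  return index-of($tokens, $term)[1]
(: keep only sentences that contain both terms :)
where count($positions) = count($terms)
order by abs($positions[1] - $positions[2])
return $sentence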
Best,
Javier
On 26/11/2014, at 16:18, Christian Grün christian.gruen@gmail.com wrote:
Hi Javier,
One function you could try is ft:tokenize. Please have a look at the attached example.
Hope this helps,
Christian
________________________________________
let $term := ft:tokenize('DNA')
for $sentence in <sentences>
  <sentence id="1.1.122.1.122">The translated protein showed weak DNA binding with a specificity for the kappa B binding motif.</sentence>
  <sentence id="54.1.5.1.698">Using this assay system, we have evaluated the contributions of ligand binding and heat activation to DNA binding by these glucocorticoid receptors.</sentence>
  <sentence id="2.1.17.1.79">2.5 Mesocosm DNA extraction and purification</sentence>
</sentences>/sentence
order by index-of(ft:tokenize($sentence), $term)[1]
return $sentence