On 02-03-2020 at 13:27, Christian Grün wrote:
Hi Ben,
Here is an alternative version that, as I believe, should match your requirements better:
  let $words := distinct-values(
    for $text in db:open('Incidents')/csv/record/INC_RM
    return ft:tokenize($text)
  )
  let $stopwords := db:open('Stopwords')/text/line
  let $result := $words[not(. = $stopwords)]
  return sort($result)
There is no need to remove nbsp substrings as they’ll never occur in your input, and the ft:tokenize function will ensure that your input (case, special characters, diacritics) will be normalized (see [1,2] for more details). Using functx is perfectly valid; I only removed the reference to make the code a bit shorter.
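To illustrate the normalization step (a small sketch; the exact output depends on the default full-text options of your BaseX instance), ft:tokenize lowercases its input, strips punctuation, and, with the default diacritics setting, folds accented characters:

```xquery
(: Tokenization normalizes case and special characters.
   With default options, accented letters are folded as well. :)
ft:tokenize('Café, CAFÉ and café!')
(: yields one lowercase token per word, e.g. 'cafe' three times and 'and' :)
```

This is why the distinct-values/stopword comparison above works without any manual lowercasing or character cleanup.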
Hope this helps, Christian
[1] http://docs.basex.org/wiki/Full-Text_Module#ft:tokenize [2] http://docs.basex.org/wiki/Full-Text
Hi Christian,
Since my primary goal at this moment is to see how BaseX/XQuery can be used for full-text analysis (and to compare the results, or the effort needed, with similar tasks in R), I am very glad that you brought the ft:tokenize() function to my attention!
Ben
PS: Just for fun, I created a repository with this tiny function:

  declare function tidyTM:wordFreqs(
    $Words as xs:string*
  ) {
    for $w in $Words
    let $f := $w
    group by $f
    order by count($w) descending
    return ($f, count($w))
  };
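A quick usage sketch (assuming the function is available under the tidyTM prefix as above); the function returns an alternating sequence of word and count, most frequent words first:

```xquery
(: Count word frequencies in a small token sequence.
   'to' and 'be' occur twice each, 'or' and 'not' once each,
   so the two-count pairs come before the one-count pairs. :)
tidyTM:wordFreqs(('to', 'be', 'or', 'not', 'to', 'be'))
```

Note that ties (words with the same count) may come back in any order, since the order by clause only sorts on the count.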
It took less than 10 minutes to create a repository and populate it with this function. Creating an R package takes much longer!