One more reason to get 9.0.1 released soon ;)
On Thu, Apr 19, 2018 at 5:04 AM, Chuck Bearden cfbearden@gmail.com wrote:
Oh man, I overlooked that email. Thanks! I will give the latest snapshots a try. Chuck
On Wed, Apr 18, 2018 at 9:36 PM, Bridger Dyson-Smith bdysonsmith@gmail.com wrote:
Hi Chuck, Did you see the email from earlier today [1] regarding a Unicode bug? The latest development snapshot might address your issue (maybe! I'm not sure!).
HTH. Best, Bridger
[1] https://mailman.uni-konstanz.de/pipermail/basex-talk/2018-April/013071.html
On Wed, Apr 18, 2018, 9:18 PM Chuck Bearden cfbearden@gmail.com wrote:
Hi all,
I have loaded a large XML document (a dictionary of New Testament Greek) into its own database in BaseX 9.0. I'm using OpenJDK 1.8.0_162 under Ubuntu 16.04.4. I used the default Java parser, and I enabled the token & full text indices, but otherwise the database settings were the defaults.
When I execute a particular query, I'm getting a sequence of the character reference for the carriage return ('&#xD;') instead of some characters like the single & double daggers.
Here is some typical output:
<!--Page: 4 ; Entry: ἄγε|G33 -->
<entry n="ἄγε|G33"> <note type="occurrencesNT">2</note> <form> <orth>ἄγε</orth>,</form> <seg type="derivation">prop. imperat. of<ref> <foreign xml:lang="grc">ἄγω</foreign> </ref>,</seg> <sense> <gloss>come!</gloss>used as<gramGrp> <pos>adv.</pos> </gramGrp>and addressed, like<ref> <foreign xml:lang="grc">φέρε</foreign> </ref>, to one or more persons:<ref osisRef="Jas.4.13">Ja 4:13</ref>,<ref
osisRef="Jas.5.1">5:1</ref>
&#xD;
&#xD;
&#xD;
&#xD;
&#xD;
&#xD;
&#xD;
&#xD;
&#xD;
&#xD;
&#xD;
 </sense>
</entry>
The sequence of '&#xD;' char refs replaces a dagger '†' (U+2020). I wonder if I'm doing something wrong, or if I've happened on a bug.
Here is my XQuery:
declare namespace tei = "http://www.crosswire.org/2013/TEIOSIS/namespace";
declare namespace xsi = "http://www.w3.org/2001/XMLSchema-instance";

let $coll := collection('abbott-smith.tei'),
    $elems := $coll//tei:seg[parent::tei:entry]
return
  <div xmlns="http://www.crosswire.org/2013/TEIOSIS/namespace" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">{
    for $elem in $elems
    let $entry := $elem/parent::tei:entry,
        $entry_lemma := $entry/@n/data(),
        $page_num := $elem/preceding::tei:pb[1]/@n/data(),
        $comment_text := concat('Page: ', $page_num, ' ; Entry: ', $entry_lemma, ' ')
    return (
      comment {$comment_text},
      $entry
    )
  }</div>
The source XML document is here:
https://github.com/translatable-exegetical-tools/Abbott-Smith/blob/master/ab...
Thanks in advance for your guidance.
All the best, Charles Bearden