One last question. The serializer has trouble with tags that contain underscores. For example, I get the following error:

    [bxerr:BXCS0002] CSV serializer: Invalid element name <study_id>.

When I remove the underscore, the error goes away (I inject these tags in the result clause).

Underscores should be fine in XML node names (e.g., clinicaltrials.gov uses them extensively).

Is this issue related to the XQuery specification or unique to BaseX?

Thanks,
Ron

On October 28, 2015 at 12:59:57 PM, Ron Katriel (rkatriel@mdsol.com) wrote:

Christian,

That works. Thanks!

Ron

On October 28, 2015 at 12:49:06 PM, Christian Grün (christian.gruen@gmail.com) wrote:

I guess you'll simply have to use file:write-text instead of
file:write (which serializes texts with the default XML output method
[1]).

[1] http://www.w3.org/TR/xslt-xquery-serialization-31/
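
A minimal sketch of that change, assuming a small inline <csv> document in
place of the CTGov database (the element names and the reduced option map
are illustrative and not taken from the attached file):

    (: csv:serialize already returns a string; file:write-text writes that
       string as-is instead of passing it through the XML output method :)
    let $options := map { 'separator': 'tab', 'header': true() }
    let $rows :=
      <csv>
        <record>
          <condition>Cardiovascular System Diseases (&amp; [Cardiac])</condition>
        </record>
      </csv>
    return file:write-text('conditions.tsv', csv:serialize($rows, $options))

With file:write, the string returned by csv:serialize is serialized a second
time as XML, which is where the ampersand gets escaped again.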


On Wed, Oct 28, 2015 at 5:42 PM, Ron Katriel <rkatriel@mdsol.com> wrote:
> Hi Christian,
>
> You can use the attached XML file to populate a database. I verified that it
> is sufficient to replicate the issue.
>
> Thanks,
> Ron
>
>
> On October 28, 2015 at 12:33:14 PM, Christian Grün
> (christian.gruen@gmail.com) wrote:
>
> Hi Ron,
>
> I don't have the CTGov database on my machine… Could you build us a
> little self-contained example?
>
> Thanks in advance,
> Christian
>
>
> On Wed, Oct 28, 2015 at 5:31 PM, Ron Katriel <rkatriel@mdsol.com> wrote:
>> Hi,
>>
>> When serializing XML to CSV, special characters (e.g., &amp;) are not
>> converted to their textual representations (e.g., ‘&’).
>>
>> For example, the code below outputs
>>
>> conditionid nctid condition
>> 1 NCT00130377 Cardiovascular System Diseases (&amp; [Cardiac])
>>
>> vs. the expected
>>
>> conditionid nctid condition
>> 1 NCT00130377 Cardiovascular System Diseases (& [Cardiac])
>>
>> This seems like a bug. Am I missing an option? I did not see anything
>> related in the documentation.
>>
>> Thanks,
>> Ron
>>
>>
>> let $options := map {
>>   'lax': false(),
>>   'quotes': false(),
>>   'separator': 'tab',
>>   'header': true()
>> }
>>
>> return file:write('conditions.tsv',
>>   csv:serialize(
>>     <matches>{
>>       for $article in db:open('CTGov')/clinical_study
>>       where $article//nct_id = 'NCT00130377'
>>       for $condition in $article/condition
>>       count $c
>>       return
>>         <match>
>>           <conditionid>{ $c }</conditionid>
>>           <nctid>{ normalize-space($article/id_info/nct_id/text()) }</nctid>
>>           <condition>{ normalize-space($condition/text()) }</condition>
>>         </match>
>>     }</matches>,
>>     $options)
>> )
>>
>>