Re: [basex-talk] Serialization options

30 Nov 2017

Hi Kendall,
Good point ;)
As left/right quote characters (and all the regional variants) could
indeed be proper input, it’s difficult to define generic rules that
work for all kinds of data that are fed into BaseX. I think the best
solution is to first retrieve the (let’s call it) CSV data as plain
text and clean it up with regular expressions, based on the experience
gained from previous user input. The result can then be converted via
csv:parse.
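
A minimal sketch of that approach, assuming the only problem in the input is
the use of left/right double quotation marks (U+201C / U+201D) as field
delimiters. The sample string and the variable names are made up for
illustration; a real clean-up step would need rules derived from the actual
user input:

```xquery
(: Hypothetical raw input, as it might arrive after being typed in a word processor :)
let $raw := '“a”,“b”&#10;“1”,“2”'
(: Replace left/right double quotation marks with plain double quotes :)
let $clean := replace($raw, '[“”]', '"')
(: Convert the cleaned-up text; the first line is treated as header :)
return csv:parse($clean, map { 'header': true() })
```

Note that this blindly rewrites every such character, including ones that are
legitimate field content, which is exactly why generic rules are hard to define.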
Hope this helps,
Christian
On Thu, Nov 30, 2017 at 6:42 PM, Kendall Shaw kendall.shaw@workday.com wrote:
...
Hi,
Before people post CSV files to a web service, they bring a troop of guerillas into a room and give them computer keyboards. The guerillas then pound on the keyboards and click send. So the CSV files that arrive can use different separators from one file to the next. Sometimes the CSV text is typed in Microsoft Word, perhaps, so fields end up surrounded by what look like double quotes but are actually left and right quote characters, along with all sorts of other problems.
In some cases, an error other than generic failure is wanted so that a user can know roughly what the problem is. In other cases, differences might be better to resolve in the web service, e.g. semi-colon vs. comma.
Kendall
On 11/30/17, 9:30 AM, "Christian Grün" christian.gruen@gmail.com wrote:
Hi Kendall,

> csv:serialize(<csv><record><a>A</a></record></csv>, map {'newline': '&#x0D;&#x0A;'})

The 'newline' option is a general serialization option; it cannot be
used with csv:serialize. If you want to take advantage of it, you
should use fn:serialize with method:csv [1]. Maybe my previous
response to George gives some more insight into the difference between
general and CSV serialization parameters.
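
A minimal sketch of the fn:serialize variant, assuming the 'newline'
parameter takes the literal value '\r\n' as listed on the Serialization
page [1] (this may differ between BaseX versions):

```xquery
(: Same input as in the csv:serialize call quoted above :)
let $csv := <csv><record><a>A</a></record></csv>
(: General serialization parameters such as 'newline' are accepted here :)
return serialize($csv, map {
  'method': 'csv',
  'newline': '\r\n'
})
```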

> Also, about the csv option in RESTXQ, I have a need to accept a variety of CSV formats, some day in the future. For example, CSV might have left and right quote characters instead of double quote characters, or one file uses commas and another uses semi-colons, etc. Do you have a suggestion about how to handle that and still use the csv input option?

The CSV module provides support for custom field separators, so
switching from commas to semi-colons should be no problem [2].
Regarding left and right quote characters, do you refer to “ and ”
(U+201C / U+201D)? Do you have some more background information for us
on how these characters come into play in your scenario?
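
As a sketch of the separator support mentioned above, a semicolon-separated
input could be parsed as follows. The sample data is made up; the
'separator' and 'header' options are the ones described in the CSV
Module documentation [2]:

```xquery
(: Hypothetical semicolon-separated input with a header line :)
let $text := 'name;city&#10;Ada;London'
return csv:parse($text, map {
  'separator': 'semicolon',
  'header': true()
})
```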

Thanks in advance,
Christian

[1] http://docs.basex.org/wiki/Serialization
[2] http://docs.basex.org/wiki/CSV_Module
