Hi Kendall,
Good point ;)
As left/right quote characters (and all the regional variants) could indeed be proper input, it’s difficult to define generic rules that work for all kinds of data fed into BaseX. I think the best solution is to first retrieve the (let’s call it) CSV data as plain text and clean it up with regular expressions, based on your experience with previous user input. The result can then be converted via csv:parse.
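A minimal sketch of that two-step approach, assuming the raw text is already available as a string: the sample value in $raw is made up for illustration, and the blanket replacement of U+201C/U+201D is only safe if those characters never occur as legitimate data in your input.

```xquery
(: Hypothetical sample input: semicolon-separated fields wrapped in
   typographic quotes (U+201C / U+201D) instead of plain double quotes :)
let $raw := '&#x201C;name&#x201D;;&#x201C;city&#x201D;&#xa;&#x201C;Ada&#x201D;;&#x201C;London&#x201D;'
(: Step 1: normalize the text with a regular expression :)
let $clean := replace($raw, '[&#x201C;&#x201D;]', '"')
(: Step 2: hand the cleaned string to the CSV Module :)
return csv:parse($clean, map { 'separator': 'semicolon', 'header': true() })
```

If typographic quotes can also appear inside field values, the pattern would need to be narrowed, for instance to quotes directly next to a separator or a line boundary.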
Hope this helps, Christian
On Thu, Nov 30, 2017 at 6:42 PM, Kendall Shaw kendall.shaw@workday.com wrote:
Hi,
Before people post CSV files to a web service, they bring a troop of guerillas into a room and give them computer keyboards. The guerillas then pound on the keyboards and click send. So the CSV files that arrive can use different separators from one another, and some were perhaps typed in Microsoft Word, so there will be files surrounded by what look like double quotes but are actually the left and right quote characters, along with all sorts of other problems.
In some cases, an error other than generic failure is wanted so that a user can know roughly what the problem is. In other cases, differences might be better to resolve in the web service, e.g. semi-colon vs. comma.
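Both cases could be sketched roughly as follows. The function names, the error code local:typographic-quotes, and the counting heuristic are all hypothetical choices for illustration; the heuristic only looks at the first line and does not account for separators inside quoted fields.

```xquery
(: Hypothetical helper: guesses the separator by counting candidate
   characters in the first line of the input :)
declare function local:guess-separator($text as xs:string) as xs:string {
  let $first := tokenize($text, '\r?\n')[1]
  let $semicolons := string-length($first) - string-length(translate($first, ';', ''))
  let $commas := string-length($first) - string-length(translate($first, ',', ''))
  return if ($semicolons > $commas) then 'semicolon' else 'comma'
};

(: Hypothetical wrapper: raises a specific error for typographic quotes,
   and otherwise parses with the guessed separator :)
declare function local:check-and-parse($text as xs:string) {
  if (matches($text, '[&#x201C;&#x201D;]')) then
    error(
      xs:QName('local:typographic-quotes'),
      'Input contains left/right quote characters (U+201C/U+201D) instead of plain double quotes.'
    )
  else
    csv:parse($text, map { 'separator': local:guess-separator($text) })
};

local:check-and-parse('a;b&#xa;1;2')
```

A caller of the web service would then see the specific error message for the quote problem, while the semicolon vs. comma difference is resolved silently.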
Kendall
On 11/30/17, 9:30 AM, "Christian Grün" christian.gruen@gmail.com wrote:
Hi Kendall,

> csv:serialize(<csv><record><a>A</a></record></csv>, map {'newline': '
> '})

The 'newline' option is a general serialization option; it cannot be used with csv:serialize. If you want to take advantage of it, you should use fn:serialize with method:csv [1]. Maybe my previous response to George gives some more insight into the difference between general and CSV serialization parameters.

> Also, about the csv option in RESTXQ, I have a need to accept a variety of CSV formats, some day in the future. For example, CSV might have left and right quote characters, instead of double quote characters, or one uses commas another uses semi-colons, etc. Do you have a suggestion about how to handle that and still use the csv input option?

The CSV module provides support for custom field separators, so switching from commas to semi-colons should be no problem [2]. Regarding left and right quote characters, do you refer to “ and ” (201C / 201D)? Do you have some more background information for us on how these characters come into play in your scenario?

Thanks in advance,
Christian

[1] http://docs.basex.org/wiki/Serialization
[2] http://docs.basex.org/wiki/CSV_Module
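The distinction between the two kinds of parameters might look like this in practice. This is a sketch rather than a definitive recipe: the sample element is made up, and the literal value '\r\n' for 'newline' is assumed to follow the form listed in the BaseX serialization documentation [1].

```xquery
(: Hypothetical sample data in the XML structure used by the CSV Module :)
let $data := <csv><record><a>A</a><b>B</b></record></csv>
return (
  (: CSV-specific option: understood by csv:serialize :)
  csv:serialize($data, map { 'separator': 'semicolon' }),
  (: General serialization option: passed to fn:serialize with the csv method :)
  serialize($data, map { 'method': 'csv', 'newline': '\r\n' })
)
```

The first call changes how fields are delimited; the second leaves the CSV defaults alone and only controls the end-of-line marker.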