Dear Mike,
I'm sorry to say that the existing W3C specifications, and thus XML parsers, do not preserve entities in the document. Instead, the current paradigm is to convert as much text as possible to UTF-8. In addition, the internal database representation is not aware of entities, as this would complicate keyword searches and many other things.
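To illustrate the parser behaviour, here is a minimal sketch in Python. The standard xml.etree.ElementTree module is used purely as a stand-in for a generic XML parser (it is not part of BaseX), and the numeric reference &#252; in the input is an assumed example, not taken from your files. Once the text is parsed, the reference is gone and only the character remains:

```python
import xml.etree.ElementTree as ET

# Assumed example input: "ü" written as a numeric character reference.
source = '<supplyProperty description="G&#252;ltigkeitsbeginn" value="12.12.2010"/>'

elem = ET.fromstring(source)

# The parser has already expanded the reference; the tree only holds "ü".
description = elem.get("description")
print(description)

# Serializing as Unicode text therefore writes the real character.
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)
```

The information about how the character was originally written is simply not kept in the parsed tree, so no later serialization step can restore it.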
One could think about manually converting the serialized result to an entity-based representation, using some DTDs as input; but this is probably not what you want (and BaseX offers no features to perform such a thing).
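As a rough idea of what such a manual conversion could look like, here is a minimal post-processing sketch in Python, run on the exported files outside of BaseX. The function names are hypothetical, and the named variant borrows Python's HTML entity table (html.entities.codepoint2name) as an assumed stand-in for a mapping derived from your DTD; named references such as &uuml; are only valid XML if a DTD declares them.

```python
import re
from html.entities import codepoint2name


def to_numeric_refs(xml_text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    return xml_text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_named_refs(xml_text: str) -> str:
    """Replace non-ASCII characters with named references where a name is known.

    Falls back to a decimal character reference for characters without a name.
    """
    def replace(match: re.Match) -> str:
        code_point = ord(match.group())
        name = codepoint2name.get(code_point)
        return f"&{name};" if name else f"&#{code_point};"

    # Only non-ASCII characters are touched, so markup such as < and & stays intact.
    return re.sub(r"[^\x00-\x7F]", replace, xml_text)


exported = '<supplyProperty description="Gültigkeitsbeginn" value="12.12.2010"/>'
print(to_numeric_refs(exported))  # ... description="G&#252;ltigkeitsbeginn" ...
print(to_named_refs(exported))    # ... description="G&uuml;ltigkeitsbeginn" ...
```

Note that this rewrites every non-ASCII character in the file, so it only reproduces the original files if they consistently used references for all such characters.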
Hope this helps,
Christian
___________________________
Christian Grün
Uni KN, Box 188
78457 Konstanz, Germany
http://www.inf.uni-konstanz.de/~gruen
On Fri, May 6, 2011 at 7:30 PM, Mike Cobo mikecobo@yahoo.com wrote:
Hi,

I have a collection of XML documents in which I want to replace only one attribute per file. The XML documents contain some XML entities for German "Umlaute" characters (e.g. the entity for ü). After running the XQuery script which replaces the attributes, I use the "Export XML" function of the GUI to export the modified documents to a local folder. Unfortunately, the XML entities are now replaced by the corresponding "real" character (e.g. ü).

I tried the several parsing options of the "Create database" dialog, but already when creating the DB I can see the "real" character in the GUI text output window. As I don't want to change anything in the original XML content except the attribute, I need the output files to have the XML entities instead of the "real" characters (as in the original files).

Here is a simple example of an original file:
<?xml version="1.0" encoding="UTF-8"?>
<Message xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <header messageId="4523251"> </header>
  <dataSupply>
    <supplyProperty description="Gültigkeitsbeginn" value="12.12.2010"/>
  </dataSupply>
</Message>

And here's what I get as output:

<?xml version="1.0" encoding="UTF-8"?>
<Message xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <header messageId="000"/>
  <dataSupply>
    <supplyProperty description="Gültigkeitsbeginn" value="12.12.2010"/>
  </dataSupply>
</Message>

How can I get the export encoding right?

best regards,
Mike
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk