On Thu, 2022-11-17 at 19:05 +0100, Christian Grün wrote:
> > But is there no way to declare that when I import a file to the database?
>
> There's currently no way to supply this for specific elements

Both XML Schema and DTDs do have a way to say whether text is allowed in a particular context, and the XML loader could use this information to discard whitespace-only text nodes in places where text isn't allowed.

On how it came to be: SGML had some really bad whitespace rules, including what was called "pernicious whitespace". This was whitespace where the parser needed backtracking to know if it was text or not, but the parsers didn't actually do backtracking, so they flagged it as an error. This was a very common source of problems for users.

We eliminated this for XML by requiring #PCDATA (i.e. text) always to be in a repeatable or-group, so

    <!ELEMENT boy (#PCDATA|noise|dirt)*>

and not

    <!ELEMENT boy (noise*, dirt*, #PCDATA)>

(to paraphrase Ambrose Bierce's Devil's Dictionary, which defined a boy as a noise with dirt on it).

liam

--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org
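
P.S. A rough sketch of the loader idea from the message above, in case it helps: if you know which elements allow text, whitespace-only text directly inside the others can be dropped. This is only an illustration in Python's standard library, not how BaseX or any particular XML loader works. The ALLOWS_TEXT table, the gang element and the function name are all made up for the example; a real loader would take this information from the DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical table standing in for what a DTD or schema would say:
# True means the content model includes #PCDATA (mixed content),
# False means element-only content.
ALLOWS_TEXT = {"gang": False, "boy": True, "noise": True, "dirt": True}


def strip_ignorable_whitespace(elem):
    """Drop whitespace-only text found directly inside element-only content."""
    # Elements missing from the table are treated as allowing text,
    # so nothing is discarded unless the table says it is safe.
    if not ALLOWS_TEXT.get(elem.tag, True):
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        for child in elem:
            # ElementTree stores text that follows a child in child.tail.
            if child.tail is not None and not child.tail.strip():
                child.tail = None
    for child in elem:
        strip_ignorable_whitespace(child)


doc = ET.fromstring(
    "<gang>\n  <boy> <noise>bang</noise> and <dirt>mud</dirt> </boy>\n</gang>"
)
strip_ignorable_whitespace(doc)
result = ET.tostring(doc, encoding="unicode")
print(result)
```

The newlines and indentation inside gang go away because gang is declared element-only in the table, while every space inside boy is kept because boy has mixed content and the spaces there may be significant.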