Crazy - it works!

Many thanks, Hauke!

With kind regards,
Hans-Jürgen

On Friday, 11 October 2024 at 14:49:23 CEST, Hauke Brandes <hauke.brandes@parsqube.de> wrote:


Hello Hans-Jürgen,

To detect if a file starts with a BOM, you can use the following code:

let $file := 'some-file.xml'
let $bytes := array { $file => file:read-binary(0, 4) => bin:to-octets() }
let $boms := (
  (: UTF-8 :)    [0xef, 0xbb, 0xbf],
  (: UTF-16BE :) [0xfe, 0xff],
  (: UTF-16LE :) [0xff, 0xfe],
  (: UTF-32BE :) [0x00, 0x00, 0xfe, 0xff],
  (: UTF-32LE :) [0xff, 0xfe, 0x00, 0x00]
)
return some $bom in $boms satisfies deep-equal(array:subarray($bytes, 1, array:size($bom)), $bom)

That said, I would have expected fn:doc to accept XML files that start with a BOM, since those are legal XML files as far as I remember.
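
If it is also useful to know which BOM was found, the check above could be extended along these lines. This is only a sketch: the function name local:bom-encoding and the map keys 'name' and 'bytes' are made up for illustration, and it assumes that the hex literals and the predeclared file: and bin: prefixes used in the query above are available. It reads at most as many bytes as the file contains, on the assumption that file:read-binary rejects a range that extends past the end of the file.

```xquery
(: Hypothetical helper: returns the name of the encoding whose BOM the file
   starts with, or the empty sequence if no known BOM is found. :)
declare function local:bom-encoding($path as xs:string) as xs:string? {
  (: read at most 4 bytes, fewer if the file is shorter :)
  let $length := min((file:size($path), 4))
  let $bytes := array { file:read-binary($path, 0, $length) => bin:to-octets() }
  let $boms := (
    (: UTF-32 comes before UTF-16 because the UTF-32LE BOM
       begins with the same two bytes as the UTF-16LE BOM :)
    map { 'name': 'UTF-8',    'bytes': [0xef, 0xbb, 0xbf] },
    map { 'name': 'UTF-32BE', 'bytes': [0x00, 0x00, 0xfe, 0xff] },
    map { 'name': 'UTF-32LE', 'bytes': [0xff, 0xfe, 0x00, 0x00] },
    map { 'name': 'UTF-16BE', 'bytes': [0xfe, 0xff] },
    map { 'name': 'UTF-16LE', 'bytes': [0xff, 0xfe] }
  )
  return head(
    for $bom in $boms
    let $n := array:size($bom?bytes)
    (: compare only if enough bytes were read for this BOM :)
    where (
      if ($n le array:size($bytes))
      then deep-equal(array:subarray($bytes, 1, $n), $bom?bytes)
      else false()
    )
    return $bom?name
  )
};

local:bom-encoding('some-file.xml')
```

For a file without a BOM the function returns the empty sequence, so exists(local:bom-encoding($file)) gives the same yes/no answer as the query above.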

Greetings,
Hauke

On 11.10.24 at 13:42, Hans-Juergen Rennau wrote:
Dear BaseX people,

there is a serialization parameter for adding a byte-order mark when serializing. A document written this way cannot be parsed using fn:doc() - error message:

[FODC0002] "C:/projects/ofx-works/work/test-bom.xml" (Line 1): Content ist nicht zulässig in Prolog. [English: Content is not allowed in prolog.]

This message is rather unspecific. My question: is there a way to determine whether a given file starts with a byte-order mark?

Kind regards,
Hans-Jürgen