Graydon,

That seems like a good solution. I will pursue it. My only practical wrinkle is that I'm reading from local git clones, so I have to make sure I've attempted to load any files pulled since the last load before checking for failed-to-load files, but that's doable of course.

Cheers,

E.
_____________________________________________
Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368
servicenow.com<https://www.servicenow.com>

From: Graydon <graydonish@gmail.com>
Date: Saturday, February 26, 2022 at 9:05 AM
To: Eliot Kimber <eliot.kimber@servicenow.com>
Cc: basex-talk@mailman.uni-konstanz.de <basex-talk@mailman.uni-konstanz.de>
Subject: Re: [basex-talk] Identify Unparseable XML Files in File System

On Sat, Feb 26, 2022 at 02:53:46PM +0000, Eliot Kimber scripsit:
> But maybe there's a more direct way that I've overlooked?
If you trust the load process, you can get what's on disk with file:list(), and you can get what's in the system with some variation on collection()/document-uri(). You would then have to adjust the path names a little so they've got the same notional root. Once you've done that, $disk[not(. = $system)] tells you which files aren't well-formed. I'd expect this to be pretty brisk, and you don't have to try to parse anything a second time.

--
Graydon Saunders | graydonish@gmail.com
Þæs oferéode, ðisses swá mæg.
-- Deor
("That passed, so may this.")
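[Editor's note: the set-difference approach above might be sketched in XQuery roughly as follows. The database name "docs", the clone directory, and the path adjustment are placeholders, not anything from the thread; adapt them to your own layout.]

```xquery
(: Sketch: files on disk minus documents that made it into the database.
   Assumptions: database is named "docs", clones live under $root, and
   document URIs in the database look like "docs/relative/path.xml". :)
let $root := '/path/to/git-clones/'
let $disk :=
  for $f in file:list($root, true(), '*.xml')
  return replace($f, '\\', '/')          (: normalize Windows separators :)
let $system :=
  for $doc in collection('docs')
  (: strip the database prefix so both sides share the same notional root :)
  return substring-after(document-uri($doc), 'docs/')
return $disk[not(. = $system)]           (: paths that failed to load :)
```

Since both sequences are plain strings, the `not(. = $system)` comparison is a simple value comparison over the whole sequence, which is why no second parse attempt is needed.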