Thanks. Here’s one way you could handle that:
declare function local:correct($old, $new) {
  empty(local:compare($old, $new))
};

declare function local:compare($old, $new) {
  if(head($old) = head($new)) then (
    (: skip identical lines :)
    local:compare(tail($old), tail($new))
  ) else if(empty($old)) then (
    (: old lines are consumed, all fine :)
    ()
  ) else if(empty($new)) then (
    (: old lines remain, bad :)
    'sigh'
  ) else (
    (: lines differ: skip new line :)
    local:compare($old, tail($new))
  )
};

let $old := unparsed-text-lines('before.txt')
for $file in (
  'after-correct.txt',
  'after-fails1.txt',
  'after-fails2.txt'
)
let $new := unparsed-text-lines($file)
return $file || ': ' || local:correct($old, $new)
Here’s another solution, which avoids recursive calls and therefore does not depend on tail-call optimization. It’s based on the handy built-in higher-order function hof:until. Once again, it was Leo (Wörteler) who introduced it to BaseX [1]:
declare function local:correct($old, $new) {
  let $result := hof:until(
    (: stop as soon as one of the two sequences is exhausted :)
    function($m) { empty($m?old) or empty($m?new) },
    (: compare, create new input :)
    function($m) {
      map {
        'old': if(head($m?old) = head($m?new)) then tail($m?old) else $m?old,
        'new': tail($m?new)
      }
    },
    (: initial input :)
    map { 'old': $old, 'new': $new }
  )
  (: check if there’s old input left :)
  return empty($result?old)
};
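For what it’s worth, the same greedy check can probably also be written with the standard fold-left function, which would not need the BaseX-specific module. This is only a sketch and not one of the two solutions above: the accumulator simply holds the old lines that have not been matched yet, and the inline sequences are made-up stand-ins for the attached files.

```xquery
(: Sketch only: fold-left variant of the check, standard XQuery 3.1.
   The accumulator $rest holds the lines of $old that are still unmatched. :)
declare function local:correct(
  $old as xs:string*,
  $new as xs:string*
) as xs:boolean {
  empty(
    fold-left($new, $old, function($rest, $line) {
      (: drop the next old line if the new line matches it, else skip the new line :)
      if(head($rest) = $line) then tail($rest) else $rest
    })
  )
};

(: hypothetical inline data instead of before.txt and the after-*.txt files :)
let $old := ('a', 'b', 'c')
for $new in (
  [ 'a', 'x', 'b', 'c' ],  (: extra line inserted: true :)
  [ 'a', 'c', 'b' ],       (: lines re-ordered: false :)
  [ 'a', 'b' ]             (: line lost: false :)
)
return local:correct($old, $new?*)
```

Unlike the hof:until version, the fold always walks through all new lines, even after the old lines have been consumed, so it trades the early exit for a slightly shorter function.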
Cheers, Christian
[1] https://docs.basex.org/wiki/Higher-Order_Functions_Module#hof:until
_________________________________
On Sat, Jun 12, 2021 at 9:38 PM Graydon graydonish@gmail.com wrote:
Hello!
On Sat, Jun 12, 2021 at 07:11:00PM +0200, Christian Grün scripsit:
Could you share exemplary and minimized input documents with us?
I have created some text documents and attached them. (The "remember to throw away some metadata markup, etc." step on the way to getting text to compare from the before and after vocabularies is believed to work reliably.)
Can the structure of the documents (hierarchy of nodes, element names, etc.) be completely ignored?
It can! This test is meant to test only that no words have been lost or re-ordered; that the transformation is semantically correct is out of scope for it.
Thank you! Graydon