On Wed, 11 Feb 2026 21:41:25 -0500 Graydon Saunders via BaseX-Talk <basex-talk@mailman.uni-konstanz.de> wrote:
If I have two (fairly long) sequences of text, ('The', 'words', 'are', 'sequence', 'members') and I want all the index numbers of matching pairs despite the sequences only mostly matching (so a word, or several words, can be missing from sequence A or sequence B), is there an established algorithm for doing this?
There are several: Myers, McIlroy (of Unix fame), and others have published papers on the longest common subsequence problem, which is at the heart of, e.g., the Unix diff program.

liam

--
Liam Quin: Delightful Computing - Training and Consultancy in XSLT / XML Markup / Typography / CSS / Accessibility / and more...
Outreach for the GNU Image Manipulation Program
Vintage art digital files - fromoldbooks.org
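For illustration, here is a minimal sketch of the textbook dynamic-programming solution to the longest common subsequence problem, returning the index pairs of matching words. It is not the Myers or Hunt-McIlroy algorithm itself, which reach the same kind of result with far less time and memory on long, mostly-matching inputs. The function name `lcs_index_pairs` and the second sample sequence are made up for the example, and the indices are 0-based (add 1 for XQuery-style positions).

```python
def lcs_index_pairs(a, b):
    """Return (i, j) pairs with a[i] == b[j] forming one longest common subsequence.

    Textbook O(len(a) * len(b)) dynamic programming; indices are 0-based.
    """
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk forward through the table, recording each matched pair.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # skipping a[i] loses nothing: it is missing from b
        else:
            j += 1  # skipping b[j] loses nothing: it is missing from a
    return pairs


# Hypothetical sample data: 'are' is missing from b, 'of' is missing from a.
a = ['The', 'words', 'are', 'sequence', 'members']
b = ['The', 'words', 'sequence', 'of', 'members']
print(lcs_index_pairs(a, b))  # [(0, 0), (1, 1), (3, 2), (4, 4)]
```

The table needs memory proportional to the product of the two lengths, so for very long sequences one of the published algorithms, or an existing diff implementation, is likely the better choice.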