Congrats on the latest version! Looking forward as usual to exploring the new features.
However, I'm perplexed by the decision to remove the text parser from the codebase. I understand the desire to streamline and remove dependencies related to lower-value features, but I've always found the text parser to be super useful. After installing BaseX 10.8 beta today, I had to refactor a process (parsing a set of interview transcripts generated by Zoom) that involved creating a DB from a directory of text files.
In addition, I noticed some unexpected results in how the text was parsed using standard methods. In BaseX 10.6, using the text parser in the GUI, the output looks like this:
<text>WEBVTT
1
00:00:02.910 --> 00:00:27.240
...
</text>
Here, each line ends with just a newline character (\n).
Using file:read-text or fn:unparsed-text (in 10.6 and 10.8 beta), the output looks like this:
<text>WEBVTT

1
00:00:02.910 --> 00:00:27.240
...
</text>
Here, each line ending also includes a carriage return (\r).
And if I instead store it as an XQuery value, I see the newline characters that aren't otherwise displayed in the GUI:
"WEBVTT

1
00:00:02.910 --> 00:00:27.240
..."
So, the text parser seems to have normalized the line endings, which was also helpful.
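In the meantime, something like the following might approximate that behavior. It is only a minimal sketch: it assumes file:read-text from the File Module, and the path is a made-up placeholder rather than one of the real transcript files. It collapses each \r\n pair or lone \r into a single \n before wrapping the string in a text element, and it may not match everything the old text parser did.

```xquery
(: Hypothetical path; replace with an actual transcript file. :)
let $path := 'transcripts/interview.vtt'
(: Read the file as a string, line endings left as they are on disk. :)
let $raw := file:read-text($path)
(: Replace each CRLF pair or lone CR with a single LF. :)
let $normalized := replace($raw, '\r\n?', '&#xA;')
return <text>{ $normalized }</text>
```

The same replace call could be applied to the result of fn:unparsed-text instead.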
Any chance that it could be restored (by popular demand) in version 11? :)
Best regards,
Tim
--
Tim A. Thompson (he, him)
Librarian for Applied Metadata Research
Yale University Library