Hi Ben,

Yes, that’s possible. Office files are simple ZIP archives, so you can create a database with ZIP parsing turned on. If you supply a Word file to the collection() function, the document will be parsed on the fly. Just run the following query on the attached document:

  collection('HelloWorld.docx')//text()[. contains text 'hello']

In practice, you’ll surely have to invest some more time, as an Office text string may be distributed across multiple nodes.

Best,
Christian

On Tue, Jan 28, 2020 at 2:01 PM Ben Engbers <Ben.Engbers@be-logical.nl> wrote:
Hi,
While we were discussing possible use cases for BaseX, a colleague asked me whether it is also possible to load LibreOffice and Word documents into BaseX and then perform full-text analysis on them. In essence, these are both XML files, so it should be possible.
Does anybody have experience with this?
Ben
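
A note on the caveat in the reply that a text string may be spread across several nodes: one way to approach it, as a rough sketch only, is to search per paragraph instead of per text node. The query below assumes that the Word file exposes its body as WordprocessingML (w:p paragraphs containing w:t text runs in the namespace declared below) and that collection('HelloWorld.docx') resolves as described in the reply. It has not been run against the attached document.

```xquery
(: Assumption: the document body uses the WordprocessingML namespace. :)
declare namespace w = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

for $p in collection('HelloWorld.docx')//w:p
(: Join all text runs of a paragraph so a string split across runs is searched as a whole. :)
let $text := string-join($p//w:t)
where $text contains text 'hello'
return $text
```

Joining at the paragraph level is only one possible choice. LibreOffice files use a different vocabulary, so the element names would have to be adapted for them.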