Hi Everyone,
I'm thinking about what might be involved in making a BaseX email database that would allow large numbers of emails to be searched on things like sender, recipient, subject, date, contents, as well as an existing hierarchical structure for organizing emails. (E.g. all emails relating to a particular project are placed in a corresponding project folder.)
The emails would have to be periodically exported from an existing email archive, presumably by way of a (very big) PST file.
Has anyone here already tried to do this kind of thing using BaseX? Any tips would be much appreciated. Does anyone know of a good open source program for converting PST files to an XML structure? (I don't care about attachments at the moment, though it would be good to eventually include non-binary attachments in the search facility.)
Best,
Tim Finney
Hi Tim,
Apache Tika can convert Outlook .msg and .pst files to XHTML, capturing a fair amount of metadata from each message:
https://tika.apache.org/1.13/formats.html#Mail_formats
Joe
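To make the target structure a bit more concrete, here is a minimal Python sketch of the kind of per-message XML that could be loaded into BaseX once the messages are out of the PST file. It does not use Tika and does not read PST files: it assumes each message is already available as RFC 822 text and uses only Python's standard `email` and `xml.etree.ElementTree` modules. The function name `email_to_xml`, the element names (`email`, `from`, `to`, `subject`, `date`, `body`) and the `folder` attribute are hypothetical choices for illustration, not anything defined by Tika or BaseX.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr, parsedate_to_datetime
import xml.etree.ElementTree as ET


def email_to_xml(raw_message: str, folder_path: str) -> ET.Element:
    """Convert one RFC 822 message into an <email> element.

    All element and attribute names are illustrative assumptions;
    folder_path records where the message sat in the folder hierarchy.
    """
    msg = message_from_string(raw_message)
    root = ET.Element("email", {"folder": folder_path})

    # Sender: keep only the address part of the From header.
    ET.SubElement(root, "from").text = parseaddr(msg.get("From", ""))[1]

    # One <to> element per recipient address.
    for _name, addr in getaddresses(msg.get_all("To", [])):
        ET.SubElement(root, "to").text = addr

    ET.SubElement(root, "subject").text = msg.get("Subject", "")

    # Store the date in ISO 8601 form so it can be compared and sorted.
    date_header = msg.get("Date")
    if date_header:
        parsed = parsedate_to_datetime(date_header)
        ET.SubElement(root, "date").text = parsed.isoformat()

    # Body: concatenate all text/plain parts; attachments are ignored.
    chunks = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        if payload is not None:
            charset = part.get_content_charset() or "utf-8"
            chunks.append(payload.decode(charset, errors="replace"))
    ET.SubElement(root, "body").text = "\n".join(chunks).strip()

    return root
```

Serializing the result with `ET.tostring(element, encoding="unicode")` gives one small XML document per message. With a layout like this, sender, recipient, subject, date and folder each become something a path expression can address directly, and the `body` text is what a full-text index would cover.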
On Wed, Jun 15, 2016 at 8:59 PM -0400, "Finney, Tim" Timothy.Finney@health.wa.gov.au wrote:
basex-talk@mailman.uni-konstanz.de