Hi, This may seem a bit unusual, but I like the parsing capabilities of XQuery and I'd like to use it for my email. I'd like to have BaseX log into my IMAP server, fetch emails, and store each email as a separate XML document. I'm willing to have it ignore attachments if necessary. Then I'd like to query my email using normal BaseX XQuery. This isn't for everyday mail; I want to be able to query automated emails and run one-off queries.
Oh, and to complicate things, I'm using Gmail, so the login just got even more difficult.
Any suggestions?
Thanks, Ben Best Regards, Ben Pracht 919.809.2439 ben.pracht@gmail.com
Hi Ben -
On Tue, Jul 5, 2022 at 7:36 PM Ben Pracht ben.pracht@gmail.com wrote:
Oh, to make things more difficult, I'm using GMail, so the login just got even more difficult.
Any suggestions?
An(other!) idea I've been kicking around (and wondering about) but haven't tested, explored, or tried: integrating the Google OAuth2 API client(s) [1,2]. I don't know how it would work (or fit together), but that seems like a reasonable place to start.
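For context on what the OAuth2 piece ultimately has to produce: Gmail's IMAP server accepts an OAuth2 access token through the XOAUTH2 SASL mechanism. The following is only a rough Python sketch of that handshake, not a tested integration with BaseX or with the Java clients in [1,2]. It assumes an access token has already been obtained elsewhere (for example through one of those client libraries); the function names `xoauth2_string` and `fetch_raw_messages` are made up for illustration.

```python
import imaplib


def xoauth2_string(user: str, access_token: str) -> bytes:
    """Build the SASL XOAUTH2 initial response; imaplib base64-encodes it for us."""
    return f"user={user}\x01auth=Bearer {access_token}\x01\x01".encode("utf-8")


def fetch_raw_messages(user: str, access_token: str, mailbox: str = "INBOX"):
    """Yield every message in `mailbox` as raw RFC 822 bytes.

    Assumption: `access_token` is a valid OAuth2 token with mail access,
    obtained separately. Token refresh and error handling are left out.
    """
    conn = imaplib.IMAP4_SSL("imap.gmail.com")
    try:
        # authenticate() calls the callback with the server challenge and
        # sends back whatever bytes the callback returns.
        conn.authenticate(
            "XOAUTH2", lambda _challenge: xoauth2_string(user, access_token)
        )
        conn.select(mailbox, readonly=True)  # read-only: do not mark mail as seen
        _status, data = conn.search(None, "ALL")
        for num in data[0].split():
            _status, parts = conn.fetch(num, "(RFC822)")
            yield parts[0][1]  # parts[0] is (envelope info, raw message bytes)
    finally:
        conn.logout()
```

Only the string-building helper can be checked without a network connection; the fetch loop needs a real account and token.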
I hope this is helpful. Best, Bridger
[1] https://developers.google.com/api-client-library/java/google-api-java-client...
[2] https://github.com/googleapis/google-auth-library-java
Hi Ben and Bridger,
I think you'd need to write a separate application or module to handle this, or use workarounds. Here are a few options you could explore:
1) Use the Google API Client libraries (https://developers.google.com/api-client-library) to write an application in your preferred language that queries all messages and saves them as XML for use in a BaseX DB. I've used the PHP client library for Gmail, Drive, Groups, and Analytics reports, and it made the initial OAuth token setup relatively painless (as in, I was only tearing my hair out for a single day, not multiple).
2) Write a script in your preferred language that uses IMAP to fetch and save the messages. PHP (https://www.php.net/manual/en/book.imap.php), Java (https://javaee.github.io/javamail/), and Python (https://docs.python.org/3/library/imaplib.html) all have built-in or third-party client libraries. But as Ben said, authentication will be a pain because Gmail no longer allows apps to authenticate with just a username and password. You'd need to implement the OAuth flow yourself.
3) Find an existing web client to sync to your Gmail, and find where it stores local copies of the messages. Use them to create a BaseX DB. It might be difficult or impossible to find one that saves messages in neat, convenient XML files, though, so you might have to export an archive and do some file processing every time you want to perform a query.
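Whichever option fetches the mail, the "save them in XML" step could look roughly like the Python sketch below. It uses only the standard library and ignores attachments, as Ben suggested. The element names (`email`, `from`, `subject`, `body`, and so on) are illustrative assumptions, not an established schema, and `message_to_xml` is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET
from email import message_from_bytes, policy

# Control characters that are not allowed in XML 1.0 documents.
_XML_ILLEGAL = re.compile("[\x00-\x08\x0b\x0c\x0e-\x1f]")


def _clean(text: str) -> str:
    """Drop characters an XML parser would reject."""
    return _XML_ILLEGAL.sub("", text)


def message_to_xml(raw: bytes) -> str:
    """Convert one raw RFC 822 message into a small XML document."""
    msg = message_from_bytes(raw, policy=policy.default)

    root = ET.Element("email")
    for name in ("From", "To", "Subject", "Date", "Message-ID"):
        value = msg[name]
        if value is not None:
            # Header names become lowercase element names, e.g. <subject>.
            ET.SubElement(root, name.lower()).text = _clean(str(value))

    # Prefer the plain-text part, fall back to HTML, skip attachments.
    body = msg.get_body(preferencelist=("plain", "html"))
    text = body.get_content() if body is not None else ""
    ET.SubElement(root, "body").text = _clean(text)

    # ElementTree escapes <, > and & in text content when serializing.
    return ET.tostring(root, encoding="unicode")
```

Each returned string could be written to its own file and the directory added to a BaseX database, after which a query such as `//email[contains(subject, 'failed')]` would work against the assumed element names.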
-Tamara
On Tue, Jul 5, 2022 at 7:25 PM Bridger Dyson-Smith bdysonsmith@gmail.com wrote:
[...]
Hi Tamara -
On Fri, Jul 8, 2022, 12:07 PM Tamara Marnell tmarnell@orbiscascade.org wrote:
Hi Ben and Bridger,
I think you'd need to write a separate application or module to handle this, or use workarounds. Here are a few options you could explore:
- Use the Google API Client libraries (https://developers.google.com/api-client-library) to write an application in your preferred language that queries all messages and saves them in XML to use in a BaseX DB. I've used the PHP client library for Gmail, Drive, Groups, and Analytics reports, and it made the initial OAuth token setup relatively painless (as in, I was only tearing my hair out for a single day, not multiple).
This was the intent of my earlier email (a module that wraps the Java client(s)), but your thoughts are much more thorough! I appreciate the suggestions for alternative languages, too. Good to know about the limited potential for self-driven hair loss as well; it's a concern :)
Thanks for the email! Best, Bridger
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org https://www.orbiscascade.org/) Pronouns: she/her/hers
basex-talk@mailman.uni-konstanz.de