Hi all - I have the sneaking suspicion that the answer to this plea for help will be something like, "Use RESTXQ!!" or something similar, but let me describe the problem: I'm pulling data back from an OAI-PMH endpoint that is slow; i.e. response times are ~1/minute. Embarrassingly, I think I've spent several hours trying to figure out if my requests were buggy before I realized that the API endpoint was just *slow* (at least compared to others that I use regularly).
I typically use the IntelliJ plugin or the BaseX GUI for the vast majority of my XQuery work, but these requests effectively build up a sequence of responses (as files) and then serialize them all to disk when the requests have finished. When the requests are answered quickly, there's prompt feedback (my script finishes quickly, I can see serialized files, etc), but when the requests are answered slowly, I'm left waiting (and then thinking, "Oh no - I've mistyped the endpoint URL." or something similar).
Initially I think I have two requests for guidance:

1. Is there a better way, using the BaseX GUI (or the command line), to get feedback on a querying process like this? Something... asynchronous, or something clever with builtin functions in the `jobs` or `xquery` modules?

2. If this can be addressed relatively directly with RESTXQ, I'm game to (finally) get more comfortable with it, but does anyone have any examples, applications, scripts, etc. they would be willing to share? I want to say that several examples and ideas have been shared before here on the list, but I'm having a terrible time finding them.
I've attached a simple SSCCE, where the basic idea is: query an API for some data, and get a response like so:

<example>
  <stuff>...</stuff>
  <resumptionToken>abc123</resumptionToken>
</example>

then take the text() of the resumptionToken and resubmit a request to the API, which would return:

<example>
  <stuff>...</stuff>
  <resumptionToken>def456</resumptionToken>
</example>

The full responses are written to the temp directory on your system (file:temp-dir()), with a date-stamped name.
The endpoint in my attached example has a fairly small response from what I've selected, but the idea would be: how can I let myself (or another user) know that the query is active and running, not hung or failing? I understand that given the functional nature of XQuery these sorts of things can be a bit more complicated, so I would appreciate any thoughts, opinions, links, etc.
Many thanks for your time and trouble. Best, Bridger
PS the endpoint in the example doesn't necessarily exhibit the same slow behavior that is the basis for my woes -- for some list readers it may be very fast indeed -- but I felt like it might be apropos to use this particular endpoint for an illustration.
Hi Bridger,
I'm pulling data back from an OAI-PMH endpoint that is slow; i.e. response times are ~1/minute.
I’ve tried the example you have attached (thanks). It seems to be much faster. Do you think that’s just my geographic proximity to the Konstanz-based server, or did you use a different setting for your slow tests?
- Is there a better way, using the BaseX GUI (or the command line), to get feedback on a querying process like this?
If you use the BaseX GUI and if you restart a query or run a second one, the first one will be interrupted, so I guess you’ll have similar experiences with IntelliJ. But…
Something... asynchronous, or something clever with builtin functions in the `jobs` or `xquery` modules?
You could create multiple query jobs, which run in parallel, with the jobs:eval function. They will only be interrupted if the IDE is stopped, but your IDE won’t notify you when the queries terminate, whether normally or unexpectedly.
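For illustration, a rough sketch of that approach (the script name 'harvest.xq' and the job id are placeholders, not from this thread):

```xquery
(: Start a (hypothetical) harvesting script as a detached job.
   It keeps running after this query returns. :)
let $id := jobs:eval(
  xs:anyURI('harvest.xq'),  (: a URI argument is read as a query file :)
  map { },                  (: no variable bindings :)
  map { 'id': 'harvest', 'cache': true() }
)
return $id
```

You could then poll its state with jobs:list-details('harvest') and, since caching is enabled, pick up the result with jobs:result('harvest') once it has finished.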
A promising alternative for you could be xquery:fork-join [1]. In fact we mostly use it for running multiple slow HTTP requests in parallel:
xquery:fork-join(
  for $segment in 1 to 4
  let $url := 'http://url.com/path/' || $segment
  return function() { http:send-request((), $url) }
)
The function will terminate once all parallel requests have returned a response (and the results will be returned in the expected order).
Next, you could run a script multiple times on command line and e.g. assign different arguments:
basex -bvar=1 query.xq
basex -bvar=2 query.xq
...
query.xq:

declare variable $var external;
file:write($var || '.xml', ...)
- If this can be addressed relatively directly with RESTXQ
RESTXQ can be helpful if you write web applications, or if you want to define custom REST endpoints. Such endpoints can be called multiple times as well, and will run in parallel as long as the queries don’t write to the same databases [2]. Maybe it’s overkill if you only want to run scripts in parallel, though. The more basic client/server architecture could be an alternative [3]; it can be used similarly to the command-line solution.
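To make the idea of a custom endpoint concrete, here is a minimal sketch (the module URI and path are invented for this example):

```xquery
module namespace page = 'urn:example:app';

(: A custom endpoint: GET /hello returns a small XML response.
   Module URI and path are placeholders for this sketch. :)
declare
  %rest:path('/hello')
  %rest:GET
function page:hello() {
  <response>Hello from RESTXQ!</response>
};
```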
I've attached a simple SSCCE, where the basic idea is: query an API for some data, and get a response like so:
You indicated that you are sending two requests. Is it the first one that’s slow? Does the first response create all input elements for the second request, or do you have twice the number of requests in total?
Hope this helps, Christian
[1] https://docs.basex.org/wiki/XQuery_Module#xquery:fork-join
[2] https://docs.basex.org/wiki/Transaction_Management
[3] https://docs.basex.org/wiki/Database_Server
Hi Christian,
As always, thanks for your time and help.
On Wed, Nov 24, 2021 at 12:18 PM Christian Grün christian.gruen@gmail.com wrote:
Hi Bridger,
I'm pulling data back from an OAI-PMH endpoint that is slow; i.e.
response times are ~1/minute.
I’ve tried the example you have attached (thanks). It seems to be much faster. Do you think that’s just my geographic proximity to the Konstanz-based server, or did you use a different setting for your slow tests?
I think that it is partly geographic proximity and partly that the system that has given me trouble is just incredibly slow; I'm hesitant to share the particular URL.
- Is there a better way, using the BaseX GUI (or the command line), to
get feedback on a querying process like this?
If you use the BaseX GUI and if you restart a query or run a second one, the first one will be interrupted, so I guess you’ll have similar experiences with IntelliJ. But…
Something... asynchronous, or something clever with builtin functions in
the `jobs` or `xquery` modules?
You could create multiple query jobs, which run in parallel, with the jobs:eval function. They will only be interrupted if the IDE is stopped, but your IDE won’t notify you when the queries terminate, whether normally or unexpectedly.
A promising alternative for you could be xquery:fork-join [1]. In fact we mostly use it for running multiple slow HTTP requests in parallel:
xquery:fork-join(
  for $segment in 1 to 4
  let $url := 'http://url.com/path/' || $segment
  return function() { http:send-request((), $url) }
)
The function will terminate once all parallel requests have returned a response (and the results will be returned in the expected order).
I've used `xquery:fork-join()` for something else in the past, and it is truly fantastic; as you mention here and in the documentation, it makes slower HTTP requests much easier. Maybe I'm not thinking carefully about my particular issue, but I don't know if using fork-join will help in this case. The initial query to the API returns some data; e.g.
<example>
  <things>...</things>
  <token>abc123</token>
</example>
and the following queries rely on the existence (or absence) of the value in example/token/text(). Those values are, AFAIK, possibly randomized, or at least structured differently across the various endpoints that I use, so I wouldn't know the full URLs in advance to structure a fork-join.
I'm not sure I'm capturing my problem, but thanks for letting me talk it through here.
Next, you could run a script multiple times on command line and e.g. assign different arguments:
basex -bvar=1 query.xq
basex -bvar=2 query.xq
...
query.xq:

declare variable $var external;
file:write($var || '.xml', ...)
- If this can be addressed relatively directly with RESTXQ
RESTXQ can be helpful if you write web applications, or if you want to define custom REST endpoints. Such endpoints can be called multiple times as well, and will run in parallel as long as the queries don’t write to the same databases [2]. Maybe it’s overkill if you only want to run scripts in parallel, though. The more basic client/server architecture could be an alternative [3]; it can be used similarly to the command-line solution.
I guess my thinking with regard to RESTXQ was that maybe, assuming I have the proper functions in place, I could return a new webpage while the subsequent function calls were happening in the background; e.g.
step 1: start query to a given endpoint
step 2: when the first result is returned, redirect the user (me) to a new webpage with a message (and the token; e.g. 'abc123')
step 3: using the token, launch the following query (which relies on said token)
step 4: when the result is returned, redirect the user to a new webpage with an updated message (and both tokens, first and second; e.g. 'abc123' and 'def456')
step 5: etc. until the process finishes.
Again, that's the RESTXQ that was happening in my imagination, but I'm definitely just at the imagining phase with this, so please excuse me if I'm misconstruing or just thinking about things poorly! :)
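For what it's worth, that imagined flow could be sketched roughly like this in RESTXQ. This is only a sketch: the module URI, path, endpoint URL, and element names are all placeholders, not tested code.

```xquery
module namespace oai = 'urn:example:oai';

(: Each call performs one request; if a resumptionToken is present,
   redirect back to this same endpoint with the new token, so the
   browser shows visible progress between steps. :)
declare
  %rest:path('/oai/step')
  %rest:query-param('token', '{$token}')
function oai:step($token as xs:string?) {
  let $url := 'https://example.org/oai?verb=ListRecords' ||
    (if ($token)
     then '&amp;resumptionToken=' || $token
     else '&amp;metadataPrefix=oai_dc')
  let $response := http:send-request(<http:request method='get'/>, $url)[2]
  let $next := $response//*:resumptionToken/text()
  return
    if ($next)
    then web:redirect('/oai/step', map { 'token': $next })
    else <done>Harvest finished.</done>
};
```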
I've attached a simple SSCCE, where the basic idea is: query an API for some data, and get a response like so:
You indicated that you are sending two requests. Is it the first one that’s slow? Does the first response create all input elements for the second request, or do you have twice the number of requests in total?
In my real world case, which again I hesitate to share, *all* requests are slow. In the meantime, maybe this new URL/endpoint might help illustrate. Using the following for $url and $verb (and apologies, my shell seems to mislike "&", hence the "&amp;" - that may need to change depending on your environment), the initial response (ending with 500:7603::) is very quick to the terminal, but the subsequent responses are built up and returned all at the same time.

$ basex -burl="http://dpla.lib.utk.edu/repox/OAIHandler" -bverb="?verb=ListRecords&metadataPrefix=MODS&set=utk_roth" quick-example.xq
1637777229247:utk_roth:MODS:500:7603::   (this is returned very quickly)
1637777232215:utk_roth:MODS:1000:7603::
1637777235461:utk_roth:MODS:1500:7603::
1637777238529:utk_roth:MODS:2000:7603::
1637777241271:utk_roth:MODS:2500:7603::
1637777243814:utk_roth:MODS:3000:7603::
1637777246607:utk_roth:MODS:3500:7603::
1637777249193:utk_roth:MODS:4000:7603::
1637777251921:utk_roth:MODS:4500:7603::
1637777254893:utk_roth:MODS:5000:7603::
1637777257666:utk_roth:MODS:5500:7603::
1637777260401:utk_roth:MODS:6000:7603::
1637777263461:utk_roth:MODS:6500:7603::
1637777266368:utk_roth:MODS:7000:7603::
1637777268823:utk_roth:MODS:7500:7603::
This is also shown by the serialization times, maybe:

$ ls -l /tmp/*.xml
-rw-r--r-- 1 bridger bridger 1809111 Nov 24 13:09 /tmp/2021-11-24T18:07:08Z.xml
-rw-r--r-- 1 bridger bridger 1797940 Nov 24 13:10 /tmp/2021-11-24T18:07:11Z.xml
-rw-r--r-- 1 bridger bridger 1800314 Nov 24 13:10 /tmp/2021-11-24T18:07:14Z.xml
-rw-r--r-- 1 bridger bridger 1808724 Nov 24 13:10 /tmp/2021-11-24T18:07:17Z.xml
-rw-r--r-- 1 bridger bridger 1813505 Nov 24 13:10 /tmp/2021-11-24T18:07:20Z.xml
-rw-r--r-- 1 bridger bridger 1804882 Nov 24 13:10 /tmp/2021-11-24T18:07:22Z.xml
-rw-r--r-- 1 bridger bridger 1808811 Nov 24 13:10 /tmp/2021-11-24T18:07:25Z.xml
-rw-r--r-- 1 bridger bridger 1814575 Nov 24 13:10 /tmp/2021-11-24T18:07:28Z.xml
-rw-r--r-- 1 bridger bridger 1807538 Nov 24 13:10 /tmp/2021-11-24T18:07:31Z.xml
-rw-r--r-- 1 bridger bridger 1802458 Nov 24 13:10 /tmp/2021-11-24T18:07:34Z.xml
-rw-r--r-- 1 bridger bridger 1801862 Nov 24 13:10 /tmp/2021-11-24T18:07:36Z.xml
-rw-r--r-- 1 bridger bridger 1817766 Nov 24 13:10 /tmp/2021-11-24T18:07:39Z.xml
-rw-r--r-- 1 bridger bridger 1803580 Nov 24 13:10 /tmp/2021-11-24T18:07:42Z.xml
-rw-r--r-- 1 bridger bridger 1808175 Nov 24 13:10 /tmp/2021-11-24T18:07:45Z.xml
-rw-r--r-- 1 bridger bridger 1798792 Nov 24 13:10 /tmp/2021-11-24T18:07:47Z.xml
-rw-r--r-- 1 bridger bridger  371814 Nov 24 13:10 /tmp/2021-11-24T18:07:50Z.xml
It very well may be that I'm simply asking if there's a way to pull some "procedural-ness" out of a functional paradigm and the answer is naturally, "no, sorry."
Hope this helps, Christian
Always, yes. Thanks so much for the response and giving me a space to talk through this issue.
Best, Bridger
[1] https://docs.basex.org/wiki/XQuery_Module#xquery:fork-join
[2] https://docs.basex.org/wiki/Transaction_Management
[3] https://docs.basex.org/wiki/Database_Server
Hi Bridger,
As always, thanks for your time and help.
You are welcome!
In my real world case, which again I hesitate to share, *all* requests are slow. In the meantime, maybe this new URL/endpoint might help illustrate.
It does indeed; it takes around a minute for this query to be fully evaluated.
As you’ve already indicated, we cannot run the queries in parallel, as each request requires the token from the previous request. This means that we cannot really speed up the query, but we can at least write partial results to disk once they are available. Would this already help you?
I’ve attached an updated version of your query, feel free to check it out and share your experiences with us…
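For readers following along on the list, one possible shape for such a revision is sketched below. This is an assumption, not the actual attachment: the URL and element names are placeholders, and prof:current-ms() is used so that each file gets a distinct name within a single query (fn:current-dateTime() is stable throughout one query evaluation).

```xquery
declare function local:harvest($url as xs:string) {
  let $response := http:send-request(<http:request method='get'/>, $url)[2]
  return (
    (: write this page to disk immediately, before the next request :)
    file:write(file:temp-dir() || prof:current-ms() || '.xml', $response),
    (: then follow the resumption token, if there is one :)
    let $token := $response//*:resumptionToken/text()
    where $token
    return local:harvest(
      'https://example.org/oai?verb=ListRecords&amp;resumptionToken=' || $token
    )
  )
};

local:harvest('https://example.org/oai?verb=ListRecords&amp;metadataPrefix=oai_dc')
```

This way, partial results appear on disk as each response arrives, rather than all at once at the end.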
All the best, Christian