In my experience, the case that causes the most problems is the authentication redirect. I have never tried this with BaseX, but I have been very grateful in the past that XMLCalabash implements this:
"The exception arises in the case of redirection. If a redirect response includes cookies, those cookies are forwarded as appropriate to the redirected location when the redirection is followed." [1]
/Andy
[1] http://xprocbook.com/book/refentry-19.html#cookies
On 10 July 2015 at 10:36, Florent Georges fgeorges@fgeorges.org wrote:
Hi,
Correct me if I am wrong, but I believe the HTTP Client in BaseX is the EXPath HTTP Client? It was indeed designed to provide access to low-level, raw HTTP. It does not contain many higher-level features built on top of HTTP itself. Indeed, you have to handle cookies yourself, for instance.
The difficulty here, if I am right, is the side effects required to pass information somehow (in a hidden way) between two different HTTP requests.
Any suggestion to improve the API is welcome (at least on the EXPath mailing list, I don't want to speak for BaseX developers, but I am pretty sure here as well :-)...)
Regards,
-- Florent Georges http://fgeorges.org/ http://h2oconsulting.be/
On 10 July 2015 at 11:13, Christian Grün wrote:
Hi Vincent,
So far, I'm not aware of a standard solution to handle and cache client-side cookies with BaseX. Could you show us your solution? It might help us to discuss alternative solutions.
Best, Christian
On Thu, Jul 9, 2015 at 8:30 PM, Lizzi, Vincent Vincent.Lizzi@taylorandfrancis.com wrote:
I am using BaseX to scrape data from a web site. This web site, probably like many other web sites, relies on cookies, and if it does not receive the expected cookies it delivers a page instructing you to enable cookies in your browser. I was able to get this working by parsing the http:header response to get the cookies to use in subsequent requests. This is the second time I’ve done this, and even though this works it seems a bit hacky.
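To make the manual approach concrete, here is a minimal sketch in Python rather than XQuery, so it is not the BaseX code described above. It assumes the Set-Cookie header values have already been pulled out of the response as plain strings. The function name and the sample cookie values are hypothetical.

```python
from http.cookies import SimpleCookie


def cookie_header_from_response(set_cookie_values):
    """Build a Cookie request header value from Set-Cookie response values.

    set_cookie_values: list of raw Set-Cookie header strings (assumed input).
    Attributes such as Path or HttpOnly are dropped; only name=value pairs
    are kept, because that is all a Cookie request header carries.
    """
    pairs = {}
    for raw in set_cookie_values:
        parsed = SimpleCookie()
        parsed.load(raw)
        for name, morsel in parsed.items():
            # A later cookie with the same name overrides an earlier one.
            pairs[name] = morsel.value
    return "; ".join(f"{name}={value}" for name, value in pairs.items())


# Hypothetical header values, for illustration only.
header = cookie_header_from_response([
    "JSESSIONID=abc123; Path=/; HttpOnly",
    "lang=en; Path=/",
])
print(header)
```

The resulting string would be sent as the Cookie header of the next request. This sketch ignores domain, path, and expiry matching, which a real cookie jar has to handle.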
Is there a standard way of handling cookies using the HTTP Module or the Fetch module? Or, are there any well written code examples available?
In other environments you typically define a cookie jar in some way, and the cookie jar is used (and is updated) automatically in all subsequent HTTP requests. I’m hoping to find something similar in BaseX.
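As an illustration of the cookie jar pattern in another environment, here is a small sketch using the Python standard library. It is only meant to show the kind of behaviour I am looking for, not anything BaseX provides. The helper name make_session is made up for this example.

```python
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return a cookie jar and an opener that reads from and writes to it."""
    jar = CookieJar()
    # HTTPCookieProcessor stores cookies from responses in the jar and
    # attaches matching cookies to every later request made via the opener.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return jar, opener


jar, opener = make_session()
# Every request made with opener.open(...) would now share the same jar;
# no request is made here, so the jar starts out empty.
print(len(jar))
```

The point of the design is that the caller never touches the headers: state lives in the jar, and each request through the opener updates it as a side effect.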
Thanks, Vincent