Hi,
Correct me if I am wrong, but I believe the HTTP Client in BaseX is the EXPath HTTP Client? It was indeed designed to provide access to low-level, raw HTTP, and it does not contain many higher-level features built on top of HTTP itself. So yes, you have to handle cookies yourself, for instance.
The difficulty here, if I am right, is the side effects required to pass information (in a hidden way) between two different HTTP requests.
Any suggestion to improve the API is welcome, at least on the EXPath mailing list. I don't want to speak for the BaseX developers, but I am pretty sure the same holds here as well :-)
Regards,
--
Florent Georges
http://fgeorges.org/
http://h2oconsulting.be/
On 10 July 2015 at 11:13, Christian Grün wrote:
Hi Vincent,
So far, I'm not aware of a standard solution to handle and cache client-side cookies with BaseX. Could you show us your solution? It might help us to discuss alternative solutions.
Best, Christian
On Thu, Jul 9, 2015 at 8:30 PM, Lizzi, Vincent <Vincent.Lizzi@taylorandfrancis.com> wrote:
I am using BaseX to scrape data from a web site. This web site, probably like many others, relies on cookies, and if it does not receive the expected cookies it delivers a page instructing you to enable cookies in your browser. I was able to get this working by parsing the http:header elements of the response to get the cookies to use in subsequent requests. This is the second time I’ve done this, and even though it works, it seems a bit hacky. Is there a standard way of handling cookies using the HTTP Module or the Fetch Module? Or are there any well-written code examples available?
In other environments, you typically define a cookie jar in some way, and the cookie jar is then used (and updated) automatically in all subsequent HTTP requests. I’m hoping to find something similar in BaseX.
Thanks, Vincent
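The manual approach described in this thread (read the Set-Cookie headers of one response, then send them back in a Cookie header on the next request) can be sketched as a very small cookie jar. The following is only an illustration of that logic, written in Python rather than XQuery so that it can be run on its own. The class name `SimpleJar`, its methods, and the header values are hypothetical, and the sketch deliberately ignores cookie attributes such as domain, path, and expiry. In BaseX, the same steps would presumably be applied to the http:header elements of the response.

```python
class SimpleJar:
    """Tiny in-memory cookie store (illustrative only, not a full cookie jar)."""

    def __init__(self):
        # Maps cookie name -> cookie value.
        self.cookies = {}

    def update(self, response_headers):
        """Record cookies from a list of (name, value) response header pairs."""
        for name, value in response_headers:
            if name.lower() != "set-cookie":
                continue
            # Keep only the leading "name=value" pair; drop Path, HttpOnly, etc.
            pair = value.split(";", 1)[0].strip()
            if "=" in pair:
                key, val = pair.split("=", 1)
                self.cookies[key.strip()] = val.strip()

    def header(self):
        """Return the value for a Cookie request header ('' if the jar is empty)."""
        return "; ".join(f"{k}={v}" for k, v in self.cookies.items())


jar = SimpleJar()
# Headers as they might come back from a first request (made-up values).
jar.update([
    ("Content-Type", "text/html"),
    ("Set-Cookie", "sid=abc123; Path=/; HttpOnly"),
    ("Set-Cookie", "lang=en; Path=/"),
])
# Value to send as the Cookie header of the next request.
cookie_header = jar.header()
print(cookie_header)
```

Calling `jar.update(...)` after every response and `jar.header()` before every request is the part that a built-in cookie jar would otherwise do behind the scenes. This is the hidden state passed between two requests that Florent describes above.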