In our projects, though, we use https://jsoup.org/ and it works well; it's also very easy to use.
Interesting. Is it possible to use it for converting HTML to XML?
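If I remember right, yes: jsoup parses tag soup into a complete document tree and can serialize that tree with XML output syntax. A minimal, untested sketch (the class name and input string are invented):

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities;

public class HtmlToXml {
  public static void main(String[] args) {
    // jsoup tolerates broken markup and still builds a full tree
    Document doc = Jsoup.parse("<p>Unclosed paragraph<br>with a break");
    // switch the serializer to XML syntax (self-closed void elements etc.)
    doc.outputSettings()
       .syntax(Document.OutputSettings.Syntax.xml)
       .escapeMode(Entities.EscapeMode.xhtml);
    System.out.println(doc.html());
  }
}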
I'm talking about calls that happen via XQuery's doc("http://randomhost.rn/random.xml").
I see. So it probably sends request headers like "Accept-Encoding: x-compress; x-zip" to the server and unzips the result, is that right?
Maybe we could implement something similar in BaseX without an additional library, at least for (g)zipped streams (I still try to keep the BaseX distribution as small as possible...). There is already an existing issue for that [1]. I don't know much about HTTP caching yet, though.
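Something along these lines should already work with plain JDK classes (just a quick sketch, not what BaseX currently does; the class and method names are invented):

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

public final class CompressedFetch {
  // opens a URL and transparently decompresses gzip/deflate responses
  public static InputStream open(final String url) throws IOException {
    final HttpURLConnection conn =
        (HttpURLConnection) new URL(url).openConnection();
    // tell the server we accept compressed content
    conn.setRequestProperty("Accept-Encoding", "gzip, deflate");
    final InputStream in = conn.getInputStream();
    final String enc = conn.getContentEncoding();
    if("gzip".equalsIgnoreCase(enc)) return new GZIPInputStream(in);
    // "deflate" is zlib-wrapped per the HTTP spec, which is what
    // InflaterInputStream expects (some servers send raw deflate, though)
    if("deflate".equalsIgnoreCase(enc)) return new InflaterInputStream(in);
    return in;
  }
}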
Cheers, Christian
[1] https://github.com/BaseXdb/basex/issues/1381
I'm not sure whether they request gzipped files; I think I've tested it once and it didn't. For example, fetching a 233 MB XML file with gzip compression only transfers 27.8 MB, roughly an 8x reduction (this is one random file; the compression ratio varies between XML files). We work with files that can exceed 1 GB, so it can make a real difference in bandwidth and execution (compilation) time.
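For what it's worth, it's easy to check what a server actually sends back; a throwaway sketch using the JDK 11 HttpClient (the URL is just the placeholder from above):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CompressionCheck {
  public static void main(String[] args) throws Exception {
    // the JDK client sends no Accept-Encoding header on its own,
    // so ask for gzip explicitly
    HttpRequest req = HttpRequest
        .newBuilder(URI.create("http://randomhost.rn/random.xml"))
        .header("Accept-Encoding", "gzip")
        .build();
    HttpResponse<byte[]> res = HttpClient.newHttpClient()
        .send(req, HttpResponse.BodyHandlers.ofByteArray());
    // the client does not decompress, so the body length
    // is the number of bytes actually transferred
    System.out.println("Content-Encoding: "
        + res.headers().firstValue("Content-Encoding").orElse("(none)"));
    System.out.println("Bytes on the wire: " + res.body().length);
  }
}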