Hi Christian,
Thanks for the recommendation. I had actually started thinking about a proxy between the two servers that would perform data compression. My experiments with generic archivers on my typical XML datasets showed the best results with 7-Zip, both in terms of compression ratio and performance. EXI would outperform 7-Zip to some extent, but not hugely (1.6% vs. 2.4% on a 94 MB file). In any case, it is a good idea to look at the setup end to end and to find and resolve any bottlenecks.
Thanks again! Cheers, Mikhail
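A comparison like the one described above can be reproduced roughly with the Python standard library. This is only a minimal sketch: the sample data is synthetic, the element names and helper functions (`make_sample_xml`, `compare`) are made up for illustration, and `lzma` here stands in for the LZMA family that 7-Zip uses rather than for the 7z container itself. EXI is not covered because it needs a third-party encoder.

```python
import gzip
import lzma


def make_sample_xml(n_records: int = 2000) -> bytes:
    """Build a synthetic, repetitive XML document (hypothetical structure)."""
    rows = "".join(
        f'<return entity="E{i}"><income>{i * 37 % 1000}</income>'
        f"<tax>{i * 13 % 500}</tax></return>"
        for i in range(n_records)
    )
    return f"<returns>{rows}</returns>".encode("utf-8")


def compression_ratio(raw: bytes, compressed: bytes) -> float:
    """Compressed size as a fraction of the original size (lower is better)."""
    return len(compressed) / len(raw)


def compare(raw: bytes) -> dict:
    """Compress the same payload with gzip and LZMA and report both ratios."""
    return {
        "gzip": compression_ratio(raw, gzip.compress(raw)),
        "lzma": compression_ratio(raw, lzma.compress(raw)),
    }


if __name__ == "__main__":
    sample = make_sample_xml()
    for name, ratio in compare(sample).items():
        print(f"{name}: {ratio:.2%} of original size")
```

Running the same functions on a real export instead of the synthetic sample gives a quick feel for how much a compressing proxy could save before the transfer.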
Wednesday, 5 May 2021, 13:03 +03:00, from Christian Grün christian.gruen@gmail.com:
Hi Mikhail (cc to the list),
If your queries generate hundreds of megabytes, you could check whether the full result is actually required. If it is, you may need to check which step is responsible for making you wait: is it the pure transmission over the network, the evaluation time, or the client that processes the query result?
If it’s the data traffic, and if you use Jetty as BaseX web server, you could enable GZIP compression [1]. If you use another web server, or if your architecture is more complex, you could install a proxy in between that takes care of compressing the transferred data.
Best, Christian
[1] https://docs.basex.org/wiki/Options#GZIP
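On the client side, compressed transfer normally only happens when the request advertises support for it via the standard `Accept-Encoding` header. The following is a minimal sketch of such a client, assuming the server has GZIP enabled as described in [1]. The URL in the usage note is hypothetical, and `build_request`, `decode_body` and `fetch` are illustrative names, not part of BaseX.

```python
import gzip
import urllib.request
from typing import Optional


def build_request(url: str) -> urllib.request.Request:
    """Create a GET request that tells the server gzip responses are accepted."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Decompress the body only if the server actually gzip-encoded it."""
    if content_encoding and content_encoding.lower() == "gzip":
        return gzip.decompress(body)
    return body


def fetch(url: str) -> bytes:
    """Fetch a resource and return the decoded (uncompressed) payload."""
    with urllib.request.urlopen(build_request(url)) as response:
        return decode_body(response.read(), response.headers.get("Content-Encoding"))
```

A call such as `fetch("http://localhost:8984/rest/mydb")` (hypothetical host, port and database name) would then return the XML result as bytes. Comparing the transfer time with and without the `Accept-Encoding` header is one way to tell whether network traffic is really the bottleneck.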
On Wed, May 5, 2021 at 7:17 AM Mikhail Kuznetsov < mikekuznetsov@mail.ru > wrote:
Hi, Christian.
I am trying to build a central server for processing multiple XQuery tasks. In some cases the output will be used by a BI system for visualization. For example:
- the BaseX server will accumulate a lot of "raw" XML files collected from multiple entities, say their tax returns;
- it is great that I have the flexibility to write new queries and adapt existing ones without modifying the underlying data (which would be the case if I had chosen an SQL database for storage);
- a Power BI server connects to BaseX through the RESTful API and retrieves two kinds of data: individual data by entity, and benchmark data based on the average.
Some queries may result in hundreds of MB of XML data. My BaseX server is hosted in AWS, while the Power BI server is located internally. The communication takes too much time, which is a show stopper for me. I am looking for ways to optimize the data transfer. The BaseX server itself is perfect, many thanks for the great job.
Regards, Mikhail
Wednesday, 5 May 2021, 2:45 +03:00, from Christian Grün < christian.gruen@gmail.com >:
Hi Mikhail,
Hi there. I have recently been looking at ways to improve the efficiency of my XQuery server and have come across the Efficient XML Interchange (EXI) format. Is there a way that BaseX could support encoding/decoding for this format?
The internal storage of XML documents in BaseX differs quite a lot from the original and verbose textual representation (for which EXI provides improvements).
What kind of performance penalties did you encounter so far?
Best, Christian
On Tue, May 4, 2021 at 10:34 PM Mikhail Kuznetsov < mikekuznetsov@mail.ru > wrote:
Thanks. Mikhail
-- Mikhail Kuznetsov