Hi Christian,
On 1/15/19 12:43 PM, Christian Grün wrote:
> What are your experiences with using a single thread? If memory consumption is too exhaustive, you could play with the window clause of the FLWOR expression [2,3]. It takes some time to explore the full magic of this XQuery 3.0 extension (the syntax is somewhat verbose), but it’s often a good alternative to complex functional code.
Using a single thread looks OK too, at about 10k lines per second, and I'm not sure that reading the same file with 16 threads (on an SSD) is the way to go from an I/O point of view. Searching Stack Overflow turns up many suggestions on how to read a file with one or multiple threads, e.g. [1].
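As a rough sketch of the single-threaded approach (the file path, the tab-separated layout and the extraction step are placeholders for illustration, not the actual code), a plain loop over the lines could look like this:

```xquery
(: Sketch only: $path and the tab-separated layout are assumptions. :)
declare variable $path as xs:string := 'data.tsv';

(: Read the file line by line in a single thread. :)
for $line in file:read-text-lines($path)
(: Return only a small string per line, here the first tab-separated field. :)
return tokenize($line, '\t')[1]
```

file:read-text-lines comes from the File Module, so no extra namespace declaration should be needed in BaseX.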
I immediately return the data I need for each line (a small string, for example), so memory consumption is low: I have provided 12 GB, but I never see more than 2-3 GB in use. My initial thought was that garbage collection might be causing delays, but after profiling BaseX I don't think this is an issue.

It's interesting to know about the window clause, though; I will certainly find a use for it. While I know most of these functions exist, I can always learn much more about a language. Only yesterday I managed to use fork-join successfully, and I think it will save me a lot of time and effort for my use cases.

I will post again if I have any updates. Thanks again,
George.
[1]: https://stackoverflow.com/questions/40412008/how-to-read-a-file-using-multip...