Greetings,
First of all, thank you very much for BaseX. It has made many of my assignments this semester doable, more enjoyable, and better. Using it, I could demonstrate database features at scale where only toy minimal examples were required.
The source of most of my data has been NIST's National Vulnerability Database https://nvd.nist.gov/vuln/data-feeds and the MITRE-curated Common Weakness Enumeration lists https://cwe.mitre.org/data/downloads.html. Using BaseX, I found some errors in the CVE database and reported them to NIST; they have since been fixed. I'm sure many more fixes and enhancements to the CVE database are possible.
I almost exclusively use BaseX/GUI (version 9.0.2, then 9.1, and now 9.1.1) on Fedora Linux. I do have some issues using BaseX/GUI and I am hoping that some improvements can be made.
Eventually, BaseX/GUI uses all of its allocated memory. Even after I increased the maximum memory to 3.5 GB, all of it is eventually used and BaseX/GUI essentially freezes. Any operation that freezes runs quickly and to completion after I quit or kill BaseX/GUI and restart it. It seems that some memory never gets freed as I develop different XQuery routines in the Editor, run them, save them to files, click in various places on the Map Visualization, run XQuery from the Input Bar, and so on. Closing the database sometimes frees the memory, but mostly it doesn't. Once, I think memory was freed when I saved a file in the Editor. The Java error messages all seem to relate to running out of memory. Hitting the "GC" button never seems to help. I don't have a specific sequence of actions that eventually consumes all the memory.
An example database that demonstrates this memory consumption is composed of NVD-CVE-1.0-2018 https://nvd.nist.gov/feeds/json/cve/1.0/nvdcve-1.0-2018.json.zip and CWE Comprehensive View https://cwe.mitre.org/data/xml/views/2000.xml.zip. The database takes about 109 MB and has about 5.25 million nodes. Just viewing the Map Visualization, clicking around, and running or editing queries like the following will eventually use all the memory.
let $cwe := distinct-values(
  for $c in //cve//problemtype__data//value
  return tokenize($c, "-")[last()]
)
for $c in $cwe
where empty(//Weakness_Catalog[contains(@Name, "CWE-2000")]//Weakness[@ID = $c])
  and empty(//Weakness_Catalog[contains(@Name, "CWE-2000")]//Related_Weakness[@CWE_ID = $c])
order by number($c)
return $c
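For comparison, here is a minimal sketch of what should be an equivalent query that looks up the CWE-2000 catalog once and collects the known IDs up front, instead of repeating both path expressions for every candidate value. The variable names $catalog and $known are illustrative additions, not part of the query above, and whether this form changes the memory behavior described here is untested.

```xquery
(: Bind the catalog once; $catalog and $known are illustrative names. :)
let $catalog := //Weakness_Catalog[contains(@Name, "CWE-2000")]

(: All weakness IDs that the catalog defines or refers to. :)
let $known := distinct-values((
  $catalog//Weakness/@ID,
  $catalog//Related_Weakness/@CWE_ID
))

(: Trailing token of each CVE problem-type value, e.g. "79" from "CWE-79". :)
let $cwe := distinct-values(
  for $c in //cve//problemtype__data//value
  return tokenize($c, "-")[last()]
)

(: Keep only the values that match no known ID. :)
for $c in $cwe
where not($c = $known)
order by number($c)
return $c
```

The general comparison `$c = $known` is true if $c equals any item in $known, so `not(...)` keeps exactly the values for which both `empty(...)` tests in the query above would hold.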
`java -version` reports:
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
I will gladly provide any additional info that may help to diagnose these symptoms, etc.
Thanks and best regards. RG