Hi all, I'm attempting to query a fairly large database, with 136,938 resources and a size of 12,257,686,099. The basex server itself is an AWS EC2 instance with 4 cores & 16 gigs of ram, using -Xmx12g. The database contains audio DDEX information which (to me at least) is fairly complicated XML. The query I'm running against the database works and gives me the result I want, but takes upwards of 15 minutes to execute. I've taken a look at the wiki documentation regarding indexes, but it's not clear to me that any of the non-default indexes would help for the query I'm running. That query is:
declare namespace ernm="http://ddex.net/xml/ern/411";
for $r in /ernm:NewReleaseMessage for $track_release in $r/ReleaseList/TrackRelease for $party in $r/PartyList/Party for $sound_recording in $r/ResourceList/SoundRecording for $release in $r/ReleaseList/Release where $track_release/ReleaseLabelReference = $party/PartyReference and $track_release/ReleaseResourceReference = $sound_recording/ResourceReference and $track_release/ReleaseResourceReference = $release/ResourceGroup/ResourceGroup/ResourceGroupContentItem/ReleaseResourceReference return <identity> <isrc>{ $track_release/ReleaseId/ISRC/text() }</isrc> <artist>{ fn:string-join($sound_recording/DisplayArtistName, '/') }</artist> <title>{ $sound_recording/DisplayTitleText/text() }</title> <album>{ $release/DisplayTitleText/text() }</album> <icpn>{ $release/ReleaseId/ICPN/text() }</icpn> <sublabel>{ $party/PartyName/FullName/text() }</sublabel> </identity>
Am I wrong, and would an additional value index help here? Or is my query just bad?
Thanks, Bill
p.s. The BaseX GUI client is awesome - especially the treemap view. Really helps with wrangling these XML files!