In my application I load numerous Oxygen validation reports, each consisting of many <incident> elements, where each incident is associated with a specific input document, captured in a <systemID> element:
<incident xmlns="http://www.oxygenxml.com/ns/report">
  <engine>oXygen</engine>
  <severity>error</severity>
  <description>Cannot find definition for key "image.install-store-app-menu-path". Key Scopes:[[bundle-itsm], [reuse], [reuse]]. Keys are gathered from: bundle-itsm-it-service-management.ditamap.</description>
  <systemID>/Users/eliot.kimber/git-basex/utah/doc/source/reuse/activation/install-store-apps-steps.dita</systemID>
  <profile>suite-prod</profile>
  <type>Key reference</type>
  <location>
    <start>
      <line>33</line>
      <column>26</column>
    </start>
    <end>
      <line>33</line>
      <column>61</column>
    </end>
    <length>35</length>
  </location>
</incident>
I need to correlate these incidents to the docs as stored in the database, where the match is just on the filename, not any part of the path (although I could match on the part of the path starting with “doc/source”).
In my testing, with about 27K of these incident elements, it takes about 60 seconds to build a set of maps where the keys are the filenames and the values are the sequences of <incident> elements that match a given filename, i.e.:
let $incidentsByDoc as map(*) :=
  map:merge(
    let $docNames as xs:string* := $incidents/report:systemID ! string(.) ! relpath:getName(.) => distinct-values()
    return
      for $docName in $docNames
      return
        map{
          $docName : $incidents[report:systemID/text() contains text { '/' || $docName}]
        }
  )
I think the only optimization is to save the resulting map as an XML file or (with BaseX 10) just save the map for later use.
But I’m curious if there’s some other XPath-level optimization that would make this lookup faster?
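For illustration, one query-level alternative might be to group the incidents in a single pass with "group by" instead of filtering the full incident sequence once per filename. The following is only a sketch under assumptions: the sample incidents and their paths are made up, and tokenize(...)[last()] is a stand-in for relpath:getName(), which may behave differently.

```xquery
declare namespace report = "http://www.oxygenxml.com/ns/report";

(: Hypothetical sample data standing in for the loaded report incidents :)
declare variable $incidents as element(report:incident)* := (
  <incident xmlns="http://www.oxygenxml.com/ns/report"><systemID>/x/doc/source/reuse/a.dita</systemID></incident>,
  <incident xmlns="http://www.oxygenxml.com/ns/report"><systemID>/x/doc/source/tasks/b.dita</systemID></incident>,
  <incident xmlns="http://www.oxygenxml.com/ns/report"><systemID>/y/doc/source/reuse/a.dita</systemID></incident>
);

let $incidentsByDoc as map(*) :=
  map:merge(
    (: Each incident is visited once; the key is the last path segment.
       tokenize() is an assumed substitute for relpath:getName(). :)
    for $incident in $incidents
    group by $docName := tokenize($incident/report:systemID, '/')[last()]
    (: After grouping, $incident holds all incidents for this filename :)
    return map { $docName : $incident }
  )
(: Report the number of incidents collected per filename :)
return
  for $key in map:keys($incidentsByDoc)
  return $key || ': ' || count($incidentsByDoc($key))
```

With this shape, the work should grow roughly with the number of incidents rather than with the number of incidents times the number of distinct filenames, though I have not measured it against the real data.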
I’m already using the full-text index via “contains text”, although I suspect that it’s not really offering an advantage over a simple ends-with(.) check.
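As a point of comparison, a plain ends-with() predicate might look like the sketch below. The sample paths and the filename are made up for illustration. Because "contains text" tokenizes its operands, it may also match a name such as more-steps.dita when looking for steps.dita, whereas the string comparison here requires the exact "/" plus filename suffix.

```xquery
declare namespace report = "http://www.oxygenxml.com/ns/report";

(: Hypothetical sample data; the paths are made up for illustration :)
declare variable $incidents as element(report:incident)* := (
  <incident xmlns="http://www.oxygenxml.com/ns/report"><systemID>/x/doc/source/reuse/steps.dita</systemID></incident>,
  <incident xmlns="http://www.oxygenxml.com/ns/report"><systemID>/x/doc/source/reuse/more-steps.dita</systemID></incident>
);

let $docName := 'steps.dita'
(: Only system IDs ending in exactly "/steps.dita" are selected,
   so the more-steps.dita incident is excluded :)
return $incidents[ends-with(report:systemID, '/' || $docName)]/report:systemID/string()
```

This still scans every incident for each filename, so on its own it would not be expected to change the overall cost of building the map.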
Thanks,
Eliot
_____________________________________________
Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368