Case example:
A publication is defined by a tree structure that references a number of other files, which in turn reference further files. To create an aggregate to transform with FO and produce a PDF, we need to open all the files and merge them. During the merge we also query many small variables stored in different files (maybe a dozen per file referenced by the main tree). For example, to look up the official value of a variable for a product code in a specific language, I run:

db:open('resources')/*[@id = 'model-definitions']/descendant::*[@id = $desired-model]/*[@xml:lang = 'zw-th']/node()
If I could write

db:node-id('resources', 'model-definition#' || $desired-model)/*[@xml:lang = 'zw-th']/node()

and leverage the fact that this information is natively indexed, I do believe it would be faster. I am currently working hard on performance: aggregating our longest publication used to take 7 minutes (for one language, so multiply by 55 for all languages), and it now takes a bit less than 2 minutes. I am aiming for 30 seconds or less, so yes, database and node access by id that is a few hundred times faster would have an impact.
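For what it's worth, something close to this can already be expressed with the existing index functions; a minimal sketch, assuming BaseX 9.x (where db:attribute taps the attribute index directly; check your version's Database Module docs, as these functions have moved around between major releases):

```xquery
(: jump straight to @id attributes matching $desired-model via the
   attribute index, then navigate from the owning element :)
db:attribute('resources', $desired-model, 'id')
  /parent::*[ancestor-or-self::*[@id = 'model-definitions']]
  /*[@xml:lang = 'zw-th']/node()
```

Note also that the optimizer can often rewrite a plain `//*[@id = $desired-model]` step to an attribute-index access on its own; the query info output shows whether an index was applied, which is worth checking before hand-tuning.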
P.S. I still have not built custom indices... you may get questions about that in future emails.
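In the meantime, a minimal sketch of what I understand the setup step to be, assuming the database is named 'resources' as above and using db:optimize from the Database Module (option names are my reading of the docs):

```xquery
(: rebuild 'resources' with the attribute index enabled, so that
   @id lookups can be answered from the index :)
db:optimize('resources', true(), map {
  'attrindex': true(),
  'updindex' : true()   (: keep the index current across updates :)
})
```

Afterwards, db:info('resources') should report which indexes exist and are up to date.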