Thanks, Christian. Much appreciated advice. The only change needed was due to the fact that trials often have multiple ids of the same type (e.g., secondary_id) or empty ids (acronym is optional). Also, ids are sometimes actually lists of ids separated by commas (poor coding practice by sponsors). So the working code looks as follows:
    let $maps :=
      for $trial in db:open('CTGov')/clinical_study
      return (
        for $id in $trial/id_info/org_study_id
        for $t in tokenize($id, ',')
        return map:entry(functx:trim($t), $trial),
        for $id in $trial/id_info/secondary_id
        for $t in tokenize($id, ',')
        return map:entry(functx:trim($t), $trial),
        for $id in $trial/acronym
        for $t in tokenize($id, ',')
        return map:entry(functx:trim($t), $trial)
      )
The new code completes in 10 seconds, a major improvement!
Best, Ron
On December 30, 2015 at 1:23:44 PM, Christian Grün (christian.gruen@gmail.com) wrote:
A last hint: Performance is mostly a matter of how maps are implemented in a particular XQuery processor. You may be able to speed up your query if you iterate over all nodes only once:
    let $maps :=
      for $trial in db:open('CTGov')/clinical_study
      return (
        map:entry(string($trial/id_info/org_study_id), $trial),
        map:entry(string($trial/id_info/secondary_id), $trial),
        map:entry(string($trial/acronym), $trial)
      )
    return map:merge(
      for $map in $maps
      for $key in map:keys($map)
      group by $key
      let $value := ($map ! .($key))/self::node()
      return map { $key : $value }
    )
If map keys are converted to strings, lookup will be a bit faster. Next, the "/self::node()" step removes potential duplicate nodes (just skip it if the IDs assigned to a single "clinical_study" element are distinct).
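To illustrate the effect of the "/self::node()" step, here is a minimal sketch. The in-memory element and the key "A" are made-up placeholders, not data from the CTGov database; the same node is deliberately registered twice under one key:

```xquery
(: Hypothetical stand-in for a single trial; not taken from CTGov. :)
let $trial := <clinical_study><acronym>A</acronym></clinical_study>
(: The same node is entered twice under the same key. :)
let $maps := (map:entry("A", $trial), map:entry("A", $trial))
let $merged := map:merge(
  for $map in $maps
  for $key in map:keys($map)
  group by $key
  (: A path step returns nodes without duplicates (by node identity). :)
  let $value := ($map ! .($key))/self::node()
  return map { $key : $value }
)
return count($merged("A"))
```

With the step in place, the count should be 1; if "/self::node()" is dropped, the grouped value keeps both references and the count should be 2.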
On Wed, Dec 30, 2015 at 7:13 PM, Ron Katriel rkatriel@mdsol.com wrote:
Done (Bug 29353).
Thanks, Ron
On December 30, 2015 at 9:57:00 AM, Christian Grün (christian.gruen@gmail.com) wrote:
Any chance this could be made an option of map:merge? Doing it in Java would be faster and more elegant.
It will be too late for XQuery 3.1, but feel free to motivate such enhancements in the W3 Bug Tracker [1].
Christian
[1] https://www.w3.org/Bugs/Public/
Best, Ron
On December 29, 2015 at 3:12:28 AM, Christian Grün (christian.gruen@gmail.com) wrote:
Try this:
    let $maps := (
      map:entry(0, "red"),
      map:entry(1, "green"),
      map:entry(1, "blue")
    )
    return map:merge(
      for $map in $maps
      for $key in map:keys($map)
      group by $key
      return map { $key : $map ! .($key) }
    )
This is an equivalent, possibly more readable, solution:
    let $maps := (
      map:entry(0, "red"),
      map:entry(1, "green"),
      map:entry(1, "blue")
    )
    let $keys := distinct-values(
      for $map in $maps
      return map:keys($map)
    )
    return map:merge(
      for $key in $keys
      let $value :=
        for $map in $maps
        return $map($key)
      return map { $key : $value }
    )
On Wed, Dec 23, 2015 at 11:04 PM, Ron Katriel rkatriel@mdsol.com wrote:
Hi,
I am using map:merge to construct a map from smaller maps and would like to preserve values when keys agree. For example, when calling
    map:merge((map:entry(0, "red"), map:entry(1, "green"), map:entry(1, "blue")))
I would like to get back something like
    map { 0: "red", 1: ("green", "blue") }
The default (W3C) behavior is to drop "green" in favor of "blue".
Is there a simple way to accomplish this? I realize the above example is mixing types so presumably a solution would have all values as sets.
Thanks, Ron
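For reference, the published XQuery 3.1 Functions and Operators specification defines a two-argument form of map:merge whose options map accepts a "duplicates" entry. Whether it is available depends on the processor and version in use, so the following is a sketch rather than something tested against the BaseX release discussed in this thread:

```xquery
(: Assumes a processor that implements the two-argument map:merge :)
(: with the "duplicates" option from the final XQuery 3.1 F&O spec. :)
map:merge(
  (
    map:entry(0, "red"),
    map:entry(1, "green"),
    map:entry(1, "blue")
  ),
  (: "combine" concatenates the values of entries that share a key. :)
  map { "duplicates": "combine" }
)
```

Under that specification, key 1 should end up holding the sequence ("green", "blue"). The default for duplicate keys in the final specification may also differ from the behavior described in this thread, so it is worth checking the documentation of the processor at hand.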