When I run this using Saxon (substituting fn:doc for fn:collection), it only takes a second to execute. In BaseX 8.2.3 (using the GUI), however, it takes a very long time to execute (around 6 minutes!). The BaseX database has both attribute and text indexes enabled.
Any idea what is causing it to take so long?
Compiling:
- pre-evaluating collection("latin_hold_20150730")
- pre-evaluating collection("latin_hold_20150730")
Query:
declare namespace marc="
http://www.loc.gov/MARC21/slim"; for $m in collection( "latin_hold_20150730" )/marc:collection/marc:record, $r in collection( "latin_hold_20150730" )/root/row[BIB_ID = $m/marc:controlfield[@tag = '001']] group by $key := $r/ITEM_ID return <test n="{ $key }">{ $m/marc:datafield[@tag = '245']/marc:subfield[@code = 'a']/string() }</test>
Optimized Query:
for $m_0 in (db:open-pre("latin_hold_20150730",0), ...)/marc:collection/marc:record for $r_1 in (db:open-pre("latin_hold_20150730",0), ...)/root/row[(BIB_ID = $m_0/marc:controlfield[(@tag = "001")])] let (: post-group :) $m_4 := $m_0 group by $key_2 := $r_1/ITEM_ID return element test { (attribute n { ($key_2) }, $m_4/marc:datafield[(@tag = "245")]/marc:subfield[(@code = "a")]/string()) }
Result:
- Hit(s): 7587 Items
- Updated: 0 Items
- Printed: 505 KB
- Read Locking: local [latin_hold_20150730]
- Write Locking: none
Timing:
- Parsing: 0.43 ms
- Compiling: 1.96 ms
- Evaluating: 383737.5 ms
- Printing: 16.41 ms
- Total Time: 383756.3 ms
Query plan:
<QueryPlan compiled="true">
<GFLWOR>
<For>
<Var name="$m" id="0"/>
<IterPath>
<DBNodeSeq size="2">
<DBNode name="latin_hold_20150730" pre="0"/>
<DBNode name="latin_hold_20150730" pre="365692"/>
</DBNodeSeq>
<IterStep axis="child" test="marc:collection"/>
<IterStep axis="child" test="marc:record"/>
</IterPath>
</For>
<For>
<Var name="$r" id="1"/>
<IterPath>
<DBNodeSeq size="2">
<DBNode name="latin_hold_20150730" pre="0"/>
<DBNode name="latin_hold_20150730" pre="365692"/>
</DBNodeSeq>
<IterStep axis="child" test="root"/>
<IterStep axis="child" test="row">
<CmpG op="=">
<CachedPath>
<IterStep axis="child" test="BIB_ID"/>
</CachedPath>
<IterPath>
<VarRef>
<Var name="$m" id="0"/>
</VarRef>
<IterStep axis="child" test="marc:controlfield">
<CmpG op="=">
<CachedPath>
<IterStep axis="attribute" test="tag"/>
</CachedPath>
<Str value="001" type="xs:string"/>
</CmpG>
</IterStep>
</IterPath>
</CmpG>
</IterStep>
</IterPath>
</For>
<GroupBy>
<Spec>
<Var name="$key" id="2"/>
<IterPath>
<VarRef>
<Var name="$r" id="1"/>
</VarRef>
<IterStep axis="child" test="ITEM_ID"/>
</IterPath>
</Spec>
</GroupBy>
<CElem>
<QNm value="test" type="xs:QName"/>
<CAttr>
<QNm value="n" type="xs:QName"/>
<VarRef>
<Var name="$key" id="2"/>
</VarRef>
</CAttr>
<MixedPath>
<VarRef>
<Var name="$m" id="4"/>
</VarRef>
<IterStep axis="child" test="marc:datafield">
<CmpG op="=">
<CachedPath>
<IterStep axis="attribute" test="tag"/>
</CachedPath>
<Str value="245" type="xs:string"/>
</CmpG>
</IterStep>
<IterStep axis="child" test="marc:subfield">
<CmpG op="=">
<CachedPath>
<IterStep axis="attribute" test="code"/>
</CachedPath>
<Str value="a" type="xs:string"/>
</CmpG>
</IterStep>
<FnString name="string([item])"/>
</MixedPath>
</CElem>
</GFLWOR>
</QueryPlan>