Hi Christian,
On 18.11.2012 at 17:07, Christian Grün wrote:
It looks as if the query plan is still based on the nested predicates. Have you checked whether the simplified form leads to the usage of index structures (provided that you have up-to-date index structures at this stage)?
I think it does.
One more thing I noticed: ".//X[1]" is often expensive. It is the same as "./descendant-or-self::node()/child::X[1]", which yields a large number of intermediate results. If you don't need the first "X" child element of every descendant-or-self node, but rather the first descendant "X" element, I would suggest rewriting the query to one of these two versions:
descendant::X[1] or (.//X)[1]
This cannot be done by the optimizer itself, because ".//X[1]" and "./descendant::X[1]" are not equivalent.
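To make the difference concrete, here is a minimal sketch on a made-up document (the element names "root", "a" and "b" are hypothetical and not taken from the corpus). It only counts the results of the three forms discussed above:

```xquery
(: Hypothetical sample: two parents, each with its own X children. :)
let $doc :=
  <root>
    <a><X>1</X><X>2</X></a>
    <b><X>3</X></b>
  </root>
return (
  (: first X child of EVERY node below $doc: X(1) and X(3) :)
  count($doc//X[1]),            (: 2 :)
  (: first X on the descendant axis: X(1) only :)
  count($doc/descendant::X[1]), (: 1 :)
  (: first item of the whole result sequence: X(1) only :)
  count(($doc//X)[1])           (: 1 :)
)
```

The first form returns one node per parent that has an "X" child, which is why it is not interchangeable with the other two.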
I'm not quite sure whether I rewrote it correctly.
The respective lines are:
let $title := db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI[1]//*:fileDesc[1]//*:titleStmt[1]//*:title[1]
let $author := db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI[1]//*:sourceDesc[1]//*:bibl[1]//*:author[1]
let $note := db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI[1]//*:notesStmt//*:note
So for the title, I go up to the nearest "TEI" ancestor (/ancestor::*:TEI[1]). From there I travel down to the first "fileDesc" element, within it to the first "titleStmt" element, and finally I take the first "title" element.
When I rewrite this into:
let $title := (db:open-id('TG-DTA-GerManC-stemming-ws', $node) /ancestor::*:TEI[1]//*:fileDesc)[1]//*:titleStmt[1]//*:title[1]
The results still look OK and it is faster.
But trying this:
let $title := (db:open-id('TG-DTA-GerManC-stemming-ws', $node) /ancestor::*:TEI[1]//*:fileDesc[1]//*:titleStmt)[1]//*:title[1]
the respective information is not retrieved.
I don't know exactly where to put the brackets.
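For illustration, a minimal sketch of what the bracket placement changes, using a made-up and heavily simplified TEI-like fragment (the sample content is hypothetical; only the element names mirror the query above). A predicate attached to a step applies per parent node, whereas a predicate after a parenthesized path applies to the whole result sequence:

```xquery
(: Hypothetical sample with "title" elements under two different parents. :)
let $tei :=
  <TEI>
    <fileDesc>
      <titleStmt>
        <title>Main title</title>
        <title>Subtitle</title>
      </titleStmt>
      <sourceDesc>
        <bibl><title>Source title</title></bibl>
      </sourceDesc>
    </fileDesc>
  </TEI>
return (
  (: predicate on the step: first "title" child of every parent :)
  count(($tei//*:fileDesc)[1]//*:title[1]),   (: 2 :)
  (: predicate on the parenthesized path: first "title" overall :)
  count((($tei//*:fileDesc)[1]//*:title)[1])  (: 1 :)
)
```

Under this reading, wrapping each level in its own pair of parentheses, e.g. ((($tei//*:fileDesc)[1]//*:titleStmt)[1]//*:title)[1], would select exactly one node per level. Whether that matches the intended result on the real corpus is an assumption that would need checking.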
However, I also tried this:
let $tei := (db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI)[1]
let $title := ($tei//*:fileDesc)[1]//*:titleStmt[1]//*:title[1]
let $author := ($tei//*:sourceDesc)[1]//*:bibl[1]//*:author[1]
let $note := $tei//*:notesStmt//*:note
I put the whole TEI node into a variable and then use it to retrieve the needed information. I'm not sure whether this is generally faster than opening the second DB three times. The first and the second solution seem to be equivalent in terms of overall performance.
The query info for the first one and the second one is attached below:
Query:
for $i at $p in //entry[phraseme/text() = "Ad0194" and selected/text() = "yes"]
let $query := $i/query
let $node := $i/node
let $prefix := fn:in-scope-prefixes($i)
let $title := (db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI[1]//*:fileDesc[1]//*:titleStmt)[1]//*:title[1]
let $author := (db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI[1]//*:sourceDesc)[1]//*:bibl[1]//*:author[1]
let $note := (db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI)[1]//*:notesStmt//*:note
let $expr := concat("ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', ", $node, ") ", $query, ")")
let $time := data($i/@time)
return
<div>
  <hit count="{ $p}">
    <p><input type="checkbox" name="NODE" value="{$node}"/><b class="hitno">{$p} ({ if($prefix = "dta") then "DTA" else "TG"})</b>Knoten: {$i/node}</p>
    {xquery:eval($expr)}
  </hit>
  <bib>
    <p class="bibl"><b>{$time}</b><br/><b>Bibliographie</b> { data($author)}: { data($title)} <br/><b>Anmerkung</b>: { data ($note) }<br/> <b>Korpus</b>: { if($prefix = "dta") then "Deutsches Textarchiv" else "TextGrid Digitale Bibliothek"}</p>
  </bib>
  <p></p>
</div>
Compiling:
- rewriting And expression to predicate(s)
- rewriting fn:boolean(phraseme/text() = "Ad0194")
- rewriting fn:boolean(selected/text() = "yes")
- simplifying descendant-or-self step(s)
- applying text index
- simplifying descendant-or-self step(s)
- simplifying descendant-or-self step(s)
- simplifying descendant-or-self step(s)
Result: for $i at $p as xs:integer in db:text("collect-ws", "Ad0194")/parent::phraseme/parent::entry[selected/text() = "yes"] let $query := $i/query let $node := $i/node let $prefix := fn:in-scope-prefixes($i) let $title := (db:open-id("TG-DTA-GerManC-stemming-ws", $node)/ancestor::*:TEI[1]/descendant-or-self::node()/*:fileDesc[1]/descendant::*:titleStmt)[1]/descendant-or-self::node()/*:title[1] let $author := (db:open-id("TG-DTA-GerManC-stemming-ws", $node)/ancestor::*:TEI[1]/descendant::*:sourceDesc)[1]/descendant-or-self::node()/*:bibl[1]/descendant-or-self::node()/*:author[1] let $note := (db:open-id("TG-DTA-GerManC-stemming-ws", $node)/ancestor::*:TEI)[1]/descendant::*:notesStmt/descendant::*:note let $expr := fn:concat("ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', ", $node, ") ", $query, ")") let $time := fn:data($i/@time) return element div { element hit { attribute count { $p }, element p { element input { attribute type { "checkbox" }, attribute name { "NODE" }, attribute value { $node } }, element b { attribute class { "hitno" }, $p, " (", if($prefix = "dta") then "DTA" else "TG", ")" }, "Knoten: ", $i/node }, xquery:eval($expr) }, element bib { element p { attribute class { "bibl" }, element b { $time }, element br { () }, element b { "Bibliographie" }, fn:data($author), ": ", fn:data($title), element br { () }, element b { "Anmerkung" }, ": ", fn:data($note), element br { () }, element b { "Korpus" }, ": ", if($prefix = "dta") then "Deutsches Textarchiv" else "TextGrid Digitale Bibliothek" } }, element p { () } }
Timing:
- Parsing: 1.89 ms
- Compiling: 5.5 ms
- Evaluating: 5697.76 ms
- Printing: 38.35 ms
- Total Time: 5743.51 ms
Result:
- Hit(s): 676 Items
- Updated: 0 Items
- Printed: 2048 KB
Query plan: <QueryPlan> <FLWR> <For var="$i" pos="$p as xs:integer"> <AxisPath> <ValueAccess data="collect-ws" type="TEXT"> <Str value="Ad0194" type="xs:string"/> </ValueAccess> <IterStep axis="parent" test="phraseme"/> <IterStep axis="parent" test="entry"> <CmpG op="="> <AxisPath> <IterStep axis="child" test="selected"/> <IterStep axis="child" test="text()"/> </AxisPath> <Str value="yes" type="xs:string"/> </CmpG> </IterStep> </AxisPath> </For> <Let var="$query"> <AxisPath> <VarRef> <Var name="$i" id="0"/> </VarRef> <IterStep axis="child" test="query"/> </AxisPath> </Let> <Let var="$node"> <AxisPath> <VarRef> <Var name="$i" id="0"/> </VarRef> <IterStep axis="child" test="node"/> </AxisPath> </Let> <Let var="$prefix"> <FNQName name="in-scope-prefixes(elem)"> <VarRef> <Var name="$i" id="0"/> </VarRef> </FNQName> </Let> <Let var="$title"> <AxisPath> <IterPosFilter> <AxisPath> <FNDb name="open-id(database,id)"> <Str value="TG-DTA-GerManC-stemming-ws" type="xs:string"/> <VarRef> <Var name="$node" id="3"/> </VarRef> </FNDb> <IterPosStep axis="ancestor" test="*:TEI"> <Pos min="1" max="1"/> </IterPosStep> <IterStep axis="descendant-or-self" test="node()"/> <IterPosStep axis="child" test="*:fileDesc"> <Pos min="1" max="1"/> </IterPosStep> <IterStep axis="descendant" test="*:titleStmt"/> </AxisPath> <Pos min="1" max="1"/> </IterPosFilter> <IterStep axis="descendant-or-self" test="node()"/> <IterPosStep axis="child" test="*:title"> <Pos min="1" max="1"/> </IterPosStep> </AxisPath> </Let> <Let var="$author"> <AxisPath> <IterPosFilter> <AxisPath> <FNDb name="open-id(database,id)"> <Str value="TG-DTA-GerManC-stemming-ws" type="xs:string"/> <VarRef> <Var name="$node" id="3"/> </VarRef> </FNDb> <IterPosStep axis="ancestor" test="*:TEI"> <Pos min="1" max="1"/> </IterPosStep> <IterStep axis="descendant" test="*:sourceDesc"/> </AxisPath> <Pos min="1" max="1"/> </IterPosFilter> <IterStep axis="descendant-or-self" test="node()"/> <IterPosStep axis="child" test="*:bibl"> <Pos min="1" 
max="1"/> </IterPosStep> <IterStep axis="descendant-or-self" test="node()"/> <IterPosStep axis="child" test="*:author"> <Pos min="1" max="1"/> </IterPosStep> </AxisPath> </Let> <Let var="$note"> <AxisPath> <IterPosFilter> <AxisPath> <FNDb name="open-id(database,id)"> <Str value="TG-DTA-GerManC-stemming-ws" type="xs:string"/> <VarRef> <Var name="$node" id="3"/> </VarRef> </FNDb> <IterStep axis="ancestor" test="*:TEI"/> </AxisPath> <Pos min="1" max="1"/> </IterPosFilter> <IterStep axis="descendant" test="*:notesStmt"/> <IterStep axis="descendant" test="*:note"/> </AxisPath> </Let> <Let var="$expr"> <FNStr name="concat(atom,atom[,...])"> <Str value="ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', " type="xs:string"/> <VarRef> <Var name="$node" id="3"/> </VarRef> <Str value=") " type="xs:string"/> <VarRef> <Var name="$query" id="2"/> </VarRef> <Str value=")" type="xs:string"/> </FNStr> </Let> <Let var="$time"> <FNGen name="data([item])"> <AxisPath> <VarRef> <Var name="$i" id="0"/> </VarRef> <IterStep axis="attribute" test="time"/> </AxisPath> </FNGen> </Let> <Return> <CElem> <QNm value="div" type="xs:QName"/> <CElem> <QNm value="hit" type="xs:QName"/> <CAttr> <QNm value="count" type="xs:QName"/> <VarRef> <Var name="$p as xs:integer" id="1"/> </VarRef> </CAttr> <CElem> <QNm value="p" type="xs:QName"/> <CElem> <QNm value="input" type="xs:QName"/> <CAttr> <QNm value="type" type="xs:QName"/> <Str value="checkbox" type="xs:string"/> </CAttr> <CAttr> <QNm value="name" type="xs:QName"/> <Str value="NODE" type="xs:string"/> </CAttr> <CAttr> <QNm value="value" type="xs:QName"/> <VarRef> <Var name="$node" id="3"/> </VarRef> </CAttr> </CElem> <CElem> <QNm value="b" type="xs:QName"/> <CAttr> <QNm value="class" type="xs:QName"/> <Str value="hitno" type="xs:string"/> </CAttr> <VarRef> <Var name="$p as xs:integer" id="1"/> </VarRef> <Str value=" (" type="xs:string"/> <If> <CmpG op="="> <VarRef> <Var name="$prefix" id="4"/> </VarRef> <Str value="dta" type="xs:string"/> </CmpG> <Str 
value="DTA" type="xs:string"/> <Str value="TG" type="xs:string"/> </If> <Str value=")" type="xs:string"/> </CElem> <Str value="Knoten: " type="xs:string"/> <AxisPath> <VarRef> <Var name="$i" id="0"/> </VarRef> <IterStep axis="child" test="node"/> </AxisPath> </CElem> <FNXQuery name="eval(string[,bindings])"> <VarRef> <Var name="$expr" id="8"/> </VarRef> </FNXQuery> </CElem> <CElem> <QNm value="bib" type="xs:QName"/> <CElem> <QNm value="p" type="xs:QName"/> <CAttr> <QNm value="class" type="xs:QName"/> <Str value="bibl" type="xs:string"/> </CAttr> <CElem> <QNm value="b" type="xs:QName"/> <VarRef> <Var name="$time" id="9"/> </VarRef> </CElem> <CElem> <QNm value="br" type="xs:QName"/> </CElem> <CElem> <QNm value="b" type="xs:QName"/> <Str value="Bibliographie" type="xs:string"/> </CElem> <FNGen name="data([item])"> <VarRef> <Var name="$author" id="6"/> </VarRef> </FNGen> <Str value=": " type="xs:string"/> <FNGen name="data([item])"> <VarRef> <Var name="$title" id="5"/> </VarRef> </FNGen> <CElem> <QNm value="br" type="xs:QName"/> </CElem> <CElem> <QNm value="b" type="xs:QName"/> <Str value="Anmerkung" type="xs:string"/> </CElem> <Str value=": " type="xs:string"/> <FNGen name="data([item])"> <VarRef> <Var name="$note" id="7"/> </VarRef> </FNGen> <CElem> <QNm value="br" type="xs:QName"/> </CElem> <CElem> <QNm value="b" type="xs:QName"/> <Str value="Korpus" type="xs:string"/> </CElem> <Str value=": " type="xs:string"/> <If> <CmpG op="="> <VarRef> <Var name="$prefix" id="4"/> </VarRef> <Str value="dta" type="xs:string"/> </CmpG> <Str value="Deutsches Textarchiv" type="xs:string"/> <Str value="TextGrid Digitale Bibliothek" type="xs:string"/> </If> </CElem> </CElem> <CElem> <QNm value="p" type="xs:QName"/> </CElem> </CElem> </Return> </FLWR> </QueryPlan>
#############################################################
Query:
for $i at $p in //entry[phraseme/text() = "Ad0194" and selected/text() = "yes"]
let $query := $i/query
let $node := $i/node
let $prefix := fn:in-scope-prefixes($i)
let $tei := (db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI)[1]
let $title := ($tei//*:fileDesc)[1]//*:titleStmt[1]//*:title[1]
let $author := ($tei//*:sourceDesc)[1]//*:bibl[1]//*:author[1]
let $note := $tei//*:notesStmt//*:note
let $expr := concat("ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', ", $node, ") ", $query, ")")
let $time := data($i/@time)
return
<div>
  <hit count="{ $p}">
    <p><input type="checkbox" name="NODE" value="{$node}"/><b class="hitno">{$p} ({ if($prefix = "dta") then "DTA" else "TG"})</b>Knoten: {$i/node}</p>
    {xquery:eval($expr)}
  </hit>
  <bib>
    <p class="bibl"><b>{$time}</b><br/><b>Bibliographie</b> { data($author)}: { data($title)} <br/><b>Anmerkung</b>: { data ($note) }<br/> <b>Korpus</b>: { if($prefix = "dta") then "Deutsches Textarchiv" else "TextGrid Digitale Bibliothek"}</p>
  </bib>
  <p></p>
</div>
Compiling:
- rewriting And expression to predicate(s)
- rewriting fn:boolean(phraseme/text() = "Ad0194")
- rewriting fn:boolean(selected/text() = "yes")
- simplifying descendant-or-self step(s)
- applying text index
- simplifying descendant-or-self step(s)
- simplifying descendant-or-self step(s)
- simplifying descendant-or-self step(s)
Result: for $i at $p as xs:integer in db:text("collect-ws", "Ad0194")/parent::phraseme/parent::entry[selected/text() = "yes"] let $query := $i/query let $node := $i/node let $prefix := fn:in-scope-prefixes($i) let $tei := (db:open-id("TG-DTA-GerManC-stemming-ws", $node)/ancestor::*:TEI)[1] let $title := ($tei/descendant::*:fileDesc)[1]/descendant-or-self::node()/*:titleStmt[1]/descendant-or-self::node()/*:title[1] let $author := ($tei/descendant::*:sourceDesc)[1]/descendant-or-self::node()/*:bibl[1]/descendant-or-self::node()/*:author[1] let $note := $tei/descendant::*:notesStmt/descendant::*:note let $expr := fn:concat("ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', ", $node, ") ", $query, ")") let $time := fn:data($i/@time) return element div { element hit { attribute count { $p }, element p { element input { attribute type { "checkbox" }, attribute name { "NODE" }, attribute value { $node } }, element b { attribute class { "hitno" }, $p, " (", if($prefix = "dta") then "DTA" else "TG", ")" }, "Knoten: ", $i/node }, xquery:eval($expr) }, element bib { element p { attribute class { "bibl" }, element b { $time }, element br { () }, element b { "Bibliographie" }, fn:data($author), ": ", fn:data($title), element br { () }, element b { "Anmerkung" }, ": ", fn:data($note), element br { () }, element b { "Korpus" }, ": ", if($prefix = "dta") then "Deutsches Textarchiv" else "TextGrid Digitale Bibliothek" } }, element p { () } }
Timing:
- Parsing: 3.01 ms
- Compiling: 3.44 ms
- Evaluating: 5180.53 ms
- Printing: 59.07 ms
- Total Time: 5246.06 ms
Result:
- Hit(s): 676 Items
- Updated: 0 Items
- Printed: 2048 KB
Query plan: <QueryPlan> <FLWR> <For var="$i" pos="$p as xs:integer"> <AxisPath> <ValueAccess data="collect-ws" type="TEXT"> <Str value="Ad0194" type="xs:string"/> </ValueAccess> <IterStep axis="parent" test="phraseme"/> <IterStep axis="parent" test="entry"> <CmpG op="="> <AxisPath> <IterStep axis="child" test="selected"/> <IterStep axis="child" test="text()"/> </AxisPath> <Str value="yes" type="xs:string"/> </CmpG> </IterStep> </AxisPath> </For> <Let var="$query"> <AxisPath> <VarRef> <Var name="$i" id="0"/> </VarRef> <IterStep axis="child" test="query"/> </AxisPath> </Let> <Let var="$node"> <AxisPath> <VarRef> <Var name="$i" id="0"/> </VarRef> <IterStep axis="child" test="node"/> </AxisPath> </Let> <Let var="$prefix"> <FNQName name="in-scope-prefixes(elem)"> <VarRef> <Var name="$i" id="0"/> </VarRef> </FNQName> </Let> <Let var="$tei"> <IterPosFilter> <AxisPath> <FNDb name="open-id(database,id)"> <Str value="TG-DTA-GerManC-stemming-ws" type="xs:string"/> <VarRef> <Var name="$node" id="3"/> </VarRef> </FNDb> <IterStep axis="ancestor" test="*:TEI"/> </AxisPath> <Pos min="1" max="1"/> </IterPosFilter> </Let> <Let var="$title"> <AxisPath> <IterPosFilter> <AxisPath> <VarRef> <Var name="$tei" id="5"/> </VarRef> <IterStep axis="descendant" test="*:fileDesc"/> </AxisPath> <Pos min="1" max="1"/> </IterPosFilter> <IterStep axis="descendant-or-self" test="node()"/> <IterPosStep axis="child" test="*:titleStmt"> <Pos min="1" max="1"/> </IterPosStep> <IterStep axis="descendant-or-self" test="node()"/> <IterPosStep axis="child" test="*:title"> <Pos min="1" max="1"/> </IterPosStep> </AxisPath> </Let> <Let var="$author"> <AxisPath> <IterPosFilter> <AxisPath> <VarRef> <Var name="$tei" id="5"/> </VarRef> <IterStep axis="descendant" test="*:sourceDesc"/> </AxisPath> <Pos min="1" max="1"/> </IterPosFilter> <IterStep axis="descendant-or-self" test="node()"/> <IterPosStep axis="child" test="*:bibl"> <Pos min="1" max="1"/> </IterPosStep> <IterStep axis="descendant-or-self" test="node()"/> 
<IterPosStep axis="child" test="*:author"> <Pos min="1" max="1"/> </IterPosStep> </AxisPath> </Let> <Let var="$note"> <AxisPath> <VarRef> <Var name="$tei" id="5"/> </VarRef> <IterStep axis="descendant" test="*:notesStmt"/> <IterStep axis="descendant" test="*:note"/> </AxisPath> </Let> <Let var="$expr"> <FNStr name="concat(atom,atom[,...])"> <Str value="ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', " type="xs:string"/> <VarRef> <Var name="$node" id="3"/> </VarRef> <Str value=") " type="xs:string"/> <VarRef> <Var name="$query" id="2"/> </VarRef> <Str value=")" type="xs:string"/> </FNStr> </Let> <Let var="$time"> <FNGen name="data([item])"> <AxisPath> <VarRef> <Var name="$i" id="0"/> </VarRef> <IterStep axis="attribute" test="time"/> </AxisPath> </FNGen> </Let> <Return> <CElem> <QNm value="div" type="xs:QName"/> <CElem> <QNm value="hit" type="xs:QName"/> <CAttr> <QNm value="count" type="xs:QName"/> <VarRef> <Var name="$p as xs:integer" id="1"/> </VarRef> </CAttr> <CElem> <QNm value="p" type="xs:QName"/> <CElem> <QNm value="input" type="xs:QName"/> <CAttr> <QNm value="type" type="xs:QName"/> <Str value="checkbox" type="xs:string"/> </CAttr> <CAttr> <QNm value="name" type="xs:QName"/> <Str value="NODE" type="xs:string"/> </CAttr> <CAttr> <QNm value="value" type="xs:QName"/> <VarRef> <Var name="$node" id="3"/> </VarRef> </CAttr> </CElem> <CElem> <QNm value="b" type="xs:QName"/> <CAttr> <QNm value="class" type="xs:QName"/> <Str value="hitno" type="xs:string"/> </CAttr> <VarRef> <Var name="$p as xs:integer" id="1"/> </VarRef> <Str value=" (" type="xs:string"/> <If> <CmpG op="="> <VarRef> <Var name="$prefix" id="4"/> </VarRef> <Str value="dta" type="xs:string"/> </CmpG> <Str value="DTA" type="xs:string"/> <Str value="TG" type="xs:string"/> </If> <Str value=")" type="xs:string"/> </CElem> <Str value="Knoten: " type="xs:string"/> <AxisPath> <VarRef> <Var name="$i" id="0"/> </VarRef> <IterStep axis="child" test="node"/> </AxisPath> </CElem> <FNXQuery 
name="eval(string[,bindings])"> <VarRef> <Var name="$expr" id="9"/> </VarRef> </FNXQuery> </CElem> <CElem> <QNm value="bib" type="xs:QName"/> <CElem> <QNm value="p" type="xs:QName"/> <CAttr> <QNm value="class" type="xs:QName"/> <Str value="bibl" type="xs:string"/> </CAttr> <CElem> <QNm value="b" type="xs:QName"/> <VarRef> <Var name="$time" id="10"/> </VarRef> </CElem> <CElem> <QNm value="br" type="xs:QName"/> </CElem> <CElem> <QNm value="b" type="xs:QName"/> <Str value="Bibliographie" type="xs:string"/> </CElem> <FNGen name="data([item])"> <VarRef> <Var name="$author" id="7"/> </VarRef> </FNGen> <Str value=": " type="xs:string"/> <FNGen name="data([item])"> <VarRef> <Var name="$title" id="6"/> </VarRef> </FNGen> <CElem> <QNm value="br" type="xs:QName"/> </CElem> <CElem> <QNm value="b" type="xs:QName"/> <Str value="Anmerkung" type="xs:string"/> </CElem> <Str value=": " type="xs:string"/> <FNGen name="data([item])"> <VarRef> <Var name="$note" id="8"/> </VarRef> </FNGen> <CElem> <QNm value="br" type="xs:QName"/> </CElem> <CElem> <QNm value="b" type="xs:QName"/> <Str value="Korpus" type="xs:string"/> </CElem> <Str value=": " type="xs:string"/> <If> <CmpG op="="> <VarRef> <Var name="$prefix" id="4"/> </VarRef> <Str value="dta" type="xs:string"/> </CmpG> <Str value="Deutsches Textarchiv" type="xs:string"/> <Str value="TextGrid Digitale Bibliothek" type="xs:string"/> </If> </CElem> </CElem> <CElem> <QNm value="p" type="xs:QName"/> </CElem> </CElem> </Return> </FLWR> </QueryPlan>