I ran a test with XQuery Update.

It took 170 seconds to insert a single node into the Wikipedia XML database. Is there a faster way to do this?

insert node <d/> after (fn:doc ("enwiki-latest-pages-articles")//*:page[w:title contains text "AccessibleComputing"] ) [1]
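
One variant that might be worth trying, as an untested sketch: the database info further down reports the full-text index as OFF and the text index as ON, so "contains text" probably has to scan every page, whereas an exact string comparison on the title text may be answered from the text index. This assumes the title is exactly "AccessibleComputing" and that the optimizer rewrites the comparison into an index lookup. Neither is confirmed here, and the update itself may still account for much of the time.

```xquery
declare namespace w = "http://www.mediawiki.org/xml/export-0.5/";

(: Exact match on the title's text node instead of a full-text match.
   Assumption: the page title is exactly this string. :)
insert node <d/> after
  (fn:doc("enwiki-latest-pages-articles")
     //w:page[w:title/text() = "AccessibleComputing"])[1]
```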


Query: declare namespace w="http://www.mediawiki.org/xml/export-0.5/";
Compiling:
- pre-evaluating fn:doc("enwiki-latest-pages-articles")
- optimizing descendant-or-self step(s)
Result: insert node element { "d" } { () } into (document-node { "enwiki-latest-pages-articles.xml" }/descendant::*:page[w:title contains text "AccessibleComputing"])[position() = 1]
Timing:
 - Parsing:  0.27 ms
 - Compiling:  167.38 ms
 - Evaluating:  170264.31 ms
 - Printing:  45.12 ms
 - Total Time:  170477.1 ms
Query plan:
<Insert>
  <IterPosFilter>
    <IterPath>
      <DBNode name="enwiki-latest-pages-articles"/>
      <IterStep axis="descendant" test="*:page">
        <FTContains>
          <AxisPath>
            <IterStep axis="child" test="w:title"/>
          </AxisPath>
          <FTWords>
            <Item value="AccessibleComputing" type="xs:string"/>
          </FTWords>
        </FTContains>
      </IterStep>
    </IterPath>
    <Pos min="1" max="1"/>
  </IterPosFilter>
  <CElem>
    <Item value="d" type="xs:QName"/>
  </CElem>
</Insert>



On Mon, Apr 4, 2011 at 7:31 AM, Erol Akarsu <eakarsu@gmail.com> wrote:
I imported the Wikipedia XML dump into BaseX and tried to search it.

But searching takes a long time.

I searched for a single element, the first child of the document's root element, and it took 52 seconds.
I know the XML file is very big (31 GB). How can I optimize the search?

declare namespace w="http://www.mediawiki.org/xml/export-0.5/";

let $d := fn:doc ("enwiki-latest-pages-articles")//w:siteinfo
return $d
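
For comparison, the same lookup can be written with explicit child steps instead of "//". This is only a sketch. It assumes "mediawiki" is the root element, as in the standard MediaWiki export format, and whether it is actually faster here is untested.

```xquery
declare namespace w = "http://www.mediawiki.org/xml/export-0.5/";

(: Child steps only: document node, then the root element, then siteinfo.
   Assumption: the root element is w:mediawiki. :)
fn:doc("enwiki-latest-pages-articles")/w:mediawiki/w:siteinfo
```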

Database info:

> open enwiki-latest-pages-articles
Database 'enwiki-latest-pages-articles' opened in 778.49 ms.
> info database
Database Properties
 Name: enwiki-latest-pages-articles
 Size: 23356 MB
 Nodes: 228090153
 Height: 6

Database Creation
 Path: /mnt/hgfs/C/tmp/enwiki-latest-pages-articles.xml
 Time Stamp: 03.04.2011 12:29:15
 Input Size: 30025 MB
 Encoding: UTF-8
 Documents: 1
 Whitespace Chopping: ON
 Entity Parsing: OFF

Indexes
 Up-to-date: true
 Path Summary: ON
 Text Index: ON
 Attribute Index: ON
 Full-Text Index: OFF
>


Timing info:

Query: declare namespace w="http://www.mediawiki.org/xml/export-0.5/";
Compiling:
- pre-evaluating fn:doc("enwiki-latest-pages-articles")
- optimizing descendant-or-self step(s)
- binding static variable $d
- removing variable $d
- simplifying flwor expression
Result: element siteinfo { ... }
Timing:
 - Parsing:  1.4 ms
 - Compiling:  52599.0 ms
 - Evaluating:  0.28 ms
 - Printing:  0.62 ms
 - Total Time:  52601.32 ms
Query plan:
<DBNode name="enwiki-latest-pages-articles" pre="5"/>



Result of query:

<siteinfo xmlns="http://www.mediawiki.org/xml/export-0.5/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <sitename>Wikipedia</sitename>
  <base>http://en.wikipedia.org/wiki/Main_Page</base>
  <generator>MediaWiki 1.17wmf1</generator>
  <case>first-letter</case>
  <namespaces>
    <namespace key="-2" case="first-letter">Media</namespace>
    <namespace key="-1" case="first-letter">Special</namespace>
    <namespace key="0" case="first-letter"/>
    <namespace key="1" case="first-letter">Talk</namespace>
    <namespace key="2" case="first-letter">User</namespace>
    <namespace key="3" case="first-letter">User talk</namespace>
    <namespace key="4" case="first-letter">Wikipedia</namespace>
    <namespace key="5" case="first-letter">Wikipedia talk</namespace>
    <namespace key="6" case="first-letter">File</namespace>
    <namespace key="7" case="first-letter">File talk</namespace>
    <namespace key="8" case="first-letter">MediaWiki</namespace>
    <namespace key="9" case="first-letter">MediaWiki talk</namespace>
    <namespace key="10" case="first-letter">Template</namespace>
    <namespace key="11" case="first-letter">Template talk</namespace>
    <namespace key="12" case="first-letter">Help</namespace>
    <namespace key="13" case="first-letter">Help talk</namespace>
    <namespace key="14" case="first-letter">Category</namespace>
    <namespace key="15" case="first-letter">Category talk</namespace>
    <namespace key="100" case="first-letter">Portal</namespace>
    <namespace key="101" case="first-letter">Portal talk</namespace>
    <namespace key="108" case="first-letter">Book</namespace>
    <namespace key="109" case="first-letter">Book talk</namespace>
  </namespaces>
</siteinfo>



Thanks

Erol Akarsu