On 19.11.2012 at 23:00, Christian Grün wrote:
let $title := (db:open-id('TG-DTA-GerManC-stemming-ws', $node) /ancestor::*:TEI[1]//*:fileDesc)[1]//*:titleStmt[1]//*:title[1]
Do you think that the following query would return the expected result?
db:open-id('TG-DTA-GerManC-stemming-ws', $node)/ ancestor::*:TEI[1]/ descendant::*:fileDesc[1]/ descendant::*:titleStmt[1]/ descendant::*:title[1]
If yes, it may be the fastest version (can’t promise, though).
No and no.
These queries are equivalent for the result:
let $title := db:open-id('TG-DTA-GerManC-stemming-ws', $node) /ancestor::*:TEI[1]//*:fileDesc[1]//*:titleStmt[1]//*:title[1]
let $title := (db:open-id('TG-DTA-GerManC-stemming-ws', $node) /ancestor::*:TEI)[1]//*:fileDesc[1]//*:titleStmt[1]//*:title[1]
let $title := (db:open-id('TG-DTA-GerManC-stemming-ws', $node) /ancestor::*:TEI[1]//*:fileDesc)[1]//*:titleStmt[1]//*:title[1]
And your proposal is equivalent to:
let $title := (db:open-id('TG-DTA-GerManC-stemming-ws', $node) /ancestor::*:TEI[1]//*:fileDesc[1]//*:titleStmt)[1]//*:title[1]
What is more, your proposal takes more than double the time.
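For readers following along, here is a minimal sketch of why these variants can differ: what matters is which sequence the [1] applies to. The inline document is made up for illustration and is not taken from the database; only the element names follow the TEI header.

```xquery
(: Hypothetical mini-document, not from the real database. :)
let $tei :=
  <TEI>
    <teiHeader>
      <fileDesc>
        <titleStmt><title>Outer title</title></titleStmt>
        <sourceDesc>
          <biblFull>
            <titleStmt><title>Source title</title></titleStmt>
          </biblFull>
        </sourceDesc>
      </fileDesc>
    </teiHeader>
  </TEI>
return (
  (: //x[1] is short for /descendant-or-self::node()/child::x[1],
     so it keeps every title that is the first title child of its parent :)
  count($tei//*:title[1]),
  (: descendant::x[1] keeps only the first title descendant in document order :)
  count($tei/descendant::*:title[1]),
  (: (path//x)[1] keeps only the first item of the whole result sequence :)
  count(($tei//*:title)[1])
)
```

This should return 2, 1, 1. Whether the forms coincide on real data depends on where the elements actually occur in each document.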
If no, could you try to specify what the given query is supposed to return in natural language?
The hit I am interested in is a <p> or <l> node. This node belongs to a specific TEI document representing a certain novel, poem, or the like. The DB consists of several thousand such documents. Some TEI documents are nested, e.g., a book representing a collection of poems: the book is a TEI document, and each poem is one, too. I need the bibliographic information for the node.
And then, of course, the TEI documents are structured differently. See the XML excerpts at the end (only the <fileDesc> node from the TEI header). The texts come from three main resources, so it might be useful to have a dedicated collection for each of them. However, as the documents within each main resource aren't annotated consistently, there is not much point in doing so. TEI allows you to store any information at any place you want; take the author information as an example:
a) <author>Ortensio Mauro</author>
in <titleStmt> in <fileDesc>
b) <author> <name key="PND:118648071"> <surname>Alexis</surname> <forename>Willibald</forename> </name> </author>
in <titleStmt> in <fileDesc> _and_ in <titleStmt> in <biblFull> in <sourceDesc>
c) <author key="pnd:11850021X">Abraham a Sancta Clara</author>
in <titleStmt> in <biblFull> in <sourceDesc>, sometimes in this order, sometimes as "name, firstname"
That's what people call Digital Humanities ...
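One way to cope with the three author layouts could look like the sketch below. The function name local:author is hypothetical, and the preference order (first <titleStmt> in <fileDesc>, then <biblFull> in <sourceDesc>) is an assumption, not something the corpus prescribes. The sample elements are shortened copies of the excerpts at the end of this mail. The "name, firstname" variant of case c) is not handled.

```xquery
(: Hypothetical helper: returns one author string for a fileDesc element. :)
declare function local:author($fileDesc as element()) as xs:string? {
  (: Assumed preference: fileDesc/titleStmt first, then sourceDesc/biblFull/titleStmt :)
  let $author := (
    $fileDesc/*:titleStmt/*:author,
    $fileDesc/*:sourceDesc/*:biblFull/*:titleStmt/*:author
  )[1]
  return
    if (empty($author)) then ()
    (: case b): structured name, emitted as "forename surname" :)
    else if ($author/*:name/*:surname) then
      string-join(($author/*:name/*:forename, $author/*:name/*:surname), ' ')
    (: cases a) and c): plain text content :)
    else normalize-space($author)
};

for $fd in (
  <fileDesc><titleStmt><title>ALCIBIADES</title><author>Ortensio Mauro</author></titleStmt></fileDesc>,
  <fileDesc><titleStmt><author><name key="PND:118648071"><surname>Alexis</surname><forename>Willibald</forename></name></author></titleStmt></fileDesc>,
  <fileDesc><titleStmt><title>Judas der Erzschelm</title></titleStmt><sourceDesc><biblFull><titleStmt><author key="pnd:11850021X">Abraham a Sancta Clara</author></titleStmt></biblFull></sourceDesc></fileDesc>
)
return local:author($fd)
```

For these three samples the result should be "Ortensio Mauro", "Willibald Alexis" and "Abraham a Sancta Clara".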
So from a <p> or <l> node I go upwards until I find the first TEI node (/ancestor::*:TEI[1]).
From there, I travel down until I find the first <fileDesc>, somewhere in there the first <titleStmt>, and somewhere in there the first <title> node. This is what I use as the title.
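Read literally, this description can be sketched step by step as below. This is an illustration of the natural-language description, not necessarily identical in result or speed to the queries quoted above. The function name local:title and the nested sample document are hypothetical; in the real setup the hit would come from db:open-id.

```xquery
(: Hypothetical helper: takes the first match in document order at every step. :)
declare function local:title($hit as node()) as element()? {
  (: ancestor:: is a reverse axis, so [1] is the nearest enclosing TEI :)
  let $tei       := $hit/ancestor::*:TEI[1]
  let $fileDesc  := ($tei//*:fileDesc)[1]
  let $titleStmt := ($fileDesc//*:titleStmt)[1]
  return ($titleStmt//*:title)[1]
};

(: Made-up nested document: a collection that contains one poem. :)
let $doc :=
  <TEI>
    <teiHeader><fileDesc><titleStmt><title>Collected Poems</title></titleStmt></fileDesc></teiHeader>
    <text>
      <TEI>
        <teiHeader><fileDesc><titleStmt><title>First Poem</title></titleStmt></fileDesc></teiHeader>
        <text><l>a line of verse</l></text>
      </TEI>
    </text>
  </TEI>
return local:title(($doc//*:l)[1])
```

For the sample, this should return the title of the inner document (<title>First Poem</title>), because the <l> node's nearest TEI ancestor is the poem, not the collection.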
<fileDesc> <titleStmt> <title>Judas der Erzschelm</title> </titleStmt>
<publicationStmt> <idno type="FileCreationTime">Abraham a Sancta Clara: Element 00008 [2011/07/11 at 20:28:21]</idno> <availability> <p> Der annotierte Datenbestand der Digitalen Bibliothek inklusive Metadaten sowie davon einzeln zugängliche Teile sind eine Abwandlung des Datenbestandes von www.editura.de durch TextGrid und werden unter der Lizenz Creative Commons Namensnennung 3.0 Deutschland Lizenz (by-Nennung TextGrid, www.editura.de) veröffentlicht. Die Lizenz bezieht sich nicht auf die der Annotation zu Grunde liegenden allgemeinfreien Texte (Siehe auch Punkt 2 der Lizenzbestimmungen). </p> <p> <ref target="http://creativecommons.org/licenses/by/3.0/de/legalcode">Lizenzvertrag</ref> </p> <p> <ref target="http://creativecommons.org/licenses/by/3.0/de/"> Eine vereinfachte Zusammenfassung des rechtsverbindlichen Lizenzvertrages in allgemeinverständlicher Sprache </ref> </p> <p> <ref target="http://www.textgrid.de/Digitale-Bibliothek">Hinweise zur Lizenz und zur Digitalen Bibliothek</ref> </p> </availability> </publicationStmt>
<notesStmt> <note> Erstdruck: Salzburg (Haan) 1686, mit kaiserlichem Privileg datiert auf den 25. September 1685, Band 1: 1686; Band 2: 1689; Band 3: 1692; Band 4: 1695. </note> </notesStmt>
<sourceDesc> <biblFull> <titleStmt> <title>Abraham a Sancta Clara: Judas der Erzschelm für ehrliche Leut’, oder eigentlicher Entwurf und Lebensbeschreibung des Iscariotischen Böswicht. 7 Bände, in: Abraham a St. Clara’s Sämmtliche Werke, Band 1, Passau: Friedrich Winkler, 1834–1836.</title> <author key="pnd:11850021X">Abraham a Sancta Clara</author> </titleStmt>
<extent>0-</extent>
<publicationStmt> <date notBefore="1834" notAfter="1836"/> <pubPlace>Passau</pubPlace> </publicationStmt> </biblFull> </sourceDesc> </fileDesc>
#######################################
<fileDesc> <titleStmt> <title type="main">Ruhe ist die erste Bürgerpflicht oder Vor fünfzig Jahren</title> <title type="sub">Vaterländischer Roman</title> <title type="vol" n="1">Erster Band</title> <author> <name key="PND:118648071"> <surname>Alexis</surname> <forename>Willibald</forename> </name> </author> <respStmt corresp="#DTA-Corpus-Publisher"> <name>Marko Drotschmann, Oliver Duntze, Christiane Fritze, Alexander Geyken, Bryan Jurish, Alexander Siebert</name> <resp>conversion to XML/TEI-conformant markup</resp> </respStmt> </titleStmt>
<extent> <measure type="token"/> </extent>
<publicationStmt> <publisher xml:id="DTA-Corpus-Publisher">Deutsches Textarchiv</publisher> <address> <addrLine>Jägerstr. 22, 10117 Berlin</addrLine> <addrLine>dta@bbaw.de</addrLine> </address> <pubPlace>Berlin</pubPlace> <date>2011-05-04 14:53</date> <availability n="OR3P" status="free"> <p>This text is available under Creative Commons license CC-BY</p> </availability> <idno type="URN">urn:nbn:de:kobv:b4-2009051900</idno> <idno type="DTAID">16518</idno> </publicationStmt>
<sourceDesc n="orig">
<bibl>Alexis, Willibald: Ruhe ist die erste Bürgerpflicht. Bd. 1. Berlin: Barthol, 1852.</bibl>
<biblFull> <titleStmt> <title level="m" type="main">Ruhe ist die erste Bürgerpflicht oder Vor fünfzig Jahren</title> <title level="m" type="sub">Vaterländischer Roman</title> <title level="m" type="vol" n="1">Erster Band</title> <author> <name key="PND:118648071"> <surname>Alexis</surname> <forename>Willibald</forename> </name> </author> </titleStmt> <extent> <measure type="pages" n="356">VIII, 348 S.</measure> </extent> <publicationStmt> <publisher>Barthol</publisher> <pubPlace>Berlin</pubPlace> <date type="first">1852</date> </publicationStmt> <notesStmt> <note type="identifier"> <ident type="epn">876060645</ident> </note> <note type="location"> <name type="repository">Staatsbibliothek zu Berlin - PK</name> <ident type="shelfmark"/> </note> 0 <note type="pub_type">monograph</note> </notesStmt> </biblFull>
<listPerson type="searchNames"> <person><persName>Willibald Alexis</persName></person> </listPerson> </sourceDesc> </fileDesc>
############################
<fileDesc> <titleStmt> <title>Der in seiner Freyheit vergnuͤgte <hi rend="antiqua">ALCIBIADES, In einem Sing-Spiel vorgestellet Auf dem Braunschweigischen Schau-Platz [...]</hi></title> <author>Ortensio Mauro</author> </titleStmt>
<publicationStmt> <pubPlace>Braunschweig</pubPlace> <date>1700</date> </publicationStmt>
<notesStmt> <note type="filename">DRAM_P1_NoD_1700_Freyheit</note> <note type="region">North German</note> <note type="genre">Drama</note> <note type="period"><date>1650-1700</date></note> <note type="extract"><bibl>Act I, Scene 1-Act II, Scene 5</bibl></note> </notesStmt>
<sourceDesc> <p>Extract taken from the digital collection of the Herzog-August-Bibliothek Wolfenbüttel: http://diglib.hab.de/drucke/textb-31/start.htm</p> </sourceDesc> </fileDesc>