DB Contention


buddyonweb-software＠yahoo.com

3 Dec 2015

3:18 p.m.

I have a Java process that continually scans for incoming .xml files deposited into a system folder on the OS. When it finds a file (say a.xml), it creates a DB for that file (call it DB-A) and loads the XML file into that database. Note: DB names are guaranteed to be unique when created, so there will never be two DB-A databases.

As this file is processed by a secondary process, other XML files will be added to DB-A, but all documents will relate only to the processing of the a.xml file (for example, various statistics).

Now, there is a secondary, separate Java process that scans the database instances that have been created with the XML file loaded. This secondary process does some querying on that file and adds additional XML documents to that database. That is, only this process will add new documents to the database.

So my questions are:

1. With process 1 creating the database and inserting the original .xml file, is there a chance of database contention, or is this architecture pretty safe from contention? Note: both Java processes are simply using the BaseX.jar file.

2. If I added a third Java process in the future that a) would only access existing documents in read-only mode and b) could add new documents to the database that no other process would read or update, would this be safe from contention?

Thanks in advance.


Christian Grün

4 Dec 2015

6 a.m.

Hi Buddy on web,

For all your questions, I can probably give you a short answer: you will need to use the client/server architecture of BaseX if you want to concurrently read and update one database. For general information on our transaction management, you could have a look at our Wiki article [1].

Hope this helps, Christian

[1] http://docs.basex.org/wiki/Transaction_Management


Ron Katriel

10 Dec 2015

10:33 a.m.

New subject: Returning text matches

Hi,

Is there a way to return all matches when searching a large XML structure? For example, return the genomic keywords that matched anywhere in $study using the following query:

    for $study in db:open('CTGov')/clinical_study
    let $result := $study contains text {
      'genomics', 'genomic', 'transcriptome', 'exome',
      'whole genome', 'microarray', 'proteome', 'metabolome'
    }
    let score $score := $result
    where $score >= 0.01
    return $study/id_info/nct_id (: this is just the Study ID :)

Ideally it would include an indication of where in the tree the matches are (e.g., that ‘exome’ was found in $study/official_title and in $article/keywords).

This could presumably be done using regular expression matching (after serializing the tree into a text string) but it does not seem an elegant solution.

Thanks, Ron

Christian Grün

11:24 a.m.

Hi Ron,

You can use ft:mark and ft:extract to highlight matches in a full-text result [1].

Hope this helps, Christian

[1] http://docs.basex.org/wiki/Full-Text_Module#ft:mark


Ron Katriel

12:41 p.m.

Thanks, Christian. The following works as expected (the output contains the matches with their surrounding context)

    for $study in db:open('CTGov')/clinical_study
    let $result := $study contains text {
      'genomics', 'genomic', 'transcriptome', 'exome',
      'whole genome', 'microarray', 'proteome', 'metabolome'
    }
    let score $score := $result
    where $score >= 0.01
    return ft:extract($study//*[text() contains text {
      'genomics', 'genomic', 'transcriptome', 'exome',
      'whole genome', 'microarray', 'proteome', 'metabolome'
    }])

Is it possible to combine the two patterns (i.e., the selection criteria and the extraction in the return) into a single one?

Perhaps this is what ft:mark is supposed to do but I could not get it to work...

Best, Ron


Christian Grün

11 Dec 2015

4:37 a.m.

Hi Ron,

> Is it possible to combine the two patterns (i.e., the selection criteria and the extraction in the return) into a single one?

ft:extract works the same as ft:mark, but it additionally chops your results down to the relevant parts of the result.

Here are two ways to shorten your query:

    (: Variant 1 :)
    let $terms := ('genomics', 'genomic')
    for $study in db:open('CTGov')/clinical_study//*[text() contains text { $terms }]
    return ft:extract($study[text() contains text { $terms }])

    (: Variant 2 :)
    let $terms := ('genomics', 'genomic')
    return ft:extract(db:open('CTGov')/clinical_study//*[text() contains text { $terms }])

Christian
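Ron also asked for an indication of where in the tree each match was found. The following is a minimal sketch of one way to do that, not something proposed in the thread: it wraps each marked node in a hypothetical <hit> element carrying the study ID and the node's path. The <hit> element, its attribute names, and the 'match' marker name are assumptions made for illustration; ft:mark, fn:path, and db:open are used as documented for BaseX of that period.

```xquery
(: Sketch only: the <hit> wrapper, its attributes, and the 'match'
   marker name are illustrative assumptions, not part of the thread. :)
let $terms := ('genomics', 'genomic')
for $node in db:open('CTGov')/clinical_study//*[text() contains text { $terms }]
return
  <hit study="{ $node/ancestor::clinical_study/id_info/nct_id }"
       path="{ path($node) }">{
    (: as in Variant 1, the full-text predicate is repeated inside the call :)
    ft:mark($node[text() contains text { $terms }], 'match')
  }</hit>
```

Swapping ft:mark for ft:extract should shorten each result to the relevant parts, as described above.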


Liam R. E. Quin

10 Dec 2015

3:36 p.m.

On Thu, 2015-12-10 at 17:24 +0100, Christian Grün wrote:

> You can use ft:mark and ft:extract to highlight matches in a full-text result [1].

And what happens if a full text match crosses an element boundary, e.g. a search for "blue socks" matching

    He wore <sc>dark blue</sc> socks that day.

could not return

    He wore <sc>dark <match>blue</sc> socks</match> that day.

(Yes, I should test it, sorry! But the docs should probably mention it; it was a big part of the XPath/XQuery Full Text design early on.)

Liam

-- Liam R. E. Quin liam@w3.org The World Wide Web Consortium (W3C)

Etanchaud Fabrice

11 Dec 2015

3:28 a.m.

Dear Liam,

I am afraid that the full-text index will not find "blue socks", because it does not cross text() node boundaries:

http://docs.basex.org/wiki/Full-Text#Mixed_Content

Best regards, Fabrice


Christian Grün

4:41 a.m.

> I am afraid that the full-text index will not find "blue socks", because it does not cross text() node boundaries:
>
> http://docs.basex.org/wiki/Full-Text#Mixed_Content

Exactly. You’ll need to do something like:

    (: "... update ()" is used to transform the node to a "database node"
       (find more info in the Wiki) :)
    for $xml in <xml>
      He wore <sc>dark blue</sc> socks that day.
    </xml> update ()
    where $xml contains text 'blue socks'
    return ft:mark(
      $xml[.//text() contains text { 'blue', 'socks' }]
    )
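To see why the phrase check and the marking step use different expressions, the two ways of matching can be compared directly. This is a small sketch assuming the same sample fragment as above; it needs no database, since only ft:mark requires database nodes. The first expression tests the element's string value, the second tests each text node on its own, and the two should differ for a phrase that spans the <sc> boundary.

```xquery
(: Sketch only: compares matching on the string value
   with matching per text node. :)
let $xml := <xml>He wore <sc>dark blue</sc> socks that day.</xml>
return (
  (: the element's string value is "He wore dark blue socks that day." :)
  $xml contains text 'blue socks',
  (: each text node is tested on its own:
     "He wore ", "dark blue", " socks that day." :)
  some $t in $xml//text() satisfies $t contains text 'blue socks'
)
```

Marking the single tokens 'blue' and 'socks' per text node, as in the query above, avoids having to wrap one marker element around content that crosses an element boundary.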



basex-talk@mailman.uni-konstanz.de
