Hi All,
I can create a database via the GUI, but if I use db:create [1] I get the message "out of main memory": why? Thanks!
db:create("myDB",
"sourceDirectory",
"destinationDirectory",
map{"ftindex": true(), "language": false()}
)
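One variation that may be worth trying, offered as a sketch rather than a confirmed fix: the call below assumes that db:create accepts the ADDCACHE option (written in lower case in the options map), which asks BaseX to cache parsed documents on disk before they are added to the database. It also drops the "language" entry, because that option expects a language name rather than a boolean. The database name and directories are the placeholders from the call above.

```xquery
(: Sketch only: 'addcache' is assumed to be supported by db:create
   in the BaseX version in use. :)
db:create(
  "myDB",
  "sourceDirectory",
  "destinationDirectory",
  map { "ftindex": true(), "addcache": true() }
)
```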
Best,
Giuseppe
Welcome! I have not used eXist, but I can share my own history. I used Sedna for many years. When its development stopped, I looked for alternatives. I read many comparisons on the web and settled on BaseX.
A detailed benchmark of BaseX and others: https://bbddxml.wordpress.com/2014/01/08/un-benchmark-para-nuestras-bases-d…. If you know XMark, it is the most important and most cited benchmark for XML. I have also run these tests (for Sedna and BaseX), and my results are close to the published ones.
In BaseX I notice that writing XML results to files is very fast. The query and index optimisation is also very clever (well, cleverer than me): sometimes only a few milliseconds for gigabytes of XML. BaseX also has excellent support for the official W3C standards (Full Text, Update, all of XQuery 3.1; see https://en.wikipedia.org/wiki/XML_database), and it is very stable.
I also used Berkeley in the early days; it was very slow.
One question to the team: could you add a replication feature?
Best regards, Julio
Hi,
I have changed the code for RbaseXClient.R so that it conforms more
closely to the server protocol and to R coding style.
- 'execute' has been renamed to 'command'
- All the functions in the Command Protocol section return a list with
the complete server response. This makes it much easier to handle errors
(see lines 8-14 in TestRBaseXClient.R).
In the Query Command Protocol section, the functions 'more()' (as
called by results) and 'full()' now return the results, prefixed with a
byte that represents the Type ID.
Cheers,
Ben Engbers
Hi
Is anyone aware of any comparisons between BaseX and eXist?
I have some familiarity with eXist and I’d like to understand the benefits of each.
Thanks
Feargal
>
> On Thu, 2018-04-19 at 16:26 +0100, Feargal Hogan wrote:
>>>
>> From the comparison chart that Ben referenced earlier I noticed that
>> BaseX doesn't seem to actually load XML files into an XML database,
>> is that right?
> No. Yes. Maybe.
>
> BaseX does load the documents into a database. It stores them in an
> internal data structure, not as textual XML.
I think the comparison page perhaps misrepresents that point slightly.
>
>
>> Does BaseX need to be 'told' that it has been updated, in order to
>> add the new data to its indices?
>> Or does it know there has been an update and automatically reindex?
>
> This isn't a meaningful question.
You are correct.
In the context of my misunderstanding the comparison document, the question is pointless and meaningless.
Thanks
Hello all,
I’m currently in the process of evaluating BaseX for an embedded use case and I ran into an issue. I figured out that each Command is a transaction (though I think this should be made more prominent in the javadocs at http://docs.basex.org/javadoc/org/basex/core/Command.html) and I found the Execute command that lets multiple Commands be grouped together and executed as one… but the Execute command takes a string. What happens if I have a series of Command objects that I would like to execute together in a single transaction?
Dear list
I am curious to find out whether it is possible to stream the results of a query, i.e. have a submitted query behave as a generator. This is especially useful for web interfaces: a user is presented with each result as soon as it is found and doesn't have to wait for all results to be gathered, which matters most with large databases.
Preferably something that works with the Node.js or PHP client, but I am open to other suggestions.
Thanks in advance
Bram
Hi all,
I have loaded a large XML document (a dictionary of New Testament
Greek) into its own database in BaseX 9.0. I'm using OpenJDK 1.8.0_162
under Ubuntu 16.04.4. I used the default Java parser, and I enabled
the token & full text indices, but otherwise the database settings
were the defaults.
When I execute a particular query, I'm getting a sequence of the
character reference for the carriage return ('&#xD;') instead of some
characters like the single & double daggers.
Here is some typical output:
<!--Page: 4 ; Entry: ἄγε|G33 -->
<entry n="ἄγε|G33">
<note type="occurrencesNT">2</note>
<form>
<orth>ἄγε</orth>,</form>
<seg type="derivation">prop. imperat. of<ref>
<foreign xml:lang="grc">ἄγω</foreign>
</ref>,</seg>
<sense>
<gloss>come!</gloss>used as<gramGrp>
<pos>adv.</pos>
</gramGrp>and addressed, like<ref>
<foreign xml:lang="grc">φέρε</foreign>
</ref>, to one or more persons:<ref osisRef="Jas.4.13">Ja
4:13</ref>,<ref
osisRef="Jas.5.1">5:1</ref>
&#xD;&#xD;&#xD;&#xD;&#xD;&#xD;&#xD;&#xD;&#xD;&#xD;&#xD;
</sense>
</entry>
The sequence of '&#xD;' char refs replaces a dagger '†' (U+2020). I
wonder if I'm doing something wrong, or if I've happened on a bug.
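To narrow down whether the carriage returns are already in the stored data or only appear on serialization, a check along these lines might help. It is a sketch that assumes the affected characters sit in text nodes directly under tei:sense, as the output above suggests. A result containing 8224 (U+2020) would suggest the dagger is stored correctly and lost on output, while 13 (U+000D) would point towards parsing or import.

```xquery
declare namespace tei = "http://www.crosswire.org/2013/TEIOSIS/namespace";

(: Look at the same entry that appears in the sample output. :)
let $entry := collection('abbott-smith.tei')//tei:entry[@n = 'ἄγε|G33']
(: List the distinct code points of the text nodes directly under tei:sense. :)
return distinct-values(
  for $text in $entry/tei:sense/text()
  return string-to-codepoints($text)
)
```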
Here is my XQuery:
declare namespace tei = "http://www.crosswire.org/2013/TEIOSIS/namespace";
declare namespace xsi = "http://www.w3.org/2001/XMLSchema-instance";
let $coll := collection('abbott-smith.tei'),
$elems := $coll//tei:seg[parent::tei:entry]
return <div
xmlns="http://www.crosswire.org/2013/TEIOSIS/namespace"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">{
for $elem in $elems
let $entry := $elem/parent::tei:entry,
$entry_lemma := $entry/@n/data(),
$page_num := $elem/preceding::tei:pb[1]/@n/data(),
$comment_text := concat('Page: ', $page_num, ' ; Entry: ',
$entry_lemma, ' ')
return (
comment {$comment_text},
$entry
)
}</div>
The source XML document is here:
https://github.com/translatable-exegetical-tools/Abbott-Smith/blob/master/a…
Thanks in advance for your guidance.
All the best,
Charles Bearden
Hello,
I am successfully using the BaseX Validation Module,
along the following lines:
let $xml := 'd:\Temp\CDW\HOME\id4879_BO201801_HomeSubscriberMovementFact.xml'
let $xsd := 'd:\Temp\CDW\HOME\HomeSubscriberMovementFact.xsd'
return validate:xsd-report($xml, $xsd, '1.1')
My XML files are multiple megabytes in size and produce lots of validation errors: tens or hundreds of thousands of them.
Behind the scenes, Saxon validator 9.8.0.11 is running.
Unfortunately, the output structure contains a repeating url attribute.
The BaseX output pane cannot present all the errors.
It says: "(Chopped) Results".
<report>
<status>invalid</status>
<message level="Error" line="10" column="26" url="file:///D:/Temp/CDW/HOME/id4879_BO201801_HomeSubscriberMovementFact.xml">The content "N/A" of element <CommercialServiceCode> does not match the required simple type. Value "N/A" contravenes the enumeration facet "R60080-X00162, R60080-X00163, ..." of the type Q{http://www.millicom.com}CommercialServiceCodeType</message>
<message level="Error" line="19" column="23" url="file:///D:/Temp/CDW/HOME/id4879_BO201801_HomeSubscriberMovementFact.xml">The content "TBD" of element <MovementTechnology> does not match the required simple type. Value "TBD" contravenes the enumeration facet "N/A, HFC, GPON, MMDS, FIBER, C..." of the type Q{http://www.millicom.com}MovementTechnologyType</message>
<message level="Error" line="24" column="18" url="file:///D:/Temp/CDW/HOME/id4879_BO201801_HomeSubscriberMovementFact.xml">The content "-1.0000" of element <DownloadSpeed> does not match the required simple type. Value "-1" contravenes the minExclusive facet "0" of the type Q{http://www.millicom.com}DownloadSpeedType</message>
<message level="Error" line="26" column="6" url="file:///D:/Temp/CDW/HOME/id4879_BO201801_HomeSubscriberMovementFact.xml">The 7th field in constraint {PK} has no value</message>
...
</report>
My proposal is to eliminate the repeated url attribute from each message and elevate it to its own element, just once, under the root report tag,
along the lines of the following output structure:
<report>
<status>invalid</status>
<url>file:///D:/Temp/CDW/HOME/id4879_BO201801_HomeSubscriberMovementFact.xml</url>
<message level="Error" line="10" column="26">The content "N/A" of element <CommercialServiceCode> does not match the required simple type. Value "N/A" contravenes the enumeration facet "R60080-X00162, R60080-X00163, ..." of the type Q{http://www.millicom.com}CommercialServiceCodeType</message>
<message level="Error" line="19" column="23">The content "TBD" of element <MovementTechnology> does not match the required simple type. Value "TBD" contravenes the enumeration facet "N/A, HFC, GPON, MMDS, FIBER, C..." of the type Q{http://www.millicom.com}MovementTechnologyType</message>
<message level="Error" line="24" column="18">The content "-1.0000" of element <DownloadSpeed> does not match the required simple type. Value "-1" contravenes the minExclusive facet "0" of the type Q{http://www.millicom.com}DownloadSpeedType</message>
<message level="Error" line="26" column="6">The 7th field in constraint {PK} has no value</message>
...
</report>
This way the output of the validation is much more readable and will hopefully fit in its entirety in the output pane.
Regards,
Yitzhak Khabinsky
Technical Services Lead
Millicom International Services LLC
396 Alhambra Circle, Suite 1100
Coral Gables, FL 33134
Skype4B: +1 (305) 445-4172
Tel: (954) 684-8673
yitzhak.khabinsky(a)millicom.com
www.millicom.com
Hi Marco,
The proposed solution is way too slow.
let $xml := 'd:\Temp\CDW\HOME\id4879_BO201801_HomeSubscriberMovementFact.xml'
let $xsd := 'd:\Temp\CDW\HOME\HomeSubscriberMovementFact.xsd'
let $validate := validate:xsd-report($xml, $xsd, '1.1')
return file:write("output.xml",
copy $newvalidate := $validate
modify (delete node $newvalidate//@url)
return $newvalidate
)
I guess delete node... is too heavy.
It runs for 143 seconds.
Without it just 11 seconds.
The input XML file size is about 40MB.
The output.xml file has about the same size.
That's why I was proposing to change the default output format of the validation.
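As an alternative worth timing against the copy/modify version, the report can be rebuilt with element constructors instead of deleting nodes. This is only a sketch and may or may not turn out faster. It assumes that every message carries the same url (a single file is validated), so the first one is written once as a url element, following the structure proposed earlier in this thread.

```xquery
let $xml := 'd:\Temp\CDW\HOME\id4879_BO201801_HomeSubscriberMovementFact.xml'
let $xsd := 'd:\Temp\CDW\HOME\HomeSubscriberMovementFact.xsd'
let $report := validate:xsd-report($xml, $xsd, '1.1')
return file:write('output.xml',
  element report {
    $report/status,
    (: Assumption: all messages refer to the same file. :)
    element url { string(($report/message/@url)[1]) },
    for $message in $report/message
    return element message {
      (: Copy every attribute except url, then the message content. :)
      $message/(@* except @url),
      $message/node()
    }
  }
)
```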
Regards,
Yitzhak Khabinsky
From: Yitzhak Khabinsky
Sent: Thursday, April 19, 2018 2:51 PM
To: m.lettere(a)gmail.com
Subject: Re: [basex-talk] Validation Module: validate:xsd-report( ) improvement
Hi Marco,
Thanks for the proposed solution.
It works.
But I was referring to the default behavior.
The url attribute is redundant for every message element.
Regards,
Yitzhak Khabinsky