Hi,
I see that in BaseX 9.1.2 an expression such as "if (3) then 4 " does not raise an error, even if the "else" part is missing. Is this correct?
Ciao, Giuseppe
It is for BaseX; take a look at [1]. There is also an Elvis operator on the same wiki page. Personally, I don't like to deviate from specifications, so I try to avoid it.
[1] - http://docs.basex.org/wiki/XQuery_Extensions#If_Without_Else
Best,
George
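For readers who, like George, prefer to stay within the specification, the portable spelling can be sketched as follows. This assumes, going by the wiki page in [1], that the BaseX extension yields the empty sequence when the condition is false and no else branch is given; $x is a made-up variable for illustration.

```xquery
(: Portable XQuery 3.1: the else branch is required by the specification. :)
let $x := 3
return (
  (: same value as the BaseX-only form "if ($x) then 4" :)
  if ($x) then 4 else (),
  (: false condition: the empty sequence adds nothing to the result :)
  if ($x = 0) then 4 else ()
)
(: result: 4 :)
```

Writing `else ()` explicitly keeps the query runnable on other XQuery processors as well.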
Good day, Love my basex, have been using it for years. I particularly love being able to have access to all the powerful 2.0 syntax. But I have a question about some behavior I am seeing in the following-sibling axis that does not seem logical to me.
Here is the data stored in basex:
xquery /text[@id='test']
<text id="test"> <clause> <word>A</word>a <word>B</word> <word>C</word>c </clause> </text>
Here is the query on that data:
xquery /text[@id='test']//word/concat(text(), ' ', normalize-space(./following-sibling::text()[1]))
A a B c C c
The same query run in https://www.freeformatter.com/xpath-tester.html on the same XML gives me the predictable result that I need: String='A a' String='B ' String='C c'
Is there any way to alter my query to get the above result in basex? I imagine that this is an XPath 1.0 vs XPath 2.0 issue.
Thanks, Mark Bordelon.
On 26.02.2019 at 07:54, Mark Bordelon wrote:
Here is the data stored in basex:
xquery /text[@id='test']
<text id="test"> <clause> <word>A</word>a <word>B</word> <word>C</word>c </clause> </text>
How did you store/parse it? In particular, did you use any options to strip whitespace?
Is there any way to alter my query to get the above result in basex? I imagine that this is an XPath 1.0 vs XPath 2.0 issue.
A path with a function call in its (last) step isn't allowed in XPath 1.0 at all.
On 26.02.2019 at 07:54, Mark Bordelon wrote:
But I have a question about some behavior I am seeing in the following-sibling axis that does not seem logical to me.
I think the result you get is caused by whitespace chopping during XML parsing, which seems to be the default; see http://docs.basex.org/wiki/Command-Line_Options:
-w: Toggles whitespace chopping of XML text nodes. By default, whitespaces will be chopped.
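To see why chopping leads to "B c", the effect can be reproduced without a database. The sketch below is only a stand-in: it assumes that XQuery's boundary-space policy "strip" removes the same whitespace-only text nodes from this sample that BaseX's chopping removes at import time, and $clause is a made-up variable.

```xquery
(: Assumption: "strip" mimics the chopped import for this sample,
   dropping the whitespace-only text node between B and C. :)
declare boundary-space strip;

let $clause := <clause> <word>A</word>a <word>B</word> <word>C</word>c </clause>
for $w in $clause/word
(: B has no text node of its own left, so the first following
   text sibling is the "c" that belongs to C :)
return concat($w, ' ', normalize-space($w/following-sibling::text()[1]))
(: result: "A a", "B c", "C c" :)
```

Changing the declaration to `declare boundary-space preserve;` keeps the blank between B and C as a text node, and the middle item becomes "B " instead.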
Hi Mark, Hi Martin,
Yes, Martin is right: whitespace is chopped by default, which leads to the observed behavior.
If you want to preserve whitespace globally, you can do that when creating your database. If you only want to preserve whitespace for a given element, you can do this as well:
(: the string has to actually be parsed for xml:space="preserve" to have an effect :)
let $db := '<text id="test">
<clause xml:space="preserve"> <word>A</word>a <word>B</word> <word>C</word>c </clause> </text>' => parse-xml()
return $db//word/concat(text(), ' ', normalize-space(./following-sibling::text()[1]))
Which returns:
A a B C c
Best Michael
Thanks Martin and Michael for the "immediate answers". Forgive my lack of understanding about the particulars of whitespace processing and how BASEX deals with them. To answer Martin's questions:
1) I stored that xml (as I do all of my email) in the database using this command in the basex shell: ADD TO PROSE.test test.xml, after I created the file in vi on my desktop. I had no XML preamble or any processing instructions about the whitespace.
2) I am also querying in basex using XQUERY.
In short, I need to get the result with the actual text node after the element I target, even if it is null, not the next element's non-null text.
Given my storage, is adjusting the XPATH the way Kristian suggested the best way to achieve my goal? Concretely, how can I adjust how XQUERY executes the path directly to deal with the whitespace issue? I cannot use parse-xml(). Do I add the -w option to the XQUERY that I call from the basex client? And concretely, how can I store the XML with a preamble or processing instruction about whitespace to achieve my result?
My follow-up was too hasty, and I forgot to mention that I had tried starting up basex with -w, but it breaks so many other things that I cannot use this GLOBAL approach. I need to give the instruction either at the XQUERY command level or at the XML file level. By the way, Kristian, THANK YOU: your solution does work on the simplified data I gave in my original post. Unfortunately, it does not work on more complicated examples of my data. If you're interested in what the real data looks like and the continuing issue, please contact me off-list.
Thanks gentlemen! Mark
A follow-up: starting basex with -w does NOT seem to completely solve my issue after all. Real data (more complicated than the simplified example) still does not query correctly: text nodes that follow later elements are displayed in place of null text nodes. I'll try to get a better example, still simplified, that shows this.
On 26.02.2019 at 18:52, Mark Bordelon wrote:
A follow-up: starting basex -w does NOT seem to solve completely my issue after all.
Just to make sure, if the data is already in the database and has been inserted with the default whitespace chopping turned on, the result you get for your sample is correct.
So at least in my understanding the only way to get the result you want is to make sure the original input XML is inserted again into the database, this time with chopping turned off.
Hi Mark,
As Martin already stated, the '-w' option has to be active at import time; otherwise the whitespace will be chopped.
If I were to do it, I'd reindex all data and explicitly mark all elements that should preserve whitespace. If this is not an option, I'd reindex all data with whitespace chopping set to off.
Looking forward to your example, I am sure we can figure this out :-)
Best Michael
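Because the setting matters at import time, one possible way to re-import a single document with chopping off, without starting BaseX globally with -w, is to pass the parser option to the database function call. This is a hedged sketch, not a confirmed recipe: it assumes BaseX 9.x, where the option is named CHOP and db:replace accepts an options map. 'mydb' is a placeholder database name, 'PROSE.test' is taken from the ADD command quoted in the thread, and 'test.xml' is the source file on disk. Option and function names may differ in other versions, so check the documentation for your release.

```xquery
(: Assumptions: BaseX 9.x; 'mydb' is a hypothetical database name;
   'PROSE.test' is the target path inside that database;
   'test.xml' is the file to be parsed again. :)
db:replace('mydb', 'PROSE.test', 'test.xml', map { 'chop': false() })
```

Only the documents replaced this way are parsed with whitespace preserved; everything else in the database stays as it was imported.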
Gentlemen (especially Michael)! My follow-up to my original question from a few days ago:
XML:
<sent id="242"> <clause> <word lexmorph="ab|P|">A</word> <word lexmorph="to^tus|D|NsCbGn">toto</word>. -<word lexmorph="ab|P|">A</word> <word lexmorph="substantia|N|NsCb">substantia</word> </clause>.</sent>
XPATH:
//clause[word and not(word[not(@lexmorph) or @lexmorph='' or contains(@lexmorph,' ')])]/string-join(word[not(@implicit)]/concat(text(), '|', tokenize(@lexmorph, '\|')[2], '|', normalize-space(./following-sibling::text()[1])), '~~~')
Executing this in https://www.freeformatter.com/xpath-tester.html#ad-output returns the (correct) result:
A|P|~~~toto|D|. -~~~A|P|~~~substantia|N|
Executing it using XQUERY in basex returns the (incorrect) result:
A|P|. -~~~toto|D|. -~~~A|P|~~~substantia|N|
Executing it using XQUERY in basex, adding Kristian's [. instance of text()] to the axis, returns the same (incorrect) result:
A|P|. -~~~toto|D|. -~~~A|P|~~~substantia|N|
I have tried using the -w option’s true and false values, but my results are always as above.
Any ideas?
On Fri, 2019-03-01 at 13:50 -0800, Mark Bordelon wrote:
I have tried using the -w option’s true and false values, but my results are always as above.
Any ideas?
Try removing all whitespace between tags that's not part of the actual document and see if you get different results; if so, I'd suspect that -w isn't working, perhaps?
Nothing wrong here with BaseX behavior.
If your XML import chops the whitespace, following-sibling::text()[1] will of course always match ". -" for the first two word nodes.
Cheers, Daniel
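If re-importing is not possible, one query-side option is to look only at the node that immediately follows each word and to accept it only if it is a text node. The sketch below makes several assumptions: punctuation always directly follows its word element, the boundary-space policy "strip" stands in for a database imported with chopping, and $sent is a made-up variable holding the sample from the thread. The pipe in the tokenize pattern is escaped here because an unescaped | is a regular expression that matches the empty string, which fn:tokenize rejects.

```xquery
(: Assumption: "strip" imitates an import with whitespace chopping. :)
declare boundary-space strip;

let $sent :=
  <sent id="242"> <clause> <word lexmorph="ab|P|">A</word> <word lexmorph="to^tus|D|NsCbGn">toto</word>. -<word lexmorph="ab|P|">A</word> <word lexmorph="substantia|N|NsCb">substantia</word> </clause>.</sent>
return $sent//clause/string-join(
  word/concat(
    text(), '|',
    tokenize(@lexmorph, '\|')[2], '|',
    (: node()[1] is the very next sibling; keep it only if it is text :)
    normalize-space(following-sibling::node()[1][self::text()])
  ),
  '~~~'
)
(: result: A|P|~~~toto|D|. -~~~A|P|~~~substantia|N| :)
```

With this predicate the outcome no longer depends on whether whitespace-only text nodes were kept at import time: a word followed directly by another word simply contributes an empty string.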
basex-talk@mailman.uni-konstanz.de