Hello again,
I've noticed some inconsistencies with regard to the initial document in an empty database. I understand that it is often treated as an indication that the database is "empty". I.e., if the database has only one node and it's a document at the first pre value, then Data.empty() returns true. The problem arises because the database isn't actually empty - it contains one empty document - and some commands/methods view it as an empty database while others treat it as a database with one empty document. Several examples:
- The method Data.doc(name) will return -1 (indicating the document doesn't exist) while Data.docs() will return an array of size 1 with the first value a 0 (indicating there is one document at pre value 0). This is actually my own problem because the API I'm writing relies on these two methods being internally consistent - I can't have one telling me there is one document called "XYZ" but then have the other refuse to give me a pre value for the "XYZ" document I was just told exists.
- The GUI shows a single doc node in the tree view but issuing the command "list [database name]" shows the database as having no resources.
- I can evaluate the query "insert node 'text' into doc('[initial document name]')" and it executes. I can see the new text node under the initial document in the GUI tree view. Now when I issue the command "list [database name]" I see 1 resource. Having the list command report different numbers of resources before and after an XQuery insertion seems odd.
- If I issue the command "ADD TO newdoc <xyz/>", I can see the new doc("newdoc") node with the child <xyz> element in the GUI but the previously empty document is now gone. This also seems odd - I'm not sure how I feel about an ADD command removing data from the database (even if it was just intended to be a placeholder).
In the end, I think BaseX should support the notion of a database with one empty document as being valid - there are probably cases where one might want to start their session in that state. I'm not sure what the solution is or should be (or if it's even something that is worth or needs solving). My own preference would be for the internal indication of an empty database to use a different node kind dedicated for that purpose so there is no confusion whether the single node at pre 0 is an empty initial document or an indication of an empty database.
At the very least, I think the inconsistencies above should be corrected - if a single empty document node is intended to signify an empty database and is not actually intended to be part of the database, then you should not see it in the GUI, not be allowed to insert content to it, etc.
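To make the first inconsistency concrete, here is a minimal toy model in Python. It is only a sketch and not BaseX code: the class ToyData, the flag placeholder_means_empty and the helper consistent() are hypothetical names made up for illustration, and only the method names doc() and docs() mirror the ones discussed above. It assumes that the lookup by name treats a lone document node at pre 0 as "no documents" while the listing does not.

```python
from dataclasses import dataclass, field

DOC = "doc"  # toy node kind, not a BaseX constant


@dataclass
class ToyData:
    """Toy node table; the index in `nodes` plays the role of a pre value."""
    nodes: list = field(default_factory=list)  # (kind, name) tuples
    placeholder_means_empty: bool = False      # True mimics the behavior described above

    def _is_placeholder(self) -> bool:
        # A lone document node at pre 0 is read as "the database is empty".
        return (self.placeholder_means_empty
                and len(self.nodes) == 1
                and self.nodes[0][0] == DOC)

    def docs(self) -> list:
        """Pre values of all document nodes (the placeholder is not hidden)."""
        return [pre for pre, (kind, _) in enumerate(self.nodes) if kind == DOC]

    def doc(self, name: str) -> int:
        """Pre value of the named document, or -1 if it is not found."""
        if self._is_placeholder():
            return -1  # the placeholder is treated as "no documents"
        for pre in self.docs():
            if self.nodes[pre][1] == name:
                return pre
        return -1

    def consistent(self) -> bool:
        """Every document listed by docs() must be resolvable through doc()."""
        return all(self.doc(self.nodes[pre][1]) == pre for pre in self.docs())


# One empty document that doubles as the "empty database" marker.
legacy = ToyData([(DOC, "XYZ")], placeholder_means_empty=True)
print(legacy.docs(), legacy.doc("XYZ"), legacy.consistent())  # [0] -1 False

# Without the special case, "no nodes" and "one empty document" are distinct states.
really_empty = ToyData([])
one_empty_doc = ToyData([(DOC, "XYZ")])
print(really_empty.docs(), really_empty.consistent())   # [] True
print(one_empty_doc.docs(), one_empty_doc.doc("XYZ"))   # [0] 0
```

In this toy, the two methods only agree once an empty database is represented by having no nodes at all, or by some marker that docs() does not report as a document.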
Dave
Hi Dave,
sorry for the long delay, caused by the occidental winter break. You've analyzed the inconsistencies between an "empty database" and a database with one empty document very well, and I know it's no real excuse that this issue hasn't caused any problems in our own use cases. The basic reason why we equated an empty document node with an "empty database" some time ago was that we wanted to avoid too many new code snippets dealing with the special case of having stored no data at all.
In a nutshell, I completely agree that the solution you are proposing would be more consistent, but I suspect that we'd need quite some time to rework all the relevant lines of code. If you are interested in tackling this challenge, I'll be more than glad!
Best, Christian
On Thu, Dec 29, 2011 at 4:36 PM, Dave Glick dglick@dracorp.com wrote:
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
Christian,
No problem - I had sent this right before leaving on vacation myself (I didn't want to totally forget everything while I was gone). Your explanation is pretty much what I figured. The current implementation works fine and makes sense in nearly every case - it's just in the rare edge cases that the behavior seems odd or contradictory. I certainly don't mind taking a look at the problem and submitting a patch if I get it resolved - I just wanted to ensure there weren't already plans in the pipeline to address this before I started hacking. Also, it might take me a little while - the "real" job is consuming a lot of my time right now. However, look for something from me on this in about a month...
Thanks,
Dave
-----Original Message----- From: Christian Grün [mailto:christian.gruen@gmail.com] Sent: Saturday, January 07, 2012 8:23 AM To: Dave Glick Cc: BaseX Subject: Re: [basex-talk] Empty Initial Document Inconsistencies
All:
Does anyone have a little XQuery/BaseX tip for getting rid of the extra white space that shows up at the beginning of each line after the first in the following XQuery?
for $i in 1 to 10 return concat("x"," ")
In the UI this returns
x x x x x
I just need text output going to CSV, but this extra space is causing me headaches.
tks
*P
Hi Pascal,
you can either create a new, single string via the fn:string-join() function…
string-join(for $i in 1 to 10 return "x"," ")
…or use the "output:separator" option of BaseX:
declare option output:separator '\n';
(1 to 10) ! "x"
The second option, or a variation of it, may eventually find its way into the official W3C XQuery Serialization specification (see [1] for further details).
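For readers who want to see why the leading space appears, here is a small Python sketch of the effect. It is not BaseX code: serialize() is a hypothetical stand-in for a serializer that joins the items of a result sequence with a single space by default, and it assumes the second concat() argument in the original query was a newline.

```python
def serialize(items, separator=" "):
    """Toy serializer: joins string items with a separator (a space by default)."""
    return separator.join(items)


# Each item ends with its own newline; the default separator then adds a
# space in front of every item after the first.
with_concat = serialize(["x" + "\n" for _ in range(3)])
print(repr(with_concat))  # 'x\n x\n x\n'

# Option 1: build one single string first, so there is nothing left to separate.
joined = serialize(["\n".join("x" for _ in range(3))])
print(repr(joined))       # 'x\nx\nx'

# Option 2: keep the items separate and change the separator itself.
with_sep = serialize(["x" for _ in range(3)], separator="\n")
print(repr(with_sep))     # 'x\nx\nx'
```

Both options give the same text here; the difference is only whether the query or the serializer is responsible for the line breaks.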
Hope this helps, Christian
[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=16311
Hi Pascal,
if you use BaseX inside a shell script or as a command line xquery processing tool, you may also use the -L flag:
$ basex -h 2>&1 | grep L
Usage: basex [-bcdiLosuvVwxz] [query]
  -L  Append newlines to query results
$ basex -L 'for $i in 1 to 10 return $i'
1
2
3
...
All the best, Alex
On 04.07.2012, at 04:24, Christian Grün wrote:
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Alexander Holupirek
 |-- http://www.informatik.uni-konstanz.de/~holupire
 |-- Database & Information Systems Group, U Konstanz
 `-- Room E 221, 0049 7531 88 2188 (phone) 3577 (fax)
Christian:
option #2 is great; I hope it finds its way into the standard (add my vote to it).
best
*P
Hi Dave,
here's finally some public feedback for your outstanding feature request: I've rewritten our internal database structures, such that an empty database will now really be empty (i.e., contain no dummy document node anymore). As a result, some convenience methods have disappeared from the Data class (Data.isEmpty(), Data.single()), as they were only offered to hide the former inconsistency.
The changes have now been merged into the BaseX master branch. While I believe that we have tested the changes quite well, I am always glad for any feedback.
Christian