Hi Erik,
all of your suggestions are more than welcome – this is why I posted your answer back to the mailing list. I invite all of you to give personal feedback!
A few questions for you about basex and plans...
- Can a database store files other than xml? xqy/xql, images, js, css, etc?
Our storage is currently limited to well-formed XML documents.
- Any plans to add collections, or maybe BaseX does and I've overlooked it?
Yes, BaseX supports collections – as long as they are flat. Databases can store multiple single documents, which can be accessed e.g. by the XQuery collection() function. See also Section 4.4 in http://basex.org/faq.
- ReST interface: There are two approaches, both have their benefits and I would like to see both available:
a) The rest interface allows manipulating db resources: dbs, collections, files, users, permissions, etc. (The eXist db does this.)
   - The main advantage with this is an API easy to integrate into many environments
b) The rest interface allows requests to be rewritten to a delegate xquery file to handle processing the http methods. (Some other db's provide this.)
   - This allows building complete applications within a database.
   - Would like to see this as the base approach with (a) as an internal implementation of (b)
What are your thoughts?
Interesting ideas. Some of our current interface features are inspired by eXist. While GET and POST can be used to send XQuery/Update requests, PUT and DELETE will allow modifying database instances. Here is a simple example of what a GET request could look like:
http://localhost:8080/rest/db/factbook?query=//country
Additionally, we will allow sending BaseX database commands via GET and POST:
http://localhost:8080/rest/db/factbook?command=optimize
This way, user management can be handled via REST as well. I'm not sure how users are modified in eXist via REST? Which other XML/REST implementations have you worked with?
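The two request shapes above can be sketched in Python. This is only an illustration of how a client might assemble such URLs, not part of BaseX: the helper name `rest_url` is hypothetical, and the host, port and path are simply copied from the examples in this message. Note that a client would normally percent-encode the query string, so `//country` travels as `%2F%2Fcountry`.

```python
from urllib.parse import urlencode

# Base address copied from the examples in this thread; adjust for a real setup.
BASE = "http://localhost:8080/rest/db"


def rest_url(database, **params):
    """Build a REST URL for a database (hypothetical helper).

    Keyword arguments such as query=... or command=... become the
    percent-encoded query string.
    """
    url = f"{BASE}/{database}"
    if params:
        url += "?" + urlencode(params)
    return url


print(rest_url("factbook", query="//country"))
print(rest_url("factbook", command="optimize"))
```

Keeping the encoding in one helper avoids hand-built URLs that break as soon as a query contains spaces, quotes or ampersands.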
- Any plans for webdav? This makes moving files in and out of the db easy for non-technical users. I've briefly looked at doing this with milton, apache jackrabbit, and from scratch.
- What do you think about using OSGi to enable plugging in db services? Bundles would be something like dbServer (the core service), tcpServer, restAdminServer, restServer, webdavServer, webAdmin, cliAdmin, monitorServer, clusterServer, etc. This keeps the dbServer core code clean and small, yet allows other services to be developed and evolve independently. It also allows spinning up multiple instances of a server. For example, I could spin up multiple restServers or tcpServers on different ports to segment applications, or multiple webdav servers to partition content management per user/group rights.
- What do you think about moving to github? Looking for ways to grow the BaseX community more quickly.
Good suggestions. Let's see what everyone thinks about them!
Hope you don't mind me jumping in. I see a great product whose continued and expanded development is important to this industry.
You are welcome, Christian
___________________________
Christian Gruen Universitaet Konstanz Department of Computer & Information Science D-78457 Konstanz, Germany Tel: +49 (0)7531/88-4449, Fax: +49 (0)7531/88-3577 http://www.inf.uni-konstanz.de/~gruen
See below... Erik
On Feb 4, 2010, at 9:46 AM, Christian Grün wrote:
Hi Erik,
all of your suggestions are more than welcome – this is why I posted your answer back to the mailing list. I invite all of you to give personal feedback!
A few questions for you about basex and plans...
- Can a database store files other than xml? xqy/xql, images, js, css, etc?
Our storage is currently limited to well-formed XML documents.
Ok, we'll need to think about how to handle packaging apps.
- Any plans to add collections, or maybe BaseX does and I've overlooked it?
Yes, BaseX supports collections – as long as they are flat. Databases can store multiple single documents, which can be accessed e.g. by the XQuery collection() function. See also Section 4.4 in http://basex.org/faq.
Flat meaning one collection of multiple docs per database, right? Just thinking about how that impacts partitioning data...more databases...consistent naming scheme, join queries, etc. I'd like to hear how valuable others think collection hierarchies would be. What's the impact on db design?
- ReST interface: There are two approaches, both have their benefits and I would like to see both available:
a) The rest interface allows manipulating db resources: dbs, collections, files, users, permissions, etc. (The eXist db does this.)
   - The main advantage with this is an API easy to integrate into many environments
b) The rest interface allows requests to be rewritten to a delegate xquery file to handle processing the http methods. (Some other db's provide this.)
   - This allows building complete applications within a database.
   - Would like to see this as the base approach with (a) as an internal implementation of (b)
What are your thoughts?
Interesting ideas. Some of our current interface features are inspired by eXist. While GET and POST can be used to send XQuery/Update requests, PUT and DELETE will allow modifying database instances. Here is a simple example of what a GET request could look like:
http://localhost:8080/rest/db/factbook?query=//country
Additionally, we will allow sending BaseX database commands via GET and POST:
http://localhost:8080/rest/db/factbook?command=optimize
This way, user management can be handled via REST as well. I'm not sure how users are modified in eXist via REST? Which other XML/REST implementations have you worked with?
I've worked with eXist and Mark Logic. I recommend creating a side discussion with a document to collect feedback and review decisions. For instance, how purist about ReST should we be? GET methods don't modify resources, for example. Do we enforce that in the BaseX rest API? How does an application respond with different content (xml, json, other)?
The application could be expected to do that, or we could provide transformation options in the ReST service. I really like keeping the db clean and simple. But some services pushed down "near" the database can really enable projects. I think this is where OSGi could help: providing services in the same JVM as the database, yet independently developed and deployed.
Erik, sorry for the late feedback.
- Any plans to add collections, or maybe BaseX does and I've overlooked it?
[...]
Flat meaning one collection of multiple docs per database, right?
Exactly. Traditionally, we have put our major focus on single, large documents. That's why we decided to keep collections as simple as possible (at least for now). Hierarchical collections could probably be implemented in a rather straightforward way in the backend; conceptual issues would take some more time, however.
- ReST interface: There are two approaches, both have their benefits and I would like to see both available:
[...]
The first beta version of our REST implementation will be released around the end of this month! This might be a good base to discuss further enhancements and changes. We're currently experimenting with two versions, both using Jetty and one of them using JAX-RS.
For instance, how purist about ReST should we be? GET methods don't modify resources, for example. Do we enforce that in the BaseX rest API? How does an application respond with different content (xml, json, other)?
As XQuery Update might be merged with the main XQuery specification in the future, we recently decided to allow XQuery Update via GET and POST. DELETE and PUT will be used to delete/insert collections or documents within collections.
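The verb semantics described here can be summarised as a small lookup table. The Python sketch below is only a restatement of this message, under the assumption that the plan stays as described; `ACTIONS` and `describe` are hypothetical names and not part of any BaseX API.

```python
# Planned verb semantics as described in this message (not a released API).
ACTIONS = {
    "GET": "run an XQuery or XQuery Update expression",
    "POST": "run an XQuery or XQuery Update expression",
    "PUT": "insert a collection or a document within a collection",
    "DELETE": "delete a collection or a document within a collection",
}


def describe(method):
    """Return the planned action for an HTTP method (hypothetical helper)."""
    try:
        return ACTIONS[method.upper()]
    except KeyError:
        # Methods outside the four discussed in the thread are rejected.
        raise ValueError(f"unsupported HTTP method: {method}") from None


for verb in ("GET", "POST", "PUT", "DELETE"):
    print(verb, "->", describe(verb))
```

Since GET is allowed to carry update expressions here, this design deliberately departs from the strict reading of ReST in which GET never modifies a resource.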
All the best, Christian
Hi Christian,
First off, thanks for building and continuing to maintain what I think is one of the most underrated/undiscovered pieces of open source code out there. Now to my question...
We've been using BaseX 5.7 with very heavy use of the backend for a while now (though we've tried hard not to actually change the BaseX code in our use). On 5.7 we use the "insert" methods of the org.basex.data.Data class to enable mutability (i.e., to add new attributes or elements at specific locations in the document). We prototyped our code from the "insert" proc file. I've noticed that both the insert methods we had been using and the whole "insert" proc command are gone in 6.0. Does 6.0 no longer support mutability? I do notice there are some other insert/update methods instead, but no good calls to them to try and understand how they work. What would be a good procedure for inserting new data given that we have both a Nod and Data reference?
Thanks,
Dave
Dear Dave,
First off, thanks for building and continuing to maintain what I think is one of the most underrated/undiscovered pieces of open source code out there. Now to my question...
Thanks for the feedback; don't hesitate to pass this on to everyone…
We've been using BaseX 5.7 with very heavy use of the backend for a while now (though we've tried hard not to actually change the BaseX code in our use). On 5.7 we use the "insert" methods of the org.basex.data.Data class to enable mutability (i.e., to add new attributes or elements at specific locations in the document). We prototyped our code from the "insert" proc file. I've noticed that both the insert methods we had been using and the whole "insert" proc command are gone in 6.0. Does 6.0 no longer support mutability? I do notice there are some other insert/update methods instead, but no good calls to them to try and understand how they work. What would be a good procedure for inserting new data given that we have both a Nod and Data reference?
As XQuery Update has been supported since BaseX 6.0, all update methods in the Data class have been completely revamped and aligned with the logic of the official language specification. If you want more insight into how single update operations work, I recommend checking out the org.basex.query.up.primitives package. For example, InsertAttribute shows how to add new attributes to a DBNode instance.
Hope this helps, Christian
I have two (hopefully) quick questions about when indexes get rebuilt and when they need to be manually rebuilt.
I've noticed that in the ACreate and CreateDB classes, indexes like the full text index are explicitly rebuilt when a new Data instance is created. If one were to directly create a new Data instance using a Builder, would those indexes need to be built after creation and parsing the way they are in the two mentioned classes?
Secondly, I can't find code that updates the indexes on changes to the database. I would assume that performing updates (either through XQuery Update or directly) such as adding a new element should update the corresponding indexes. Where does this happen? Is it incremental (that is, only those parts of the index that need to be changed are modified) or full (all the indexes are totally rebuilt on every change)?
Thanks,
Dave
Hi Dave,
I've noticed that in the ACreate and CreateDB classes, indexes like the full text index are explicitly rebuilt when a new Data instance is created. If one were to directly create a new Data instance using a Builder, would those indexes need to be built after creation and parsing the way they are in the two mentioned classes?
All indexes are optional in BaseX and are mainly used to speed up query execution – so you shouldn't encounter any trouble if they are missing. By default, the path summary, text and attribute indexes are created (as Prop.TEXTINDEX, ATTRINDEX and PATHINDEX are set to true). You can create all of them in a second step (e.g. by running an instance of the CreateIndex class).
Secondly, I can't find code that updates the indexes on changes to the database. I would assume that performing updates (either through XQuery Update or directly) such as adding a new element should update the corresponding indexes. Where does this happen? Is it incremental (that is, only those parts of the index that need to be changed are modified) or full (all the indexes are totally rebuilt on every change)?
Your observation is correct – there are no methods for incremental index updates in BaseX. The main reason for discarding the index structures after updates is that we wanted to keep them as small and fast as possible. This means that indexes have to be rebuilt after updates, e.g. by triggering the OPTIMIZE command (see also http://basex.org/xquery).
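The behaviour described in this reply (indexes are optional, dropped on update, and rebuilt in full on request) can be illustrated with a toy model. The Python sketch below is not BaseX code and makes no claim about its internals; `ToyStore` and its methods are hypothetical names chosen for the illustration, with `optimize` standing in for the OPTIMIZE command.

```python
class ToyStore:
    """Toy model of 'drop the index on update, rebuild on optimize'."""

    def __init__(self, texts):
        self.texts = list(texts)
        self.index = None  # indexes are optional; None means "not built"

    def optimize(self):
        # Full rebuild: map each text value to the positions holding it.
        index = {}
        for pos, text in enumerate(self.texts):
            index.setdefault(text, []).append(pos)
        self.index = index

    def insert(self, text):
        self.texts.append(text)
        self.index = None  # no incremental maintenance: discard the index

    def find(self, text):
        if self.index is not None:
            return self.index.get(text, [])  # fast path via the index
        # Fallback: a sequential scan gives the same result without an index.
        return [pos for pos, t in enumerate(self.texts) if t == text]


store = ToyStore(["Germany", "France"])
store.optimize()
store.insert("Germany")
print(store.index is None)    # True: the update discarded the index
print(store.find("Germany"))  # [0, 2], answered by scanning
store.optimize()
print(store.find("Germany"))  # [0, 2], answered by the rebuilt index
```

The point of the sketch is that queries stay correct with or without the index; only their speed differs until the index is rebuilt.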
Hope this helps - more questions are welcome, Christian
basex-talk@mailman.uni-konstanz.de