On Mon, Nov 14, 2011 at 7:55 PM, Christian Grün <christian.gruen@gmail.com> wrote:

Hi Dave, hi all,

better Java APIs for BaseX - yes, that's a very relevant topic right
now, something that we've frequently been discussing in our team over
the last few weeks. The main challenge we are struggling with is that
there are just too many ways such an API could look, and too many
incoming requests that can hardly be bundled into one single API. Here
are some of the requirements we're dealing with, and the approaches
that could be pursued (...and I already know which of them you would
prefer ;) :

* a new Command and Query/Result API could enhance/replace the
existing light-weight client Java API, and the representation of
results would be separated from the low-level data structures in
BaseX. This API could be used in the client/server architecture as
well, but it would introduce some overhead, as all the data structures
would have to be replicated by the client.

* the new Command, Query and Result objects could also be made
serializable. This way, they could easily be transferred over the
network, and there would be no need to develop custom binary
protocols.

* a real embedded API could ensure that developers do not suffer from
frequent changes in our query and storage backend. Instead, we would
ensure that the API does not change as long as the major version is
not updated. This API would be much more efficient than a
client/server API, but we might have to put more work into
transactional issues.

* the existing XML:DB and XQJ APIs could be revised and updated to
support the client/server architecture. This could reduce the need for
any other client/server-based API with richer functionality.
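
To make the second bullet (serializable Command, Query and Result
objects) a bit more concrete, here is a minimal sketch that uses only
plain Java serialization from the JDK. The classes Query and Result and
the helper roundTrip are hypothetical names chosen for illustration;
they are not part of BaseX, and a real design would have to decide how
items are represented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SerializableApiSketch {
  /** Hypothetical query object: carries only the query string. */
  static final class Query implements Serializable {
    private static final long serialVersionUID = 1L;
    final String text;
    Query(String text) { this.text = text; }
  }

  /** Hypothetical result object: items are kept as serialized strings (an assumption). */
  static final class Result implements Serializable {
    private static final long serialVersionUID = 1L;
    final ArrayList<String> items;
    Result(List<String> items) { this.items = new ArrayList<>(items); }
  }

  /** Writes an object to a byte array and reads it back, as a network transfer would. */
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T obj) {
    try {
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
        out.writeObject(obj);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Query q = roundTrip(new Query("//title"));
    Result r = roundTrip(new Result(Arrays.asList("<title>A</title>")));
    System.out.println(q.text + " -> " + r.items);
  }
}
```

Since such objects carry no references to internal data structures,
the same classes could in principle be used both in-process and over a
socket. The price is the copying overhead mentioned in the first
bullet.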

Everyone who is interested in more powerful APIs: please speak up!
The more feedback we get, the better we'll be able to design our APIs.
And of course we're interested in volunteers out there. Last but not
least, this is an Open Source and community project ;)

@Dave: I've recently added a minimal query API for the QT3TS, Michael
Kay's new W3C XQuery Test Suite. Both the test suite driver and the
mini API (qt3api) are still work in progress:

https://github.com/BaseXdb/basex-tests/tree/master/src/main/java/org/basex/tests/w3c

It is not low-level enough to directly support any axis or update
operations; instead, new QueryProcessor instances are created to
perform queries on intermediate nodes. It would be great if you could
have a look at this API, and it would then be interesting to know more
about your performance requirements: do you think that the overhead
for parsing and compiling query expressions (which usually does not
take longer than a few microseconds, and is often faster than the
actual axis traversals) will be too expensive in your scenario?
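
To illustrate the trade-off in question, here is a small sketch that
contrasts the two routes for a single "next sibling" step. It uses the
DOM and XPath classes of the JDK as a stand-in, not BaseX or the
qt3api, so it says nothing about actual BaseX timings. The class and
method names are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class SiblingStepSketch {
  /** Parses an XML string into a DOM document. */
  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  /** Query route: a path expression is compiled and evaluated for one axis step. */
  static Node nextSiblingByQuery(Node node) {
    try {
      return (Node) XPathFactory.newInstance().newXPath()
          .compile("following-sibling::*[1]")
          .evaluate(node, XPathConstants.NODE);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  /** Direct route: follows the sibling pointers, skipping non-element nodes. */
  static Node nextSiblingDirect(Node node) {
    Node n = node.getNextSibling();
    while (n != null && n.getNodeType() != Node.ELEMENT_NODE) {
      n = n.getNextSibling();
    }
    return n;
  }

  public static void main(String[] args) {
    Document doc = parse("<lib><a/><b/><c/></lib>");
    Node a = doc.getDocumentElement().getFirstChild();
    System.out.println(nextSiblingByQuery(a).getNodeName());
    System.out.println(nextSiblingDirect(a).getNodeName());
  }
}
```

Both methods return the same element. The difference is only that the
first one pays for creating and compiling an expression on every call,
which is the kind of per-step overhead the question above is about.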

If you believe that this framework would be sufficient, we could start
to enhance it, make it safe for concurrent access, document it, etc.
If you need to work with the PRE and ID values of database nodes, you
could, for example, take advantage of the db: functions of BaseX [1]:

Output: db:node-id($node)
Input: db:open-id($db, $id)

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Database_Functions

On Mon, Nov 14, 2011 at 6:26 PM, Dave Glick <dglick@dracorp.com> wrote:
> Hi all,
>
> We’ve been using BaseX for several years now and have constantly been
> skirting around our primary use case: using BaseX in an embedded mode. What
> I mean by this is using BaseX in-process in an application without running
> any kind of client/server communication bridge and with very direct access
> to BaseX primitives. There are several reasons for wanting to do this
> including performance (which seems to be the subject of recent discussions,
> i.e., running the server in “local” mode). My own primary reason is to gain
> more direct access to the database objects. For example, we routinely have a
> need to:
>
> - Directly access and traverse database nodes by climbing, descending,
> following, etc.
>
> - Insert or remove content at a specific database node
>
> - Store references to individual nodes (i.e., using their “pre” and “index”
> values)
>
> - Fine-tune queries in order to set context, external functions, etc.
>
> While many of these operations can indeed be performed through the existing
> client/server interface, it’s less friendly – especially when doing things
> like asking for the next sibling of a given node. With a direct embedded API
> you just get the next node, bypassing the XQuery processor altogether. From
> my current work in this area, I think BaseX is already “primed” for this
> kind of API – 90% or more of the code is already in place since most of the
> primitives already expose common methods for use by database commands,
> XQuery processor, etc. All that should be needed is to expose this
> functionality in a stable and complete API.
>
> Good examples of applications that may need this kind of API include media
> players (i.e., for storage of the media library data), simple stand-alone
> database applications, etc. Until recently, we’ve been able to adapt BaseX
> to fit our needs by writing a thin wrapper layer that interfaces with the
> appropriate BaseX classes. However, with the rapid pace of BaseX development
> these days it’s becoming increasingly difficult to track each release since
> we rely on aspects of the BaseX codebase that are not really intended for
> public consumption and thus keep changing. This brings up a couple
> questions:
>
> - Are we the only ones interested in a direct embedded interface?
>
> - Does the BaseX team have any plans to implement such an interface?
>
> - Would such an interface be better implemented by the BaseX team (as
> opposed to a third party)?
>
> I don’t mind doing some work in this area, however, I have some concerns
> about doing so. Primarily, given that the whole idea would be to make direct
> integration easier and more stable it seems like the structure and layout of
> the classes in the embedded API and the ways that they interact with the
> underlying BaseX objects should probably be determined by the BaseX team.
> The danger is that someone outside the team spends effort creating such an
> interface only to do things in a way that’s either not preferred or
> difficult to maintain as the core team continues to improve the overall
> product.
>
> Hopefully this was clear... Thoughts?
>
> Dave