BaseX steps for Calabash?


Florent Georges

27 Aug 2011

8:25 a.m.

Hi,

Has anyone ever thought about writing extension steps for Calabash, in order to evaluate queries using BaseX? Using either the standalone processor, or to send queries to the server.

Regards,

-- Florent Georges http://fgeorges.org/ http://h2oconsulting.be/


Christian Grün

29 Aug 2011

6:36 p.m.

Hi Florent,

yes, it would be great to have Calabash interconnected with BaseX, as more and more of our users are asking for XProc support. I'm not aware of any current endeavors in this direction, and, unfortunately, our core team has no resources left to put more time into this.

Personally, I'm too poorly informed to assess how much time would have to be spent in the extension, so any community feedback on this is more than welcome.

Best, Christian

Florent Georges

30 Aug 2011

5:18 a.m.

Christian Grün wrote:

Hi,

...

Personally, I'm too poorly informed to assess how much time would have to be spent in the extension, so any community feedback on this is more than welcome.

I guess a proof-of-concept, ad-hoc extension can be written in one day, or even less. But having an industry-ready one is another kind of beast (writing tests, designing the interface, thinking about install and deployment, etc.)

For what it's worth, here is what I am using for now. An XSLT step generates the XQuery (as a mix of text and element nodes). The result is escaped to serialize the elements, saved to the filesystem, and then passed to BaseX (assuming a script called 'basex' is in the PATH):

<p:xslt name="compile">
   ...
</p:xslt>

<p:escape-markup name="escape"/>

<p:store method="text">
   <p:with-option name="href" select="$compiled-file"/>
</p:store>

<p:exec command="basex" name="run">
   <p:with-option name="args" select="$compiled-file"/>
   <p:input port="source">
      <p:empty/>
   </p:input>
</p:exec>

If you don't want to require a 'basex' script in the PATH, you can call Java directly:

<p:exec command="java" name="run">
   <p:with-option name="args" select="
       string-join(
         ('-cp', $basex-jar, 'org.basex.BaseX', $compiled-file),
         ' ')"/>
   <p:input port="source">
      <p:empty/>
   </p:input>
</p:exec>

But you then need to get the path to the BaseX JAR somehow, e.g. by passing it as an option to the pipeline:

<p:pipeline ...>
   <p:option name="basex-jar" required="true"/>
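
To make the argument handling concrete, here is a small Java sketch (not part of the original exchange). It builds the same four arguments as the string-join() call above, then mimics joining them into one string and splitting that string again on a separator. The splitting step reflects my reading of the XProc 1.0 p:exec step, which breaks the args string apart on its arg-separator option (a single space by default). The class and method names are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class BaseXCommandLine {

    /** The arguments passed to 'java', one list element per argument. */
    static List<String> arguments(String basexJar, String queryFile) {
        return Arrays.asList("-cp", basexJar, "org.basex.BaseX", queryFile);
    }

    /**
     * Joins the arguments with the separator and splits the joined string
     * on the same separator again, which is (as assumed here) what happens
     * to the value of the 'args' option.
     */
    static List<String> joinAndSplit(List<String> args, String separator) {
        String joined = String.join(separator, args);
        // Pattern.quote makes the separator literal, e.g. for "|".
        return Arrays.asList(joined.split(Pattern.quote(separator)));
    }
}
```

With a JAR path such as "/opt/my tools/BaseX.jar", joining and splitting on a space yields five arguments instead of four, so the command would break. Under the same assumption, joining with another character (and declaring it as the separator) keeps the path in one piece.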

Both methods have the same problem: they have to save the query on the filesystem. It would be a huge improvement to be able to pass the query without saving it to disk, which I guess would be fairly easy in an extension step in Java, for someone who knows the BaseX API ;-)

But the real challenge is probably designing a clean and usable interface (especially one that deals with the differences between the standalone processor, the server, the client, etc.)

For info, see the attached XSpec test harness I am writing for BaseX.

Hope that helps, regards,

-- Florent Georges http://fgeorges.org/ http://h2oconsulting.be/

Christian Grün

5:22 p.m.

...

I guess a proof-of-concept, ad-hoc extension can be written in one day, or even less. But having an industry-ready one is another kind of beast (writing tests, designing the interface, thinking about install and deployment, etc.) [...] For info, see the XSpec test harness I am writing for BaseX in attachment.

Great to hear that you've already put a lot of thought into this! In fact we'll need someone who has the endurance to tackle all the usual nuts and bolts, but on the other hand we can benefit a lot from first prototypes. Maybe someone on the list who's working with XProc is interested in having a look at the test harness?

Just as a side note: our XQJ API might also be one possible interface to connect XProc with BaseX (although we advise most users to use our own APIs instead).

...

Hope that helps, regards,

It sure helps! Christian

Florent Georges

4 Sep 2011

5:04 p.m.

Christian Grün wrote:


Great to hear that you've already put a lot of thought into this! In fact we'll need someone who has the endurance to tackle all the usual nuts and bolts, but on the other hand we can benefit a lot from first prototypes.

I've just published a blog post at [1] about writing an extension step in Java for Calabash to use BaseX. Using BaseX is just an excuse, and the post shows more how to write an extension and how to glue the various parts together, but I think it is enough to give the big picture.

Calabash also contains extension steps for MarkLogic, which may offer some inspiration for designing steps for other processors; see http://xmlcalabash.com/extension/steps/.

Regards,

-- Florent Georges http://fgeorges.org/ http://h2oconsulting.be/

[1] http://fgeorges.blogspot.com/2011/09/writing-extension-step-for-calabash-to....

Christian Grün

5 Sep 2011

5:54 a.m.

...

I've just published a blog post at [1] about writing an extension step in Java for Calabash to use BaseX. Using BaseX is just an excuse, and the post shows more how to write an extension and how to glue the various parts together, but I think it is enough to give the big picture.

..that's great news! I'll have a closer look at both your blog entry and the MarkLogic extension today.

Christian

Christian Grün

6:18 p.m.

Dear Florent,

thanks again for your blog entry. In the following, I have listed two quick alternatives for evaluating XQuery expressions in BaseX. The first version directly communicates with the XQuery processor of BaseX and caches the serialized byte stream (bypassing the string conversion):

import org.basex.core.Context;
import org.basex.data.Result;
import org.basex.io.serial.Serializer;
import org.basex.query.QueryException;
import org.basex.query.QueryProcessor;
...

@Override
public void run() throws SaxonApiException {
  super.run();

  // Read the query text from the step's input (mySource).
  XdmNode query_doc = mySource.read();
  String query_txt = query_doc.getStringValue();

  // Evaluate the query and serialize the result into a byte buffer.
  ByteArrayOutputStream baos = new ByteArrayOutputStream();
  Context ctx = new Context();
  QueryProcessor qp = new QueryProcessor(query_txt, ctx);
  try {
    Serializer ser = qp.getSerializer(baos);
    Result res = qp.execute();
    res.serialize(ser);
  } catch(QueryException ex) {
    throw new XProcException(ex);
  } catch(IOException ex) {
    throw new XProcException(ex);
  }
  Source src = new StreamSource(new ByteArrayInputStream(baos.toByteArray()));

  // Let Saxon parse the buffered bytes and write the document to myResult.
  DocumentBuilder builder = runtime.getProcessor().newDocumentBuilder();
  XdmNode doc = builder.build(src);
  myResult.write(doc);
}

The second variant communicates with the client/server architecture of BaseX:

import org.basex.core.BaseXException;
import org.basex.server.ClientSession;
...

@Override
public void run() throws SaxonApiException {
  super.run();

  // Read the query text from the step's input (mySource).
  XdmNode query_doc = mySource.read();
  String query_txt = query_doc.getStringValue();

  try {
    // Send the query to a running BaseX server and fetch the result as a string.
    ClientSession cs = new ClientSession("localhost", 1984, "admin", "admin");
    final String result = cs.query(query_txt).execute();
    Source src = new StreamSource(new StringReader(result));

    // Let Saxon parse the string and write the document to myResult.
    DocumentBuilder builder = runtime.getProcessor().newDocumentBuilder();
    XdmNode doc = builder.build(src);
    myResult.write(doc);
  } catch (IOException ex) {
    throw new XProcException(ex);
  } catch (BaseXException ex) {
    throw new XProcException(ex);
  }
}

In both variants, the result is completely serialized before it is passed on to Saxon's node builder. If the intermediate result gets very large, we could try in a second step to merge the serializer and input stream.
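
One possible way to merge the serializer and the input stream is sketched below, using only the JDK (this is an illustration added to the thread, not BaseX or Calabash code). A producer thread writes the serialized result into a java.io.PipedOutputStream while the calling thread parses from the connected PipedInputStream, so only a small buffer is held in memory at any time. The ResultWriter interface and the StreamingBridge class are hypothetical stand-ins: in a real step the writer would wrap the BaseX serialization call, and the JDK DOM parser used here would be replaced by Saxon's DocumentBuilder.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical stand-in for the code that serializes a query result.
interface ResultWriter {
    void writeTo(OutputStream out) throws IOException;
}

// Hypothetical helper: parses the result while it is still being written.
class StreamingBridge {

    static Document parseWhileWriting(ResultWriter writer) {
        final Exception[] writeFailure = new Exception[1];
        try (PipedInputStream in = new PipedInputStream(8192)) {
            PipedOutputStream out = new PipedOutputStream(in);

            // Producer: writes into the pipe; closing it signals end of input.
            Thread producer = new Thread(() -> {
                try (OutputStream o = out) {
                    writer.writeTo(o);
                } catch (Exception ex) {
                    writeFailure[0] = ex;
                }
            });
            producer.start();

            // Consumer: parses on the calling thread as bytes arrive.
            Document doc = null;
            Exception parseFailure = null;
            try {
                doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder().parse(in);
            } catch (Exception ex) {
                parseFailure = ex;
                in.close(); // unblocks a producer waiting for buffer space
            }
            producer.join();

            if (parseFailure != null) {
                IllegalStateException e =
                        new IllegalStateException("parsing failed", parseFailure);
                if (writeFailure[0] != null) e.addSuppressed(writeFailure[0]);
                throw e;
            }
            if (writeFailure[0] != null) {
                throw new IllegalStateException("writing failed", writeFailure[0]);
            }
            return doc;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
    }
}
```

A caller would pass a lambda that writes the serialized bytes, for example `StreamingBridge.parseWhileWriting(out -> out.write(bytes))`. The price of this design is an extra thread and somewhat more involved error handling, which may only pay off for large intermediate results.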

Christian


basex-talk@mailman.uni-konstanz.de
