Hi,
I'm currently converting my project to use BaseX instead of Saxon. In Saxon you can add a flag (an exclamation mark) to a regular expression to tell the matches function to use the Java regular expression processor instead of the more limited syntax defined in the XQuery spec.
Is there anything similar in BaseX?
If not, what would you recommend for defining a function based on Java regular expressions in XQuery?
Thanks in advance, Gary
Hi Gary,
Not directly as a flag, but you can certainly use Java classes from within BaseX, like so: https://gist.github.com/c72fad2758af668eb0f1
More information on our Java Bindings can be found here: http://docs.basex.org/wiki/Java_Bindings
I hope this helps; feel free to ask for more help =)
Michael

On 13.10.2012 at 16:10, The Trainspotter wys01@btinternet.com wrote:
_______________________________________________
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
Hi,
Yes, this helps. I went down a similar route with Saxon before I found the exclamation mark flag. The issue I had was that using the Java bindings in Saxon was very inefficient when calculating matches for a large number of attributes. This may have been my fault for not compiling the regex outside of the "for" statement, but I remember being relieved to find the magic flag to switch to Java regular expressions. I'll try out the GitHub reference and see how I get on.
Is this switch something I could request as an enhancement? If so, could I expect it to be implemented any time soon?
Thanks for your help, Gary
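Regarding the point about compiling the regex outside of the "for" statement: the following is a minimal Java sketch of the difference, not a reconstruction of the Saxon setup described above. The class name, method names and sample values are made up for illustration. `String.matches()` compiles the expression on every call, whereas a `Pattern` compiled once can be reused for every value.

```java
import java.util.List;
import java.util.regex.Pattern;

final class PrecompiledRegexDemo {

    // Variant 1: String.matches() compiles the regex again on every call.
    static long countRecompiling(List<String> values, String regex) {
        long hits = 0;
        for (String value : values) {
            if (value.matches(regex)) {
                hits++;
            }
        }
        return hits;
    }

    // Variant 2: compile once outside the loop and reuse the Pattern.
    static long countPrecompiled(List<String> values, String regex) {
        Pattern pattern = Pattern.compile(regex);
        long hits = 0;
        for (String value : values) {
            if (pattern.matcher(value).matches()) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical attribute values, only for demonstration.
        List<String> attributes = List.of("id-001", "name", "id-042", "ID-7");
        String regex = "id-\\d+";
        System.out.println(countRecompiling(attributes, regex));
        System.out.println(countPrecompiled(attributes, regex));
    }
}
```

Both variants return the same count; only the amount of repeated compilation differs. Whether this was the actual bottleneck in the Saxon case is not something the thread confirms.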
________________________________
From: Michael Seiferle ms@basex.org
To: The Trainspotter wys01@btinternet.com
Cc: "basex-talk@mailman.uni-konstanz.de" basex-talk@mailman.uni-konstanz.de
Sent: Monday, 15 October 2012, 8:58
Subject: Re: [basex-talk] Using full Java regular expressions
On 15.10.2012 at 15:35, The Trainspotter wys01@btinternet.com wrote:
Is this switch something I could request as an enhancement,
Yes
if so could I expect it to be implemented any time soon?
...not necessarily, as I cannot promise that someone will jump on this train anytime soon. But requesting it is certainly the first step towards completion :-))
Hope this helps,
Michael
Hi Gary,
BaseX provides the full XQuery 3.0 regular expression syntax [1,2]; maybe it already contains the features you need for your queries? If not, could you give us a hint which ones you are missing?
While we could add an additional flag to the regex evaluator in BaseX, we are generally hesitant to do so, because it would be yet another vendor-specific extension (supported only by Saxon and BaseX).
Best, Christian
[1] http://www.w3.org/TR/xpath-functions-30/#regex-syntax
[2] http://www.w3.org/TR/xmlschema-2/#regexs
Hi Christian,
The regular expression capability I was missing was word boundary matching with \b. I followed the Java bindings example, so I can now use the Java String.matches() function, which lets me use \b (and other constructs) that are not part of the standard regex syntax. This performs very well, so I think you can hold off on adding another extension.
Cheers, Gary
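For readers following along, here is a small Java sketch of what a `String.matches()` call with `\b` can look like. It is not the content of the gist linked earlier, and the class name, the `containsWord` helper and the sample strings are hypothetical. One detail worth knowing is that `String.matches()` has to match the whole input, so the word is wrapped in `.*` on both sides.

```java
import java.util.regex.Pattern;

final class WordBoundaryMatchDemo {

    // String.matches() must match the WHOLE input, hence the ".*" on both sides.
    // Pattern.quote() keeps regex metacharacters in the word from being interpreted.
    static boolean containsWord(String text, String word) {
        String regex = ".*\\b" + Pattern.quote(word) + "\\b.*";
        return text.matches(regex);
    }

    public static void main(String[] args) {
        // "cat" as a separate word versus "cat" embedded in a longer word.
        System.out.println(containsWord("the cat sat", "cat"));
        System.out.println(containsWord("concatenate", "cat"));
    }
}
```

The first call reports a match and the second does not, because in "concatenate" the letters around "cat" are word characters. Note that `.` does not match line terminators by default, so this sketch assumes single-line values.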
________________________________
From: Christian Grün christian.gruen@gmail.com
To: The Trainspotter wys01@btinternet.com
Cc: "basex-talk@mailman.uni-konstanz.de" basex-talk@mailman.uni-konstanz.de
Sent: Sunday, 21 October 2012, 18:23
Subject: Re: [basex-talk] Using full Java regular expressions
Hi Gary,
Word boundaries are nothing but syntactic sugar in regex engines that support lookahead and lookbehind. According to [1], a word boundary is any of the following positions:
- Before the first character in the string, if the first character is a word character.
- After the last character in the string, if the last character is a word character.
- Between two characters in the string, where one is a word character and the other is not a word character.
This can easily be written as
((?<=\w)(?!\w)|(?<!\w)(?=\w))
This literally expresses only the third rule, but the first two are covered as well, because the start (`^`) and end (`$`) of the string count as "non-word characters" here.
Using non-XQuery functions (such as calling Java from XQuery) will prevent future (hopefully soon) performance optimizations regarding parallel execution, so it is better to stick to XQuery's default regex whenever possible.
Kind regards from Lake Constance, Germany, Jens Erat
[1]: http://www.regular-expressions.info/wordboundaries.html
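The lookaround form of `\b` given above can be checked with a short sketch. It is written in Java because `java.util.regex` supports lookahead and lookbehind; as far as I know, the XQuery 3.0 regex syntax referenced earlier in this thread does not, so the rewrite only helps in engines that do. The class name, method name and sample string are made up for illustration, and the comparison is only meant for plain ASCII text, since the Unicode handling of `\b` and `\w` may differ between Java versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BoundaryEquivalenceDemo {

    // Built-in word boundary.
    static final Pattern BUILT_IN = Pattern.compile("\\b");

    // The lookaround rewrite from the message above.
    static final Pattern LOOKAROUND =
            Pattern.compile("((?<=\\w)(?!\\w)|(?<!\\w)(?=\\w))");

    // Collect the start offset of every zero-length match of the pattern.
    static List<Integer> positions(Pattern pattern, String text) {
        List<Integer> result = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            result.add(matcher.start());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "foo bar, baz"; // hypothetical sample input
        System.out.println(positions(BUILT_IN, text));
        System.out.println(positions(LOOKAROUND, text));
    }
}
```

For this input both patterns report the same offsets, including the start and end of the string, which illustrates why the first two rules do not need separate alternatives.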
The regular expression capability I was missing was the word boundary \b matching. I followed the Java bindings example so I can now use the Java String.matches() function which allows me to use the \b match (and others too) which are not part of the standard regex capability. This performs very well, so I think you can hold off adding another extension.
Sounds fine. By the way, if you would like \b to be included in a future version of XQuery, you are invited to post a feature request on the W3C Bugzilla [1].
[1] https://www.w3.org/Bugs/Public/query.cgi