Support for performing queries on query results

Markus Elfring

1 Apr 2022

6:15 a.m.

Hello,

XQuery scripts can also be used to compute values from selected XML data structures. Query results can then be exported into other file formats. But I would like to reuse query results directly for further queries.

The SQL standard provides named views for this purpose.

How will support for such data processing requirements evolve in XML tools?

Regards, Markus

Christian Grün

1 Apr 2022

7:03 a.m.

Hi Markus,

> XQuery scripts can also be used to compute values from selected XML data structures. Query results can then be exported into other file formats. But I would like to reuse query results directly for further queries.

One common approach is to create volatile database instances from query results (databases in BaseX are pretty lightweight).

What does your script look like?

Best, Christian

Markus Elfring

8:44 a.m.

>> But I would like to reuse query results directly for further queries.
>
> One common approach is to create volatile database instances from query results (databases in BaseX are pretty lightweight).

I would prefer to avoid creating another BaseX database for each query result.

> What does your script look like?

I suggest clarifying the software design consequences of executing queries that determine values for subsequent queries.

I guess that a popular use case is to count items in XML data structures as well. The computed numbers would then be passed to aggregate functions for further data analysis, wouldn't they?

Regards, Markus

Christian Grün

8:47 a.m.

> I would prefer to avoid creating another BaseX database for each query result.

That's fine as well. If you use XQuery, you can bind your query results to variables for further processing.

> I guess that a popular use case is to count items in XML data structures as well. The computed numbers would then be passed to aggregate functions for further data analysis, wouldn't they?

Some more insight into what you’ve done so far would be helpful.
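A minimal sketch of the variable binding described above: the result of a first query is bound to a variable and then passed to aggregate functions, without creating another database. The inline sample document is an assumption made for illustration; its element names follow the ones used later in this thread.

```xquery
(: Illustrative sample data; in practice this would come from a database or file. :)
let $doc :=
  <test_data>
    <product><id>123</id><contributor/></product>
    <product><id>45</id><contributor/></product>
    <product><id>67</id><contributor/></product>
    <product><id>89</id><contributor/><contributor/></product>
  </test_data>
(: First query: one contributor count per product, bound to a variable. :)
let $counts := for $p in $doc/product return count($p/contributor)
(: Further queries reuse the bound result directly. :)
return map {
  'products': count($counts),
  'total': sum($counts),
  'max': max($counts),
  'average': avg($counts)
}
```

The bound sequence `$counts` plays roughly the role that a named view would play in SQL, but it only lives for the duration of the query.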

Markus Elfring

9:15 a.m.

>> I would prefer to avoid creating another BaseX database for each query result.
>
> That's fine as well. If you use XQuery, you can bind your query results to variables for further processing.

I guess this kind of query result binding is expected to work only with XML data structures. Can XQuery scripts be referenced by known file names at such places?

Regards, Markus

Christian Grün

9:22 a.m.

> I guess this kind of query result binding is expected to work only with XML data structures.

You can bind arbitrary XQuery values to variables (XML, strings, numbers, binaries, maps, arrays, function items, and others).
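As an illustration of that point, the following sketch binds several kinds of values to variables and uses each of them afterwards. All names and values are made up for the example.

```xquery
let $node   := <id>123</id>                     (: XML node :)
let $text   := 'contributor'                    (: string :)
let $number := 42                               (: number :)
let $map    := map { 'id': 123, 'count': 1 }    (: map :)
let $array  := [1, '123']                       (: array :)
(: function item :)
let $double := function($n as xs:integer) as xs:integer { $n * 2 }
return (
  string($node),     (: string value of the node :)
  $text,
  $number,
  $map?count,        (: map lookup by key :)
  $array(2),         (: second array member :)
  $double($number)   (: call of the bound function item :)
)
```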

> Can XQuery scripts be referenced by known file names at such places?

Yes, you can e.g. use xquery:eval for that purpose [1]. The result can again be bound to a variable and further processed. It can also be written to a file, sent to any remote source, an SQL database, or wherever you need it. You can regard XQuery as a full information processing language, which includes database core features that resemble SQL.

[1] https://docs.basex.org/wiki/XQuery_Module#xquery:eval
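A small sketch of how xquery:eval might be used, assuming BaseX and its XQuery Module as referenced in [1]. Here the evaluated query is passed as a string; the query text and the variable name `$n` are illustrative assumptions. For evaluating a script stored in a file, the linked documentation should be consulted for how a file reference is passed.

```xquery
(: The query to evaluate, given as a string; $n is declared as external. :)
let $query := 'declare variable $n external; $n * 2'
(: Evaluate it with a binding for $n and keep the result in a variable. :)
let $result := xquery:eval($query, map { 'n': 21 })
(: The result is an ordinary XQuery value and can be processed further. :)
return $result + 1
```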

Markus Elfring

4 Apr 2022

4:28 a.m.

>> Can XQuery scripts be referenced by known file names at such places?
>
> Yes, you can e.g. use xquery:eval for that purpose [1]. The result can again be bound to a variable and further processed. It can also be written to a file, sent to any remote source, an SQL database, or wherever you need it.

Thanks for your hints.

> You can regard XQuery as a full information processing language, which includes database core features that resemble SQL.

I am also trying to become more familiar with the capabilities of the programming language “XQuery” (and corresponding software libraries).

I achieved some data processing results with mutable data structures. Now I am looking more at the construction of data structures by means of functional algorithms. I would appreciate further advice in this design area.

What do you think about performing queries by grouping data from arrays (or maps), besides extracting content from XML subtrees?

Regards, Markus

Christian Grün

4:54 a.m.

> I am also trying to become more familiar with the capabilities of the programming language “XQuery” (and corresponding software libraries).
>
> I achieved some data processing results with mutable data structures. Now I am looking more at the construction of data structures by means of functional algorithms. I would appreciate further advice in this design area.

There are very good books on XQuery. I can recommend the following two:

https://www.oreilly.com/library/view/xquery-2nd-edition/9781491915080/
https://www.tamupress.com/book/9781623498290/

It takes some time to grasp the functional nature of the language, but it’s absolutely worth the time.

> What do you think about performing queries by grouping data from arrays (or maps), besides extracting content from XML subtrees?

With XQuery 3.0, a group by clause was introduced [1, 2]. It can be applied to all data structures including arrays (“sequences” are the most basic data structure in XQuery, though).

[1] https://docs.basex.org/wiki/XQuery_3.0#group_by
[2] https://www.wilfried-grupe.de/XQuery18.html
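To show that group by is not tied to XML nodes, here is a minimal sketch that groups the entries of a map. The map content (product ids mapped to contributor counts) is an assumption made for illustration.

```xquery
(: Illustrative data: product id mapped to its number of contributors. :)
let $contributors := map { '123': 1, '45': 1, '67': 1, '89': 2 }
for $id in map:keys($contributors)
(: The grouping key is the value stored for each id. :)
group by $count := $contributors($id)
order by $count
(: After grouping, $id holds all ids that share the same count. :)
return map { 'count': $count, 'ids': array { sort($id) } }
```

The `sort` call is only there because the order of `map:keys` is implementation-dependent.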

Markus Elfring

7 a.m.

> With XQuery 3.0, a group by clause was introduced [1, 2]. It can be applied to all data structures including arrays (“sequences” are the most basic data structure in XQuery, though).

I am looking for further hints in this software design area.

The following script got parsed.

declare option output:method "csv";
declare option output:csv "header=yes, separator=|";

let $results :=
  for $x in //test_data/product
  return array { fn:count($x/contributor), $x/id/data() }

for $x in $results
let $count := $x[1]
let $incidence := fn:count($count)
group by $count
order by $incidence descending
return
  <csv>
    <record>
      <contributor_count>{$count}</contributor_count>
      <incidence>{$incidence}</incidence>
    </record>
  </csv>

But I stumble on the message “[XPTY0004] Item expected, sequence found: (1, "123").” from the query evaluation.
https://docs.basex.org/wiki/XQuery_Errors#Type_Errors

Do you find the presented data processing approach reasonable (in principle)?

Regards, Markus

Martin Honnen

7:19 a.m.

On 04.04.2022 at 13:00, Markus Elfring wrote:

>> With XQuery 3.0, a group by clause was introduced [1, 2]. It can be applied to all data structures including arrays (“sequences” are the most basic data structure in XQuery, though).
>
> I am looking for further hints in this software design area.
>
> The following script got parsed.
>
> declare option output:method "csv";
> declare option output:csv "header=yes, separator=|";
>
> let $results :=
>   for $x in //test_data/product
>   return array { fn:count($x/contributor), $x/id/data() }
>
> for $x in $results
> let $count := $x[1]

Perhaps you want

$x(1)

here to access the first item in the array $x?
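The difference between the two notations can be sketched as follows. With `$x[1]` the predicate selects by position within the one-item sequence `$x`, so the whole array is returned; when that array is later used as a grouping key it is atomized to `(1, "123")`, which is consistent with the reported XPTY0004 message. The sample array is an assumption that mirrors the values in the error message.

```xquery
let $x := array { 1, '123' }
return (
  (: Predicate: filters the one-item sequence $x by position,
     so the whole array (with two members) comes back. :)
  array:size($x[1]),
  (: Lookup: calls the array as a function and returns its first member. :)
  $x(1)
)
```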

> let $incidence := fn:count($count)
> group by $count
> order by $incidence descending
> return
>   <csv>
>     <record>
>       <contributor_count>{$count}</contributor_count>
>       <incidence>{$incidence}</incidence>
>     </record>
>   </csv>
>
> But I stumble on the message “[XPTY0004] Item expected, sequence found: (1, "123").” from the query evaluation.
> https://docs.basex.org/wiki/XQuery_Errors#Type_Errors
>
> Do you find the presented data processing approach reasonable (in principle)?
>
> Regards, Markus

Markus Elfring

9:27 a.m.

>> for $x in $results
>> let $count := $x[1]
>
> Perhaps you want
>
> $x(1)
>
> here to access the first item in the array $x?

Yes. Thanks for pointing out the typo.

So I adjusted the member access specification. But now I stumble on the message “[XPTY0004] Item expected, sequence found: (1, 1, 1)” from the query evaluation.

Which script fine-tuning will help then?

Regards, Markus

Martin Honnen

9:35 a.m.

On 04.04.2022 at 15:27, Markus Elfring wrote:

>>> for $x in $results
>>> let $count := $x[1]
>>
>> Perhaps you want
>>
>> $x(1)
>>
>> here to access the first item in the array $x?
>
> Yes. Thanks for pointing out the typo.
>
> So I adjusted the member access specification. But now I stumble on the message “[XPTY0004] Item expected, sequence found: (1, 1, 1)” from the query evaluation.
>
> Which script fine-tuning will help then?

How about showing sample input and output data?

I would guess that for the grouping you rather want

declare option output:csv "header=yes, separator=|";

let $results :=
  for $x in //test_data/product
  return array { fn:count($x/contributor), $x/id/data() }

for $x in $results
group by $count := $x(1)

Markus Elfring

10:16 a.m.

> for $x in $results
> group by $count := $x(1)

I also tried the code variant “for $r in $results let $count := $r(1)”. But I wonder why the incidence “1” (only one) is then determined for all record set counters.

Another simple query works as expected.

id|contributor_count
123|1
45|1
67|1
89|2

Thus I would expect output like the following for the discussed query variant.

contributor_count|incidence
1|3
2|1

Regards, Markus

Martin Honnen

11 a.m.

On 04.04.2022 at 16:16, Markus Elfring wrote:

>> for $x in $results
>> group by $count := $x(1)
>
> I also tried the code variant “for $r in $results let $count := $r(1)”. But I wonder why the incidence “1” (only one) is then determined for all record set counters.

As I said, consider providing an input/output sample of what you have and want to achieve; don't make us guess from your failing queries what input data you have and which result you want.

A further guess would be that perhaps

for $x in $results
group by $count := $x(1)
let $incidence := count($x)

might help, given the way XQuery binds variables: that way $incidence would hold the number of items in each group established by the `group by` clause on the `for $x`.
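The binding behaviour behind this suggestion can be sketched with inline data. The array values are assumptions chosen to match the sample table posted in this thread (ids 123, 45, 67 with one contributor, id 89 with two).

```xquery
(: Illustrative stand-in for $results: (contributor count, id) pairs. :)
let $results := ([1, '123'], [1, '45'], [1, '67'], [2, '89'])
for $x in $results
group by $count := $x(1)
(: From here on, $x is the sequence of all arrays in the current group. :)
let $incidence := count($x)
order by $incidence descending
return <record count="{$count}" incidence="{$incidence}"/>
```

A `let` placed before `group by` is evaluated once per input item, whereas a `let` placed after it is evaluated once per group, which is why the position of the `$incidence` binding matters.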

Markus Elfring

11:30 a.m.

> As I said, consider providing an input/output sample of what you have and want to achieve, …

I did that a moment ago.
https://mailman.uni-konstanz.de/pipermail/basex-talk/2022-April/017045.html

A simple query can provide the following output.

id|contributor_count
123|1
45|1
67|1
89|2

These data should be used for another analysis.

contributor_count|incidence
1|3
2|1

This would be the expected output for such a test case.

I became curious whether such a report could also be produced without generating the first table.

Regards, Markus

Tamara Marnell

2:50 p.m.

Hello Markus,

In your query, you're grouping only by $count but also returning $incidence. When you group by $count, you're creating a sequence of unique count values each associated with a sequence of $incidences. To illustrate:

<contributors>{
  for $x in $results
  let $count := $x(1)
  let $incidence := count($count)
  group by $count
  return
    <contributor count="{$count}">{
      for $i in $incidence
      return <incidence>{$i}</incidence>
    }</contributor>
}</contributors>

Will result in:

<contributors>
  <contributor count="1">
    <incidence>1</incidence>
    <incidence>1</incidence>
    <incidence>1</incidence>
  </contributor>
  <contributor count="2">
    <incidence>1</incidence>
  </contributor>
</contributors>

So when you attempt to print out <incidence>{$incidence}</incidence>, XQuery finds a sequence (1, 1, 1) instead of the value you want, 3.

To get the count of $incidence, you could return <incidence>{count($incidence)}</incidence> instead. But since you're not using the IDs at all, placing them in a sequence of arrays first isn't necessary. You can skip that part like:

for $x in //test_data/product
let $count := count($x/contributor)
let $id := $x/id/data()
group by $count
let $incidence := count($id)
order by $incidence descending
return
  <csv>
    <record>
      <contributor_count>{$count}</contributor_count>
      <incidence>{$incidence}</incidence>
    </record>
  </csv>
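For experimenting without a database, the approach can be sketched as one self-contained script. Several things here are assumptions: the inline sample document (built to match the table posted earlier), the use of a single <csv> root with one <record> per group, counting `$x` instead of a separate `$id` variable, and the `output` prefix being usable without a declaration as in the scripts above (which BaseX accepts).

```xquery
declare option output:method "csv";
declare option output:csv "header=yes, separator=|";

(: Illustrative sample data matching the table shown earlier in the thread. :)
let $doc :=
  <test_data>
    <product><id>123</id><contributor/></product>
    <product><id>45</id><contributor/></product>
    <product><id>67</id><contributor/></product>
    <product><id>89</id><contributor/><contributor/></product>
  </test_data>
return
  <csv>{
    for $x in $doc/product
    let $count := count($x/contributor)
    group by $count
    (: $x now holds every product in the group, so counting it gives the incidence. :)
    let $incidence := count($x)
    order by $incidence descending
    return
      <record>
        <contributor_count>{$count}</contributor_count>
        <incidence>{$incidence}</incidence>
      </record>
  }</csv>
```

The intermediate id/count table is never materialized; the grouping works directly on the product elements.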

-Tamara

-- Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org https://www.orbiscascade.org/) Pronouns: she/her/hers

Imsieke, Gerrit, le-tex

3:33 p.m.

I don’t want to do injustice to all the other valuable contributors to this list, but…

Tamara, you are the most valuable contributor of 2022 so far. Not only for your competence, but also for your patience. (Christian will always be (co-) leading in terms of competence and patience, but I think he will agree that your contributions to this list matter a lot already.)

To my surprise and unless I’m overlooking something, you didn’t leave a trace in the archives [1] in previous years.

Thank you for becoming a significant part of this community!

Gerrit

[1] https://mailman.uni-konstanz.de/pipermail/basex-talk/

On 04.04.2022 20:50, Tamara Marnell wrote:

> Hello Markus,
>
> In your query, you're grouping only by $count but also returning $incidence. When you group by $count, you're creating a sequence of unique count values each associated with a sequence of $incidences. To illustrate:
>
> <contributors>{
>   for $x in $results
>   let $count := $x(1)
>   let $incidence := count($count)
>   group by $count
>   return
>     <contributor count="{$count}">{
>       for $i in $incidence
>       return <incidence>{$i}</incidence>
>     }</contributor>
> }</contributors>
>
> Will result in:
>
> <contributors>
>   <contributor count="1">
>     <incidence>1</incidence>
>     <incidence>1</incidence>
>     <incidence>1</incidence>
>   </contributor>
>   <contributor count="2">
>     <incidence>1</incidence>
>   </contributor>
> </contributors>
>
> So when you attempt to print out <incidence>{$incidence}</incidence>, XQuery finds a sequence (1, 1, 1) instead of the value you want, 3.
>
> To get the count of $incidence, you could return <incidence>{count($incidence)}</incidence> instead. But since you're not using the IDs at all, placing them in a sequence of arrays first isn't necessary. You can skip that part like:
>
> for $x in //test_data/product
> let $count := count($x/contributor)
> let $id := $x/id/data()
> group by $count
> let $incidence := count($id)
> order by $incidence descending
> return
>   <csv>
>     <record>
>       <contributor_count>{$count}</contributor_count>
>       <incidence>{$incidence}</incidence>
>     </record>
>   </csv>
>
> -Tamara

-- Gerrit Imsieke Geschäftsführer / Managing Director le-tex publishing services GmbH Weissenfelser Str. 84, 04229 Leipzig, Germany Phone +49 341 355356 110, Fax +49 341 355356 510 gerrit.imsieke@le-tex.de, http://www.le-tex.de Registergericht / Commercial Register: Amtsgericht Leipzig Registernummer / Registration Number: HRB 24930 Geschäftsführer / Managing Directors: Gerrit Imsieke, Svea Jelonek, Thomas Schmidt

Markus Elfring

5 Apr 2022

8:55 a.m.

> So when you attempt to print out <incidence>{$incidence}</incidence>, XQuery finds a sequence (1, 1, 1) instead of the value you want, 3.

Dear Tamara,

I found this test result surprising.

> To get the count of $incidence, you could return <incidence>{count($incidence)}</incidence> instead.

My understanding of the implementation details of FLWOR expressions is still evolving.

> But since you're not using the IDs at all, placing them in a sequence of arrays first isn't necessary.

I imagined that I would need a subquery. Thus I found the specification of an identifier relevant (for a moment).

> You can skip that part like:
>
> for $x in //test_data/product
> let $count := count($x/contributor)
> let $id := $x/id/data()
> group by $count
> let $incidence := count($id)

My knowledge of desirable variable bindings also had room for improvement.

> order by $incidence descending
> return
>   <csv>
>     <record>
>       <contributor_count>{$count}</contributor_count>
>       <incidence>{$incidence}</incidence>
>     </record>
>   </csv>

Thank you very much for demonstrating a report approach that finally works as expected.

Will similar use cases become interesting for further clarification?

Regards, Markus

Eliot Kimber

4 Apr 2022

9:40 a.m.

Markus,

I second Christian’s recommendation of XQuery for Humanists. I found it to be an excellent introductory text for someone coming to XQuery entirely new. Priscilla’s book is more useful as an authoritative reference and guide, but I think XQuery for Humanists is the better entry point.

To answer your specific question: The message indicates that the expression expected a single item (i.e., a node or atomic value) but it got a sequence of items.

One immediate suggestion is to use “as” qualifiers on all your variables and function parameter declarations. This makes it clear what the intended data type is and allows the XQuery interpreter to detect type errors at compile time.

So:

let $x as array(*) := …

let $tokens as xs:string* := tokenize('foo bar')
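Applied to the query discussed in this thread, such type declarations could look like the following sketch. The sample values are assumptions made for illustration.

```xquery
(: One (contributor count, id) pair, declared as an array. :)
let $row as array(*) := array { 1, '123' }
(: The declared type documents the intent: a single integer is expected here. :)
let $count as xs:integer := $row(1)
(: Writing $row[1] above would bind the whole array, which does not match
   xs:integer, so the mismatch is reported at this binding. :)
let $ids as xs:string* := ('123', '45', '67')
return ($count, count($ids))
```

With the declaration in place, the mistake surfaces at the `let` where it was made rather than later in the `group by` clause.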

Cheers,

Eliot
_____________________________________________
Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368
servicenow.com https://www.servicenow.com
LinkedIn https://www.linkedin.com/company/servicenow | Twitter https://twitter.com/servicenow | YouTube https://www.youtube.com/user/servicenowinc | Facebook https://www.facebook.com/servicenow
