Formatting attributes as an indented list?
Greetings!
I've been tasked with using BaseX to produce:

*****
<wg class="cl" rule="S-IO" cltype="VerbElided">
  <wg rule="NpaNp" role="s">
    <wg type="group" appositioncontainer="true" rule="Np-Appos">
      <w ref="PHM 1:1!1"
         after=" "
         class="noun"
         gbiType="proper"
         xml:id="n57001001001"
         lemma="Παῦλος"
         normalized="Παῦλος"
         strong="3972"
         number="singular"
         gender="masculine"
         case="nominative"
         gloss="Paul"
         domain="093001"
         ln="93.294a"
         morph="N-NSM"
         unicode="Παῦλος">Παῦλος</w>
*****

The indenting is easy enough and I can even make it deeper if required but is there a command for serialization that will properly format the attributes?
My personal suspicion is that inserting \n when each attribute is serialized (and not on the last one) is the easier route but I promised to investigate the command line.
Have I overlooked something in the very fine manual?
Hope everyone is having a great week!
Patrick
--
Patrick Durusau
patrick@durusau.net
Technical Advisory Board, OASIS (TAB)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
Another Word For It (blog): http://tm.durusau.net
Homepage: http://www.durusau.net
Twitter: patrickDurusau
Hi Patrick,
There’s currently no serialization parameter to control the custom indentation of attributes.
If I get you correctly, you’d like to get attributes indented if the string length of the element name and the attributes exceed a specific maximum length?
Best, Christian
[In reply to Patrick Durusau <patrick@durusau.net>, Mon, Feb 13, 2023 at 9:10 PM]
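With no built-in parameter available at this point in the thread, the idea from the original post (a line break after every attribute except the last) can be illustrated as a post-processing step outside BaseX. What follows is only a minimal Python sketch under stated assumptions, not BaseX code: the function name `serialize_aligned` is hypothetical, only the standard library is used, and the input is assumed to contain element-only or text-only content (no mixed content, comments, or prefixed names such as `xml:id`).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def serialize_aligned(elem, depth=0, indent="  "):
    """Serialize an element with one attribute per line (hypothetical helper).

    Continuation attributes are aligned under the first attribute.
    Assumes no mixed content, comments, or namespace-prefixed names.
    """
    pad = indent * depth
    open_tag = f"{pad}<{elem.tag}"
    # Column at which the first attribute starts: after the tag name and one space.
    attr_pad = " " * (len(open_tag) + 1)
    attrs = [f"{name}={quoteattr(value)}" for name, value in elem.attrib.items()]
    head = open_tag + (" " + ("\n" + attr_pad).join(attrs) if attrs else "")

    children = list(elem)
    text = (elem.text or "").strip()
    if not children and not text:
        return head + "/>"
    if not children:
        return f"{head}>{escape(text)}</{elem.tag}>"

    lines = [head + ">"]
    lines += [serialize_aligned(child, depth + 1, indent) for child in children]
    lines.append(f"{pad}</{elem.tag}>")
    return "\n".join(lines)


# Abridged version of the sample from the original post.
sample = '<wg class="cl" rule="S-IO"><w ref="PHM 1:1!1" lemma="Παῦλος">Παῦλος</w></wg>'
print(serialize_aligned(ET.fromstring(sample)))
```

Unlike the sample in the original post, this sketch breaks the attributes of every element, including `wg`. Leaving short elements on one line would need an extra rule, such as a length threshold or a list of element names.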
A call from the backbench: I think it would be interesting to have such a serialization option! The esthetic aspect of XML can be important, depending on context. What we get without such an option looks like a heap of information.

<wg class="cl" rule="S-IO" cltype="VerbElided"> <wg rule="NpaNp" role="s"> <wg type="group" appositioncontainer="true" rule="Np-Appos"> <w ref="PHM 1:1!1" after=" " class="noun" gbiType="proper" xml:id="n57001001001" lemma="Παῦλος" normalized="Παῦλος" strong="3972" number="singular" gender="masculine" case="nominative" gloss="Paul" domain="093001" ln="93.294a" morph="N-NSM" unicode="Παῦλος">Παῦλος</w> </wg> </wg></wg>

[In reply to Christian Grün <christian.gruen@gmail.com>, Tuesday, February 14, 2023 at 07:30:46 CET]
Christian,
Ah, no, it isn't a length of element name + attribute but the ability to align attributes for an element, as you see in my post for the <w> element. Each key/value is followed by a line return.
In the meantime, the current version of tidy has been added to the workflow to produce the desired results.
But it would be great to have it native to BaseX!
Thanks!
Patrick
[In reply to Christian Grün, 2/14/23 01:30]
Hi Patrick,
I noticed that the attributes for the wg element had not been aligned, so I was wondering if you were thinking of a more advanced rule.
Or would you possibly like to supply the names of the elements for which the alignment should take place?
Best, Christian
[In reply to Patrick Durusau <patrick@durusau.net>, Wed, Feb 15, 2023, 03:51]
Hi Christian,
Currently, I am using HTML tidy to reformat the XML output. It gives me the formatting I need, which is Git-diff friendly.
Jonathan

$ nodes % tidy --version
HTML Tidy for Apple macOS version 5.6.0
$ nodes % tidy -config tidy.config 03-luke.xml

Sample Output:

<?xml version="1.0"?>
<Sentences>
    <Sentence ref="LUK 1:1!1-1:4!8">
        <Trees>
            <Tree>
                <Node Cat="S"
                      Head="0"
                      nodeId="420010010010421">
                    <Node Cat="CL"
                          Start="0"
                          End="41"
                          Rule="ClCl"
                          Head="0"
                          nodeId="420010010010420">
                        <Node Cat="CL"
                              Start="0"
                              End="33"
                              Rule="ClCl"
                              Head="0"
                              nodeId="420010010010340">
                            <Node Cat="CL"
                                  Start="0"
                                  End="31"
                                  Rule="ClCl2"
                                  Head="1"
                                  nodeId="420010010010320">
                                <Node Cat="CL"
                                      Start="0"
                                      End="22"
                                      Rule="sub-CL"
                                      nodeId="420010010010230">
                                    <Node xml:id="n42001001001"
                                          ref="LUK 1:1!1"
                                          Cat="conj"
                                          Start="0"
                                          End="0"
                                          StrongNumber="1895"
                                          UnicodeLemma="ἐπειδήπερ"
                                          FunctionalTag="CONJ"
                                          Type=""
                                          morphId="42001001001"
                                          NormalizedForm="Ἐπειδήπερ"
                                          Unicode="Ἐπειδήπερ"
                                          FormalTag="CONJ"

tidy.config

add-xml-decl: true
drop-empty-paras: false
fix-backslash: false
fix-bad-comments: false
fix-uri: false
input-xml: true
join-styles: false
literal-attributes: true
lower-literals: false
output-xml: true
preserve-entities: true
quote-ampersand: false
quote-marks: false
quote-nbsp: false
indent: auto
indent-attributes: true
indent-spaces: 4
tab-size: 4
vertical-space: true
wrap: 150
char-encoding: utf8
input-encoding: utf8
newline: CRLF
output-encoding: utf8
quiet: true

[In reply to Christian Grün <christian.gruen@gmail.com>, Wed, Feb 15, 2023 at 3:06 AM]
Hi Jonathan,
Thanks for sharing your tidy settings.
With the given configuration, all attributes except for the first are returned in a separate line…

<wg class="cl"
    rule="S-IO"
    cltype="VerbElided">

In Patrick’s example, some attributes were returned in a single line (possibly depending on the expected string length). Maybe it was generated via Saxon (just a guess):

<wg class="cl" rule="S-IO" cltype="VerbElided">

Do you have a preference which representation would be required, or do you think the details are not that relevant?
We could possibly add a custom serialization parameter similar to tidy’s 'indent-attributes' option, and it would probably be easier to ignore the expected string length.
All the best, Christian
Hi Christian,
I prefer to be able to require one attribute per line. This is important for Git diffs, which are the main reason we care.
Jonathan
[In reply to Christian Grün <christian.gruen@gmail.com>, Wed, Feb 15, 2023 at 11:31 AM]
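The Git-diff argument rests on diffs being computed line by line: when all attributes share one line, changing a single value marks the whole element as changed, whereas with one attribute per line only that attribute shows up. The following is a rough Python illustration using the standard `difflib` module as a stand-in for `git diff` (it is not Git itself). The helper name `changed_lines` is hypothetical, and the sample strings, including the edited gloss value, are made up for the demonstration.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the added/removed lines of a line-based unified diff."""
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0))
    # The first two entries are the '---'/'+++' file headers;
    # lines starting with '@@' are hunk markers and are filtered out below.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Same element, once on a single line and once with one attribute per line.
single_before = '<w ref="PHM 1:1!1" class="noun" gloss="Paul"/>'
single_after = '<w ref="PHM 1:1!1" class="noun" gloss="Paulus"/>'
multi_before = '<w ref="PHM 1:1!1"\n   class="noun"\n   gloss="Paul"/>'
multi_after = '<w ref="PHM 1:1!1"\n   class="noun"\n   gloss="Paulus"/>'

print("single-line layout:")
for line in changed_lines(single_before, single_after):
    print(line)

print("one attribute per line:")
for line in changed_lines(multi_before, multi_after):
    print(line)
```

In the single-line layout the diff repeats every attribute of the element; in the per-line layout it contains only the line holding the `gloss` attribute.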
Hi Jonathan,
I think we can offer you a solution soon. I have created a GitHub issue to document the progress [1].
All the best, Christian
[1] https://github.com/BaseXdb/basex/issues/2174
[In reply to Jonathan Robie <jonathan.robie@gmail.com>, Wed, Feb 15, 2023, 19:05]
Hi Jonathan, hi Patrick,
The new serialization parameter 'indent-attributes' is already available [1]:

(: provided globally :)
declare option output:indent 'yes';
declare option output:indent-attributes 'yes';
<e a='a' b='b' c='c'/>

(: provided locally :)
serialize(
  <e a='a' b='b' c='c'/>,
  map { 'indent-attributes': true(), 'indent': true() }
)

Result:

<e a="a"
   b="b"
   c="c"/>

Thank you to Gunther Rademacher, who contributed the code solution!
A new stable snapshot is available [2]. The serialization parameter may officially be supported with XQuery 4 [3].
Hope this helps, Christian
[1] https://docs.basex.org/wiki/Serialization
[2] https://files.basex.org/releases/latest/
[3] https://github.com/qt4cg/qtspecs/issues/358#issuecomment-1436595401
[In reply to Jonathan Robie <jonathan.robie@gmail.com>, Wed, Feb 15, 2023 at 7:05 PM]
Christian,
Thanks indeed to both you and Gunther Rademacher!
Excellent work!
Patrick
[In reply to Christian Grün, 2/21/23 06:41]
participants (4)
- Christian Grün
- Hans-Juergen Rennau
- Jonathan Robie
- Patrick Durusau