Hi Christian,

Currently, I am using HTML Tidy to reformat the XML output. It gives me the formatting I need, which is Git-diff friendly.

Jonathan


$ nodes % tidy --version
HTML Tidy for Apple macOS version 5.6.0

$ nodes % tidy -config tidy.config  03-luke.xml

Sample Output:

<?xml version="1.0"?>
<Sentences>
    <Sentence ref="LUK 1:1!1-1:4!8">
        <Trees>
            <Tree>
                <Node Cat="S"
                      Head="0"
                      nodeId="420010010010421">
                    <Node Cat="CL"
                          Start="0"
                          End="41"
                          Rule="ClCl"
                          Head="0"
                          nodeId="420010010010420">
                        <Node Cat="CL"
                              Start="0"
                              End="33"
                              Rule="ClCl"
                              Head="0"
                              nodeId="420010010010340">
                            <Node Cat="CL"
                                  Start="0"
                                  End="31"
                                  Rule="ClCl2"
                                  Head="1"
                                  nodeId="420010010010320">
                                <Node Cat="CL"
                                      Start="0"
                                      End="22"
                                      Rule="sub-CL"
                                      nodeId="420010010010230">
                                    <Node xml:id="n42001001001"
                                          ref="LUK 1:1!1"
                                          Cat="conj"
                                          Start="0"
                                          End="0"
                                          StrongNumber="1895"
                                          UnicodeLemma="ἐπειδήπερ"
                                          FunctionalTag="CONJ"
                                          Type=""
                                          morphId="42001001001"
                                          NormalizedForm="Ἐπειδήπερ"
                                          Unicode="Ἐπειδήπερ"
                                          FormalTag="CONJ"

tidy.config:

add-xml-decl: true
drop-empty-paras: false
fix-backslash: false
fix-bad-comments: false
fix-uri: false
input-xml: true
join-styles: false
literal-attributes: true
lower-literals: false
output-xml: true
preserve-entities: true
quote-ampersand: false
quote-marks: false
quote-nbsp: false

indent: auto
indent-attributes: true
indent-spaces: 4
tab-size: 4
vertical-space: true
wrap: 150

char-encoding: utf8
input-encoding: utf8
newline: CRLF
output-encoding: utf8

quiet: true


On Wed, Feb 15, 2023 at 3:06 AM Christian Grün <christian.gruen@gmail.com> wrote:
Hi Patrick,

I noticed that the attributes for the wg element had not been aligned, so I was wondering if you were thinking of a more advanced rule.

Or would you possibly like to supply the names of the elements for which the alignment should take place?

Best,
Christian



Patrick Durusau <patrick@durusau.net> wrote on Wed, Feb 15, 2023, 03:51:
Christian,

Ah, no, it isn't about the length of the element name plus attributes,
but the ability to align attributes for an element, as you see in my
post for the <w> element. Each key/value pair is followed by a line break.
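
To make the layout concrete, here is a minimal Python sketch of that rule: one
attribute per line, with continuation lines padded to the column of the first
attribute. It is only an illustration, not BaseX's serializer or Tidy's
algorithm; the function name format_start_tag and its indent parameter are
hypothetical.

```python
from xml.sax.saxutils import quoteattr


def format_start_tag(name, attrs, indent=0):
    """Return a start tag with one attribute per line.

    Hypothetical helper for illustration only: `attrs` is a dict of
    attribute name -> string value, `indent` is the number of leading spaces.
    """
    prefix = " " * indent + "<" + name
    if not attrs:
        return prefix + ">"
    # Continuation lines start at the column where the first attribute begins.
    separator = "\n" + " " * (len(prefix) + 1)
    # quoteattr escapes the value and wraps it in quotes.
    pairs = [f"{key}={quoteattr(value)}" for key, value in attrs.items()]
    return prefix + " " + separator.join(pairs) + ">"


if __name__ == "__main__":
    print(format_start_tag(
        "w",
        {"ref": "PHM 1:1!1", "after": " ", "class": "noun"},
        indent=20,
    ))
```

The line break goes between attributes rather than after each one, so the
closing ">" stays on the same line as the last attribute, as in the <w> sample.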

In the meantime, the current version of tidy has been added to the
workflow to produce the desired results.

But it would be great to have it native to BaseX!

Thanks!

Patrick

On 2/14/23 01:30, Christian Grün wrote:
> Hi Patrick,
>
> There’s currently no serialization parameter to control the custom
> indentation of attributes.
>
> If I get you correctly, you’d like to get attributes indented if the
> string length of the element name and the attributes exceeds a specific
> maximum length?
>
> Best,
> Christian
>
>
> On Mon, Feb 13, 2023 at 9:10 PM Patrick Durusau <patrick@durusau.net> wrote:
>> Greetings!
>>
>> I've been tasked with using BaseX to produce:
>>
>> *****
>>
>>            <wg class="cl" rule="S-IO" cltype="VerbElided">
>>               <wg rule="NpaNp" role="s">
>>                  <wg type="group" appositioncontainer="true" rule="Np-Appos">
>>                     <w ref="PHM 1:1!1"
>>                        after=" "
>>                        class="noun"
>>                        gbiType="proper"
>>                        xml:id="n57001001001"
>>                        lemma="Παῦλος"
>>                        normalized="Παῦλος"
>>                        strong="3972"
>>                        number="singular"
>>                        gender="masculine"
>>                        case="nominative"
>>                        gloss="Paul"
>>                        domain="093001"
>>                        ln="93.294a"
>>                        morph="N-NSM"
>>                        unicode="Παῦλος">Παῦλος</w>
>>
>> *****
>>
>> The indenting is easy enough and I can even make it deeper if required
>> but is there a command for serialization that will properly format the
>> attributes?
>>
>> My personal suspicion is that inserting \n when each attribute is
>> serialized (and not on the last one) is the easier route but I promised
>> to investigate the command line.
>>
>> Have I overlooked something in the very fine manual?
>>
>> Hope everyone is having a great week!
>>
>> Patrick
>>
>> --
>> Patrick Durusau
>> patrick@durusau.net
>> Technical Advisory Board, OASIS (TAB)
>> Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
>> Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
>>
>> Another Word For It (blog): http://tm.durusau.net
>> Homepage: http://www.durusau.net
>> Twitter: patrickDurusau
>>