World: r3wp

[XML] xml related conversations

Graham
22-Jun-2009
[608]
Now has anyone written a recursive routine to turn a rebol object 
into XML?  I couldn't find anything like this on rebol.org yet it 
doesn't sound hard to do ...
Gregg
22-Jun-2009
[609]
There must be something, but I don't have anything here that turned 
up, and I don't remember doing one myself. If it helps, you could 
use the JSON converter in %json.r as a starting point.
Graham
22-Jun-2009
[610x5]
This seems to work for me ...

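obj2xml: func [ obj [object!] out [string!]
	/local o 
][
	; first obj lists the object's words; next skips the leading self
	foreach element next first obj [
		; opening tag
		repend out [ to-tag element newline ]
		either object? o: get in obj element [
			; nested object: recurse and append to the same string
			obj2xml o out
		][
			; plain value
			repend out [ o newline ]
		]		
		; closing tag
		repend out [ to-tag join "/" element newline ]
	]
]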
using this 

>> probe obj
make object! [
    a: "testing"
    b: "again"
    c: make object! [
        d: "testing2"
        e: "again2"
        f: make object! [
            g: "testing3"
            h: "again3"
        ]
    ]
    i: "finished"
]
gives this 

<a>
    testing
</a>
<b>
    again
</b>
<c>
    <d>
        testing2
    </d>
    <e>
        again2
    </e>
    <f>
        <g>
            testing3
        </g>
        <h>
            again3
        </h>
    </f>
</c>
<i>
    finished
</i>
Steeve
22-Jun-2009
[615]
Hmm..
Really, do you get the tabulations?
Graham
22-Jun-2009
[616x2]
Yes, separate script does the tabulations
probably should change line 
repend out [ o newline ]
to 
repend out [ any [ o copy "" ] newline ]
Steeve
22-Jun-2009
[618]
then, for NONE! values, it will add an empty line
Graham
22-Jun-2009
[619x3]
format-xml: func [ xml
    /local out spacer prev
][
    out: copy ""
    spacer: copy ""
    prev: copy </tag>
    foreach tag load/markup xml [
        either tag = find tag "/" [
            ; we have a close tag
            

            ; reduce the spacer by a tab unless the previous was an open tag
            either not tag? prev [
                ; not a tag
                remove/part spacer 4
            ][
                ; is a tag
                if prev = find prev "/" [
                    ; last was a closing tag
                    remove/part spacer 4
                ]
            ]
        ][ 
            either tag? tag [
                ; current is tag
                ; indent only if the prev is not a closing tag
                if not prev = find prev "/" [
                    insert/dup spacer " " 4
                ]
            ][
                ; is data
                insert/dup spacer " " 4 
            ]
        ]
        repend out rejoin [ spacer tag newline ]
        prev: copy tag
    ]
	view layout compose [ area (out) 400x400 ]
]

obj2xml: func [ obj [object!] out [string!]
	/local o 
][
	foreach element next first obj [
		repend out [ to-tag element ]
		either object? o: get in obj element [
			obj2xml o out
		][
			repend out [ any [ o copy "" ] ]
		]		
		repend out [ to-tag join "/" element ]
	]
]
remove the newlines to solve that issue :)
I was using rebelxml to construct xml ... but I came across some 
bugs.  So this way of doing it looks easier ....
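A minimal sketch of the newline-free version with the NONE! handling discussed above, runnable on its own. It assumes REBOL 2 (FIRST on an object is not available in R3); the sample object and the trailing OUT return value are additions for illustration, not part of the posted code.

```rebol
; Sketch only, assumes REBOL 2.
obj2xml: func [obj [object!] out [string!] /local o] [
    ; FIRST on an object lists its words; NEXT skips the leading SELF
    foreach element next first obj [
        repend out [to-tag element]
        either object? o: get in obj element [
            obj2xml o out                   ; nested object: recurse
        ][
            repend out [any [o copy ""]]    ; NONE! becomes an empty string
        ]
        repend out [to-tag join "/" element]
    ]
    out     ; returned for convenience (an addition, not in the original)
]

; hypothetical sample data
sample: make object! [
    a: "testing"
    b: none
    c: make object! [d: 42]
]

print obj2xml sample copy ""
; prints <a>testing</a><b></b><c><d>42</d></c>
```

Values containing < or & are written as-is here, so real data would still need escaping before the result is valid XML.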
Graham
23-Jun-2009
[622]
What are people using to construct large XML documents ... of 100s 
of lines?
Maxim
23-Jun-2009
[623x4]
a modified version of John's rebXML tools. I changed the output structure 
to allow rebol's path notation to be used to traverse the loaded 
xml.
I also replaced the use of url for the tag words because they fail 
when using namespaced xml elements.
building output objects instead would be simple, but the RAM/speed/symbol 
table implications of all the binding involved make this suboptimal.
(...objects instead of blocks ....)
Graham
23-Jun-2009
[627]
So, you used blocks instead of objects?
Maxim
23-Jun-2009
[628x2]
yes, all the time. accessing is exactly the same as for objects. 
it's actually much more flexible.
because you can have the same element several times, which is valid 
xml, but invalid in contexts.
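A small sketch of the point about duplicates, assuming REBOL 2 and a flat name/value block (the data and the word names AS-BLOCK, AS-OBJECT and ITEMS are made up for illustration). Path access on a block behaves like SELECT, so it only reaches the first match; an object cannot hold the second ITEM at all.

```rebol
; hypothetical data: the element name ITEM appears twice
as-block: [item "first" item "second" title "list"]

print as-block/title      ; list
print as-block/item       ; first   (only the first match is reachable by path)

; collecting every ITEM needs a loop over the name/value pairs
items: copy []
foreach [name value] as-block [
    if name = 'item [append items value]
]
probe items               ; ["first" "second"]

; in an object the second ITEM: simply overwrites the first
as-object: make object! [item: "first" item: "second" title: "list"]
print as-object/item      ; second
```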
Graham
23-Jun-2009
[630]
True
Maxim
23-Jun-2009
[631]
and you can easily separate attributes from elements, just by assigning 
them different types.
Graham
23-Jun-2009
[632x3]
Although the XML I'm dealing with doesn't have duplicate elements.
Or name spaces
Have you posted your modifications anywhere??
Maxim
23-Jun-2009
[635x3]
the most stable engine I built, which accepted all xml possibilities, 
ended up loading xml like so:

[ <element> [<subelement> [#attribute "attr-value" . "subelement content"]]]

the . is assigned the value of the element. 

the above would result from the following XML:
<element>
	<subelement attribute="attr-value">
		subelement content
	</subelement>
</element>
And you can access it this way:

document/<element>/<subelement>/#attribute
document/<element>/<subelement>/.
Graham
23-Jun-2009
[638x2]
I find working with objects much easier though ...
I guess the duplicate elements could be solved by using blocks for 
them
Maxim
23-Jun-2009
[640]
I've never posted that specific version because it was closed source 
for a client, but I have my own new engine, which does the same 
but attacks the parse rules directly... it's probably faster.
I've not released it.
BrianH
23-Jun-2009
[641]
Really? I went positional:

["element" "namespace" ["attribute" "value"] ["subelement" ...] "text" ...]

with missing namespace or attribute block being #[none], so defaults 
can be done with ANY.
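A rough sketch of reading one node in that positional layout, assuming the slots are name, namespace, attribute block, then children. The node, the default strings and the word names are hypothetical; #[none] is used so the empty slot holds a real NONE! value and not the word NONE.

```rebol
; hypothetical node: name, namespace, attributes, then children
node: ["subelement" #[none] ["attribute" "attr-value"] "subelement content"]

name:       first node
namespace:  any [second node "no-namespace"]   ; default when the slot is NONE
attrs:      any [third node copy []]           ; default to an empty attribute block
attr-value: select/case attrs "attribute"      ; /case because XML names are case-sensitive
children:   at node 4                          ; subelements and text, in document order

print [name namespace attr-value]   ; subelement no-namespace attr-value
probe children                      ; ["subelement content"]
```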
Graham
23-Jun-2009
[642]
Using tags looks ugly :)
Maxim
23-Jun-2009
[643]
note that in the above, you can replace types within so it could 
be words instead of tags.
BrianH
23-Jun-2009
[644]
My positional version handles multiple subelements of the same type, 
and using strings rather than words lets you use tags that don't 
match word syntax or are case-sensitive.
Maxim
23-Jun-2009
[645x2]
it's just that in my tests, you can't create, read, or set some 
of the datatypes via path notation, so only string-based types allow 
full XML qualification.
hahaha
Graham
23-Jun-2009
[647]
XML can be case sensitive??
Maxim
23-Jun-2009
[648]
but brian, how do you access it?
Graham
23-Jun-2009
[649]
by position!
BrianH
23-Jun-2009
[650x2]
XML *is* case-sensitive. Your paths can't access multiple subelements 
of the same type, or embedded text.
I wrote a simple xpath compiler too (but don't know where it is now).
Maxim
23-Jun-2009
[652]
I wanted direct access to all elements within rebol.
Graham
23-Jun-2009
[653]
Looks like we need an article on best practices here ...
Maxim
23-Jun-2009
[654]
a later version, using schema validation, understands multiple subelements 
and automatically converts them to blocks IIRC.

so you do document/element/3/subelement/#attribute.
BrianH
23-Jun-2009
[655]
Your paths can't access multiple subelements of the same type, or 
embedded text. It might have worked for that customer but not the 
general case. No namespace support either.
Maxim
23-Jun-2009
[656]
my paths.. namespace works... for sure. did you know you can have a 
colon in word names in R2! but I didn't use that, I just used 
tags directly. more obvious than strings, and the exact same effort 
and speed.
BrianH
23-Jun-2009
[657]
I was parsing xhtml and other XML of the like. Subelements of mixed 
types, in order, with text between them that mattered.
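To illustrate why order matters for mixed content, here is a sketch that walks a positional node (name, namespace, attributes, then children, as assumed from the layout described earlier) and concatenates its text in document order. TEXT-OF and the sample paragraph are hypothetical names for illustration, not code from the thread.

```rebol
; hypothetical xhtml-like fragment: <p>Some <b>bold</b> text</p>
para: ["p" #[none] #[none] "Some " ["b" #[none] #[none] "bold"] " text"]

; gather the text of a node and its subelements, in document order
text-of: func [node [block!] /local out] [
    out: copy ""
    ; children start at position 4; strings are text, blocks are subelements
    foreach child at node 4 [
        append out either block? child [text-of child] [child]
    ]
    out
]

print text-of para   ; Some bold text
```

A name-keyed structure would have to flatten or reorder the text around the b element, which is the case the positional layout is meant to keep intact.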