World: r3wp

[XML] xml related conversations

Robert
2-Jan-2010
[859]
Wouldn't it make a lot more sense to use a C based XML parser, construct 
a Rebol data-structure/string and return that to Rebol?
Geomol
2-Jan-2010
[860]
Janko, rebxml is a rebol version of xml. It can do the same things, 
but without the bad implementation that xml suffers from. The idea 
behind xml is ok, it's just not implemented well. Much of that is 
solved with the rebxml format.
Gregg
2-Jan-2010
[861]
I believe Maarten has done a SAX style parser.  I've used parse-xml 
in the past, sometimes post-processing the output to a different 
REBOL form, but my needs were simple.


Janko, have you tested any of the existing solutions, with test 
input on target hardware, and found them to be too slow? If so, what 
were the results, and how fast do you need it to be?
BrianH
2-Jan-2010
[862]
SAX pull parsing would work well with the port model.
Janko
3-Jan-2010
[863x2]
Robert: it's a good idea but not for my case. I don't want the data 
structure from the whole xml, I want to stream it through the parser 
and collect out the data. 

Geomol: I will look at it but probably not what I want in this particular 
case for the reason above

Gregg: I haven't tested any yet, I googled and found that xml-parse.r 
above, which has a sax style of work but seems huge. I only care to 
support a simplified subset of xml. xml with all the variants is 
total bloat, so I believe it can be that complex (and it doesn't 
support 100% of it either). That's why I am considering writing a 
simple sax-like parser. I wrote one in c once and it was small (but it 
parsed an even smaller subset of xml)
BrianH: What does that mean "port model"?
BrianH
3-Jan-2010
[865]
The semantic model of REBOL protocol schemes, implemented with the 
port! type, would fit well with the semantic model of SAX pull. SAX 
pull generates the same SAX events, except they are not propagated 
through callbacks - instead they are returned from function calls. 
SAX pull is sort of like a generator (in the Icon or Python sense) 
of SAX events. That is very similar in model to the behavior of command 
ports (like database ports).
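
A minimal sketch of the pull style described above, in Python rather than REBOL, using the standard library's `xml.etree.ElementTree.XMLPullParser`. The sample document and the `pull_events` name are made up for illustration; this is not a port scheme, only the "events come back from a call instead of through callbacks" idea.

```python
import xml.etree.ElementTree as ET

def pull_events(chunks):
    """Generator of (event, tag) pairs; the caller pulls them one at a time."""
    parser = ET.XMLPullParser(events=("start", "end"))
    for chunk in chunks:
        parser.feed(chunk)  # hand the parser whatever input has arrived
        for event, elem in parser.read_events():
            yield event, elem.tag  # returned to the caller, no callback involved
    parser.close()
    # Anything the parser was still holding back becomes readable after close().
    for event, elem in parser.read_events():
        yield event, elem.tag

# Hypothetical input, split at arbitrary points to mimic data arriving in pieces.
chunks = ["<orders><order id='1'>", "<item>tea</item></order>", "</orders>"]
events = list(pull_events(chunks))
```

The caller decides when to ask for the next event, which is roughly how reading from a command port works: each read returns the next result.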
Pekr
4-Jan-2010
[866]
I like the SAX model, because IIRC it allows you to work on things in 
a "streamed" way, whereas DOM requires you to load everything in 
memory? Sorry if I oversimplified it :-) IIRC Doc used such an approach 
in his PostgreSQL driver, as opposed to his MySQL one ...
Dockimbel
4-Jan-2010
[867]
It's a matter of tradeoff, if you only need fast XML document reading, 
SAX is the winner. If you need to modify the document, you need DOM 
(with or without SAX).
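
The tradeoff can be sketched with Python's standard library (again Python, not REBOL; the document and the `ItemCollector` name are invented for the example). The SAX handler keeps only what it collects, while the tree version holds the whole document so that it can be edited and written back.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = ("<orders><order id='1'><item>tea</item></order>"
       "<order id='2'><item>jam</item></order></orders>")

class ItemCollector(xml.sax.ContentHandler):
    """SAX: collect <item> text as events stream past; no tree is built."""
    def __init__(self):
        super().__init__()
        self.items = []
        self._buf = None  # a list only while we are inside an <item>

    def startElement(self, name, attrs):
        if name == "item":
            self._buf = []

    def characters(self, content):
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._buf))
            self._buf = None

collector = ItemCollector()
xml.sax.parseString(DOC.encode("utf-8"), collector)

# DOM style: the whole tree is in memory, so it can be modified and serialized.
root = ET.fromstring(DOC)
root.find("order[@id='2']/item").text = "honey"
edited = ET.tostring(root, encoding="unicode")
```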
james_nak
11-Oct-2010
[868]
Does anyone know if there is a rebol object to xml script? I've got 
xml to rebol objects but now I want to change it back to xml. (and 
I'm lazy)
GrahamC
11-Oct-2010
[869]
Lazy evaluation is useful.. a lazy programmer not so!
Maxim
12-Oct-2010
[870]
I use blocks. Although a bit slower to access, they are faster for 
big loads because they require less ram and do not require binding, 
which is a big issue on large XML blocks.
james_nak
12-Oct-2010
[871]
Yeah, what I am trying to do is convert back to XML after I've done 
my thing.
Maxim
12-Oct-2010
[872x2]
well, going to xml is easy, no?
how are your objects structured?
james_nak
12-Oct-2010
[874]
Sorry for the delay. They are nested objects that represent the tags 
they were created from. I think the answer is that I will just have 
to create the routines to do what I wanted. I thought that perhaps 
there was something already out there. Thanks.
Maxim
12-Oct-2010
[875]
might find some inspiration in the JSON converters ?
james_nak
12-Oct-2010
[876]
Yes, if I run into any problems I will look into those.
Thanks Maxim. That renote app is really cool, btw.
Maxim
12-Oct-2010
[877]
thx it will improve about once a week.
Oldes
13-Oct-2010
[878]
It depends on what your input is and how the output should look, but 
you can use something like this:
context [
	xml:  copy ""
	tabs: copy ""
	set 'to-xml func[node /init][
		if init [
			xml:  copy ""
			tabs: copy ""
		]
		switch/default type?/word node [
			object! [
				append tabs #"^-"
				foreach child next first node [
					append xml rejoin [tabs "<" child ">^/"]
					to-xml node/(child)
					append xml rejoin [tabs "</" child ">^/"]
				]
				remove tabs
			]
		][
			append xml rejoin [
				tabs "<" type? node ">" node "</" type? node ">^/"
			]
		]
		xml
	]
]
o: context [
	person: context [
		name: "bla"
		age:  1
	]
]


print rejoin [
	"<o>^/"
		to-xml o
	"</o>"
]
james_nak
13-Oct-2010
[879]
Thanks Oldes.
GrahamC
13-Oct-2010
[880x3]
this is something I wrote a couple of years back ... maybe it will 
help

obj2xml: func [obj [object!] out [string!]
	/local o
] [
	foreach element next first obj [
		; skip members that are functions or none
		if all [not function? o: get in obj element o] [
			repend out [to-tag element]
			either object? o [
				obj2xml o out
			] [
				repend out any [o copy ""]
			]
			repend out [to-tag join "/" element]
		]
	]
	out
]
If there's a function in the object, it drops it.
posted last year to this group!  http://www.rebol.org/aga-display-posts.r?post=r3wp323x568
james_nak
14-Oct-2010
[883]
Thanks Graham. With the way that the programs I am using create the 
objects, I'm finding that I have to create something pretty specific. 
I appreciate the thought and code though.
GrahamC
3-Nov-2010
[884]
Is John's the only rebol utility that turns a rebol representation 
back into an xml document with attributes?
Maxim
3-Nov-2010
[885]
check your PMs... (it might not have turned red.... altme bug)
Maxim
10-Nov-2010
[886x2]
A question for XML users related to namespaces.


is it possible for a tag's attributes to originate from two different 
namespaces?

ex:
<tag  ns1:attr="data" ns2:other-attr="data">

or even worse:
<tag  ns1:attr="data" ns2:attr="data">


my gut tells me no, but I've been wrong before in this delightful 
world of XML spec overcomplexification.
FYI, I've just discovered that yes... you can have the same attribute 
several times in a tag so long as the namespace is different.

XML is ... so ... much ... fun....


NOT!
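
A small check of this with Python's `xml.etree.ElementTree` (a sketch; the namespace URIs are made up, only the `ns1`/`ns2` prefixes come from the example above). What keeps the two attributes apart is the namespace URI each prefix is bound to, not the prefix text itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs bound to the ns1/ns2 prefixes.
doc = ('<tag xmlns:ns1="urn:example:one" xmlns:ns2="urn:example:two" '
       'ns1:attr="data1" ns2:attr="data2"/>')

elem = ET.fromstring(doc)

# ElementTree keys attributes as "{namespace-uri}local-name", so both survive.
attrs = dict(elem.attrib)
```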
Gregg
10-Nov-2010
[888]
I don't like XML, but it makes sense that namespaces prevent collisions.
Maxim
10-Nov-2010
[889]
yes, it's just pretty complicated to manage two attributes of a tag 
which come from different namespaces... so in the end, what does 
the attribute really mean?
Gregg
10-Nov-2010
[890]
Well, if the words had bindings...oh wait, wrong language. ;-)
Maxim
10-Nov-2010
[891]
hehe, yes it is similar to advanced word usage in REBOL.  but xml 
isn't really a language in my understanding (interpretation) of the 
word.
Steeve
10-Nov-2010
[892]
just a dialect with a bad messy syntax...
Oldes
13-Nov-2010
[893x3]
I just created this function to convert the data tree returned from 
REBOL's default parse-xml function back to the same string:
context [
	out: copy ""
	emitxml: func[dom][
		foreach node dom [
			either string? node [
				out: insert out node
			][
				foreach [ name atts content ] node [
					out: insert out join {<} [name #" "]
					if atts [
						foreach [att val] atts [ 
							out: insert out ajoin [att {="} any [val ""] {" }]
						]
					]
					out: remove back out
					
					either all [content not empty? content] [
						out: insert out #">"
						emitxml content
						out: insert out ajoin ["</" name #">"]
					][
						out: insert out "/>"
					]
				]
			]
		]
	]
	set 'xmltree-to-str func[dom][
		out: clear head out  ; reset the insert position along with the buffer
		emitxml dom
		head out
	]
]
>> xmltree-to-str third parse-xml {<test arg="1"><bla/>hello</test>}
== {<test arg="1"><bla/>hello</test>}
I'm not sure if the naming is correct, but I don't care, I need the 
functionality.
GrahamC
13-Nov-2010
[896]
Does it handle name spaces?
Oldes
14-Nov-2010
[897x5]
do you mean this?:

>> print xmltree-to-str third parse-xml {<h:table xmlns:h="http://www.w3.org/TR/html4/">
{      <h:tr>
{        <h:td>Apples</h:td>
{        <h:td>Bananas</h:td>
{      </h:tr>
{    </h:table>}
<h:table xmlns:h="http://www.w3.org/TR/html4/">
  <h:tr>
    <h:td>Apples</h:td>
    <h:td>Bananas</h:td>
  </h:tr>
</h:table>
But REBOL's default parse-xml has limitations, so better use Gavin's 
http://www.rebol.org/view-script.r?script=xml-parse.r if you must 
parse some advanced XML docs.
Also it's probably possible to use parse instead of the recursive 
function call, but it's working for me so I will stay with this one.
Just note, that if the source xml is using CDATA, the parsed tree 
does not contain this info, so the result would be different.

>> print xmltree-to-str probe third parse-xml+ {<foo>abc <![CDATA[Jack 
& Jill]]> xyz</foo> }
[["foo" none ["abc Jack & Jill xyz"]]]
<foo>abc Jack & Jill xyz</foo>
I'm not sure how to avoid this, but fortunately my XML sources must 
be well formed without CDATA so I'm safe.
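
For comparison, a sketch of the same round trip with Python's `xml.etree.ElementTree`. It also loses the CDATA marker, but its serializer escapes `&` on output, so the result is still well formed. Escaping `&`, `<` and `>` when emitting text would be one possible way around the problem above; that is a suggestion, not something xmltree-to-str does.

```python
import xml.etree.ElementTree as ET

src = "<foo>abc <![CDATA[Jack & Jill]]> xyz</foo>"
elem = ET.fromstring(src)

text = elem.text  # the CDATA section is merged into ordinary text
out = ET.tostring(elem, encoding="unicode")  # "&" is escaped when written back

# The escaped output parses back to the same text.
roundtrip = ET.fromstring(out).text
```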
Chris
23-Nov-2010
[902]
Graham, AltXML dom objects have a 'flatten function that renders 
xml.  It preserves namespaces but not whitespace or cdata as cdata 
(though I may do a strict version that does both).
BrianH
11-Mar-2011
[903]
Finally, the work of the W3C binary XML group is an official recommendation: 
http://www.w3.org/TR/exi/
onetom
30-Apr-2011
[904x2]
just an interesting fact: the Relax NG Compact schema description 
language is a Rebol dialect if i remove the commas from it.
i tried it on the OAGiS Components.xsd which i transformed to RNG 
and then to RNC using XMLKit
Maxim
2-May-2011
[906:last]
anyone here had issues with receiving Form feed characters in XML 
(which are illegal in XML 1.0) ?
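
One common workaround is to strip such characters before parsing. A minimal Python sketch, assuming that dropping them is acceptable for the data; the character class follows the `Char` production of XML 1.0, and `strip_illegal_xml_chars` is a made-up name.

```python
import re

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def strip_illegal_xml_chars(text, replacement=""):
    """Remove (or replace) characters that XML 1.0 does not allow."""
    return _ILLEGAL_XML_CHARS.sub(replacement, text)

# Form feed (\x0c) is dropped; tab, newline and normal text are kept.
cleaned = strip_illegal_xml_chars("page one\x0cpage two\n")
```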