World: r3wp
[XML] xml related conversations
Graham 15-Aug-2009 [850] | Sounds like too much overhead ... unzip the docx, make changes to the xml portion and then rezip. |
Janko 2-Jan-2010 [851] | I will need an xml parser .. I was thinking of something fast and quick, sax style .. I found this one http://www.rebol.org/view-script.r?script=xml-parse.r but by the look of it, it seems to offer a lot of things I don't need. Has anyone used it for "serious" xml parsing? I am thinking of making my own simple, minimal event-based xml parser. |
Graham 2-Jan-2010 [852x2] | Yes, I have used it to parse large XML files |
You can turn the xml file into a rebol object with it | |
Janko 2-Jan-2010 [854] | I imagine that is too costly .. I prefer the callback model to just extract the relevant data out |
Graham 2-Jan-2010 [855] | Mine is a desktop application .. your needs for a web service differ .. |
Janko 2-Jan-2010 [856] | yes, I get a big xml made by "official" BLOATED standard for invoices .. I want to parse it as quick as possible and that's all |
Geomol 2-Jan-2010 [857] | Janko, http://www.fys.ku.dk/~niclasen/rebxml/rebxml-spec.html http://www.rebol.org/view-script.r?script=xml2rebxml.r http://www.rebol.org/view-script.r?script=rebxml2xml.r |
Janko 2-Jan-2010 [858] | thanks Geomol, I will study the links .. xml2rebxml seems short which is nice, but I haven't yet figured out what exactly rebxml is .. I am reading the first link you gave me |
Robert 2-Jan-2010 [859] | Wouldn't it make a lot more sense to use a C based XML parser, construct a Rebol data-structure/string and return that to Rebol? |
Geomol 2-Jan-2010 [860] | Janko, rebxml is a rebol version of xml. It can do the same things, but without the bad implementation xml suffers from. The idea behind xml is ok, it's just not implemented well. Much of that is solved with the rebxml format. |
Gregg 2-Jan-2010 [861] | I believe Maarten has done a SAX style parser. I've used parse-xml in the past, sometimes post-processing the output to a different REBOL form, but my needs were simple. Janko, have you tested any of the existing solutions, with test input on target hardware, and found them to be too slow? If so, what were the results, and how fast do you need it to be? |
BrianH 2-Jan-2010 [862] | SAX pull parsing would work well with the port model. |
Janko 3-Jan-2010 [863x2] | Robert: it's a good idea but not for my case. I don't want the data structure from the whole xml, I want to stream it through the parser and collect out the data. Geomol: I will look at it, but it's probably not what I want in this particular case, for the reason above. Gregg: I haven't tested any yet. I googled and found the xml-parse.r above, which works sax style but seems huge. I only care to support a simplified subset of xml; xml with all the variants is a total bloat, so I can believe a parser for it gets that complex (and it doesn't support 100% of it either). That's why I am considering writing a simple sax-like parser. I wrote one in c once and it was small (but it parsed an even smaller subset of xml) |
BrianH: What does that mean "port model"? | |
BrianH 3-Jan-2010 [865] | The semantic model of REBOL protocol schemes, implemented with the port! type, would fit well with the semantic model of SAX pull. SAX pull generates the same SAX events, except they are not propagated through callbacks - instead they are returned from function calls. SAX pull is sort of like a generator (in the Icon or Python sense) of SAX events. That is very similar in model to the behavior of command ports (like database ports). |
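The pull idea described above can be sketched in a few lines of REBOL 2. This is only an illustration, not Maarten's or Gavin's parser: next-event is a hypothetical name, the input is assumed to be well formed, and the tag text is returned verbatim (attributes and a trailing "/" on self-closing tags are not interpreted). The consumer asks for one event at a time instead of registering callbacks.

```rebol
; next-event is a hypothetical helper, not part of any existing script.
; It returns a block [type value rest], or none when the input is exhausted.
; Assumes well-formed input: every "<" has a matching ">".
next-event: func [pos [string!] /local tag text rest] [
    if tail? pos [return none]
    either #"<" = first pos [
        ; A tag: copy everything between "<" and ">", remember what follows.
        parse/all pos ["<" copy tag to ">" skip rest: to end]
        either #"/" = first tag [
            reduce ['end copy next tag rest]
        ][
            reduce ['start tag rest]
        ]
    ][
        ; Character data up to the next tag (or to the end of the input).
        parse/all pos [copy text [to "<" | to end] rest: to end]
        reduce ['text text rest]
    ]
]

; The caller drives the loop and pulls events one by one.
pos: {<a>hi<b>x</b></a>}
while [ev: next-event pos] [
    print [ev/1 mold ev/2]
    pos: ev/3
]
```

Because the position is passed in and handed back, the function keeps no state of its own; a port-based version would presumably hold that position inside the port instead.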
Pekr 4-Jan-2010 [866] | I like the SAX model, because IIRC it allows you to work on things in a "streamed" way, whereas DOM requires you to load everything in memory? Sorry if I oversimplified it :-) IIRC Doc used such an approach in his PostgreSQL driver, as opposed to his MySQL one ... |
Dockimbel 4-Jan-2010 [867] | It's a matter of tradeoff: if you only need fast XML document reading, SAX is the winner. If you need to modify the document, you need DOM (with or without SAX). |
james_nak 11-Oct-2010 [868] | Does anyone know if there is a rebol object to xml script? I've got xml to rebol objects but now I want to change it back to xml. (and I'm lazy) |
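For comparison, the callback model Janko asked about might look roughly like the sketch below (REBOL 2). The names sax-parse, on-start, on-end and on-text are made up for this example, and so is the invoice data; only a tiny XML subset is handled (no comments, CDATA, entities or attribute parsing, and the tag text is passed to the handler verbatim).

```rebol
; sax-parse and the handler names are hypothetical, chosen for this sketch.
not-lt: complement charset "<"
not-gt: complement charset ">"

sax-parse: func [xml [string!] handlers [object!] /local tag text] [
    parse/all xml [
        any [
            ; Closing tag must be tried before the generic "<" rule.
            "</" copy tag some not-gt ">" (handlers/on-end tag)
            | "<" copy tag some not-gt ">" (handlers/on-start tag)
            | copy text some not-lt (handlers/on-text text)
        ]
    ]
]

; Usage: collect only the text of <amount> elements, ignore everything else.
amounts: copy []
current: none
h: context [
    on-start: func [name] [current: name]
    on-end:   func [name] [current: none]
    on-text:  func [text] [if current = "amount" [append amounts text]]
]
sax-parse {<invoice><id>7</id><amount>12.50</amount><amount>3.00</amount></invoice>} h
probe amounts
```

Nothing but the two amount strings is kept, so memory use does not grow with the size of the document, which is the tradeoff described above.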
GrahamC 11-Oct-2010 [869] | Lazy evaluation is useful.. a lazy programmer not so! |
Maxim 12-Oct-2010 [870] | I use blocks. Although a bit slower to access, they are faster for big loads because they require less RAM and do not require binding, which is a big issue on large XML blocks. |
james_nak 12-Oct-2010 [871] | Yeah, what I am trying to do is convert back to XML after I've done my thing. |
Maxim 12-Oct-2010 [872x2] | well, going to xml is easy no? |
how are your objects structured? | |
james_nak 12-Oct-2010 [874] | Sorry for the delay. They are nested objects that represent the tags they were created from. I think the answer is that I will just have to create the routines to do what I wanted. I thought that perhaps there was something already out there. Thanks. |
Maxim 12-Oct-2010 [875] | might find some inspiration in the JSON converters ? |
james_nak 12-Oct-2010 [876] | Yes, if I run into any problems I will look into those. Thanks Maxim. That renote app is really cool, btw. |
Maxim 12-Oct-2010 [877] | thx it will improve about once a week. |
Oldes 13-Oct-2010 [878] | It depends on what your input is and how the output should look, but you can use something like this:
context [
    xml: copy ""
    tabs: copy ""
    set 'to-xml func [node /init] [
        if init [
            xml: copy ""
            tabs: copy ""
        ]
        switch/default type?/word node [
            object! [
                append tabs #"^-"
                foreach child next first node [
                    append xml rejoin [tabs "<" child ">^/"]
                    to-xml node/(child)
                    append xml rejoin [tabs "</" child ">^/"]
                ]
                remove tabs
            ]
        ][
            append xml rejoin [
                tabs "<" type? node ">" node "</" type? node ">^/"
            ]
        ]
        xml
    ]
]
o: context [
    person: context [
        name: "bla"
        age: 1
    ]
]
print rejoin ["<o>^/" to-xml o "</o>"]
|
james_nak 13-Oct-2010 [879] | Thanks Oldes. |
GrahamC 13-Oct-2010 [880x3] | this is something I wrote a couple of years back ... maybe it will help
obj2xml: func [obj [object!] out [string!] /local o] [
    foreach element next first obj [
        if all [
            not function? o: get in obj element
            o
        ] [
            ; not a none tag
            repend out [to-tag element]
            either object? o [
                obj2xml o out
            ] [
                repend out any [o copy ""]
            ]
            repend out [to-tag join "/" element]
        ]
    ]
    out
]
|
If there's a function in the object, it drops it. | |
posted last year to this group! http://www.rebol.org/aga-display-posts.r?post=r3wp323x568 | |
james_nak 14-Oct-2010 [883] | Thanks Graham. With the way that the programs I am using create the objects, I'm finding that I have to create something pretty specific. I appreciate the thought and code though. |
GrahamC 3-Nov-2010 [884] | Is John's the only rebol utility that turns a rebol representation back into an xml document with attributes? |
Maxim 3-Nov-2010 [885] | check your PMs... (it might not have turned red.... AltME bug) |
Maxim 10-Nov-2010 [886x2] | A question for XML users related to namespaces: is it possible for a tag's attributes to originate from two different namespaces? ex: <tag ns1:attr="data" ns2:other-attr="data"> or even worse: <tag ns1:attr="data" ns2:attr="data"> my gut tells me no, but I've been wrong before in this delightful world of XML spec overcomplexification. |
FYI, I've just discovered that yes... you can have the same attribute several times in a tag so long as the namespace is different. XML is ... so ... much ... fun.... NOT! | |
Gregg 10-Nov-2010 [888] | I don't like XML, but it makes sense that namespaces prevent collisions. |
Maxim 10-Nov-2010 [889] | yes, it's just pretty complicated to manage two attributes of a tag which come from different namespaces... so in the end, what does the attribute really mean? |
Gregg 10-Nov-2010 [890] | Well, if the words had bindings...oh wait, wrong language. ;-) |
Maxim 10-Nov-2010 [891] | hehe, yes it is similar to advanced word usage in REBOL. but xml isn't really a language in my understanding (interpretation) of the word. |
Steeve 10-Nov-2010 [892] | just a dialect with a bad messy syntax... |
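On the namespace question above: an attribute is identified by the pair of namespace URI and local name, so ns1:attr and ns2:attr are distinct as long as the two prefixes map to different URIs. A small REBOL 2 sketch of that expansion follows; expand-name is a hypothetical helper (not part of parse-xml), the prefix table and its URIs are invented for the example, and it relies on the REBOL 2 behaviour of parse/all splitting a string on a delimiter.

```rebol
; expand-name is a hypothetical helper; the URIs below are made-up examples.
prefixes: ["ns1" "http://example.com/one" "ns2" "http://example.com/two"]

expand-name: func [qname [string!] /local parts] [
    ; REBOL 2: parse/all with a string rule splits on that delimiter.
    parts: parse/all qname ":"
    either 2 = length? parts [
        ; Prefixed attribute: pair the namespace URI with the local name.
        reduce [select prefixes first parts second parts]
    ][
        ; Unprefixed attributes belong to no namespace.
        reduce [none qname]
    ]
]

probe expand-name "ns1:attr"
probe expand-name "ns2:attr"
probe expand-name "attr"
```

Comparing the expanded pairs rather than the raw names is one way to decide whether two attributes really collide.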
Oldes 13-Nov-2010 [893x3] | I just created this function to convert the data tree returned from REBOL's default parse-xml function back to the same string:
context [
    out: copy ""
    emitxml: func [dom] [
        foreach node dom [
            either string? node [
                out: insert out node
            ][
                foreach [name atts content] node [
                    out: insert out join {<} [name #" "]
                    if atts [
                        foreach [att val] atts [
                            out: insert out ajoin [att {="} any [val ""] {" }]
                        ]
                    ]
                    out: remove back out
                    either all [content not empty? content] [
                        out: insert out #">"
                        emitxml content
                        out: insert out ajoin ["</" name #">"]
                    ][
                        out: insert out "/>"
                    ]
                ]
            ]
        ]
    ]
    set 'xmltree-to-str func [dom] [
        clear head out
        emitxml dom
        head out
    ]
]
|
>> xmltree-to-str third parse-xml {<test arg="1"><bla/>hello</test>}
== {<test arg="1"><bla/>hello</test>}
| |
I'm not sure if the naming is correct, but I don't care, I need the functionality. | |
GrahamC 13-Nov-2010 [896] | Does it handle name spaces? |
Oldes 14-Nov-2010 [897x3] | do you mean this?:
>> print xmltree-to-str third parse-xml {<h:table xmlns:h="http://www.w3.org/TR/html4/">
{ <h:tr>
{ <h:td>Apples</h:td>
{ <h:td>Bananas</h:td>
{ </h:tr>
{ </h:table>}
<h:table xmlns:h="http://www.w3.org/TR/html4/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
|
But REBOL's default parse-xml has limitations, so better use Gavin's http://www.rebol.org/view-script.r?script=xml-parse.r if you must parse some advanced XML docs. | |
Also it's probably possible to use parse instead of the recursive function call, but it's working for me so I will stay with this one. | |