World: r3wp
[Parse] Discussion of PARSE dialect
Pekr 5-Dec-2006 [1551] | Just asking, because today I read a bit about ODF and OpenXML (two document formats for office apps). There is probably open space for small apps, parsing some info from inside the documents etc. (meta-data programming) ... just curious ... or will it be better to wait for full-spec XML MLs libs, doing the job given, and link to those libraries? |
BrianH 5-Dec-2006 [1552] | Such a thing has been on my todo list for a while, but I've been a little busy lately with non-REBOL projects :( |
Gregg 5-Dec-2006 [1553] | I don't want to deal with XML beyond simple well-formed XML, too complex. I don't, personally, have any interest in doing generic XML toolkit stuff at this point. I can see value in it for some people, but I'd rather write REBOL dialects. :-) |
Maxim 8-Dec-2006 [1554x2] | geomol's xml2rebxml handles XML pretty well. one might want to change the parse rules a little to adapt the output, but it actually loads all the xml tags, empty tags and attributes. it even handles utf-8, CDATA chunks, and converts some of the & chars. |
I am using an adapted form of it commercially so far. I have implemented full schema validation and loading (in rebol) but it's proprietary code I can't release. So guys, it can be done ! | |
Allen 10-Dec-2006 [1556] | I'm starting to see some abandonment of XML in favour of JSON .. mainly in web 2.0 . but it will not replace xml where validation is required. |
BrianH 11-Dec-2006 [1557] | You really have to trust your source when using JSON to a browser though. Standard usage is to load with eval - only safe to use on https sites because of script injection. |
[unknown: 9] 11-Dec-2006 [1558] | XML and JSON sucks... |
Maxim 11-Dec-2006 [1559] | is there a way to make block parsing case sensitive? this doesn't seem to work: parse/case [A a] [some ['A (print "upper") | 'a (print "lower")]] |
Gabriele 11-Dec-2006 [1560x2] | words are not case sensitive. |
>> strict-equal? 'A 'a
== true | |
Maxim 11-Dec-2006 [1562x3] | I was just hoping case could have been an exception... it would be very useful especially when parsing code from other languages... |
(I meant using /case within parse) | |
well, seems like I'll be doing string parsing then :-) | |
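The string-parsing fallback Maxim mentions can be sketched as follows. This is a minimal illustration, assuming REBOL 2 semantics where `parse/case` makes string matching case sensitive; the function name `classify` and the words `upper` and `lower` are hypothetical, and `/all` is added here so that the space is handled explicitly by the rule.

```rebol
REBOL []

; Hypothetical helper: records whether each letter was upper or lower case.
; /all disables implicit whitespace handling, /case makes "A" and "a" distinct.
classify: func [text [string!] /local out] [
    out: copy []
    parse/all/case text [
        some [
            "A" (append out 'upper)
            | "a" (append out 'lower)
            | " "
        ]
    ]
    out
]

probe classify "A a"
```

Under these assumptions the two letters should be reported separately, which is the distinction that block parsing of words cannot make.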
Gabriele 11-Dec-2006 [1565x3] | you could take advantage of this bug: |
>> alias 'a "aa"
== aa
>> strict-equal? 'A 'a
== false | |
but it will be fixed eventually :P | |
Maxim 11-Dec-2006 [1568x2] | hehe... I would not want the bug to get too comfortable, lest it becomes a feature ;-) |
you know what they say... "features are bugs with experience" | |
Josh 11-Dec-2006 [1570x2] | I don't know |
Whoops | |
Joe 24-Dec-2006 [1572x4] |
s: "str"
s2: "str 1^/ str 2 ^/ str 3"
rules: [
    any [
        end break
        | copy value [to "^/" | to end] (print value)
    ]
]
parse s rules
print "---"
parse s2 rules |
I ran the above on Core 2.6 and it loops forever. This was a bug fixed in 2.3, but it looks like the bug still exists. | |
sorry, not a bug. I was inspired by the example in the changes page and it is missing the thru "^/" after the to "^/" | |
parse item [ any [ "word" (print "got word") | copy value [to "abc" | to end] (print value) break ] ] | |
Gabriele 25-Dec-2006 [1576x2] | not a bug - you are not skipping the newline, so to "^/" will always match. you are not getting to the end. |
>> rules: [
[    any [
[        end break
[        |
[        copy value [to newline | to end] (print value) opt skip
[    ]
[ ]
== [
    any [
        end break
        | copy value [to newline | to end] (print value) opt skip
    ]
]
>> parse s2 rules
str 1
str 2
str 3
== true | |
Joe 25-Dec-2006 [1578x2] | yes, thanks gabriele - happy holidays ! i find the opt skip not very intuitive ! |
wouldn't to newline thru newline be easier to understand than opt skip? | |
Volker 25-Dec-2006 [1580] | could be opt newline |
Gabriele 26-Dec-2006 [1581x5] | joe, if you don't care about parse returning true you can just use skip (without opt, which is there for the end case) |
also, if you don't care about your value having the newline in it, you can just replace to newline with thru newline. | |
another possibility (but gives more maintenance problems) is to split the copy rule into two, one for newline and one for end. | |
copy value to newline skip | |
copy value to end | |
Ladislav 27-Dec-2006 [1586x2] | Joe: another option is to use: rules: [ any [ copy value [to newline | to end] (print value) skip ] to end ] |
but, as Gabriele said, that is equivalent to: rules: [ any [ copy value [to newline | to end] (print value) skip ] ] if you ignore the parse result | |
BrianH 27-Dec-2006 [1588x2] | to end skip will always fail. move the skip after the to newline. |
Nevermind, failing isn't a problem here. | |
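The fix discussed above can be put together as a small script. This is a sketch only, assuming REBOL 2 behaviour where `to newline` stops in front of the newline without consuming it; the block `lines` is a name introduced here for illustration, and `parse/all` is used so that spaces are kept in the copied values.

```rebol
REBOL []

s2: "str 1^/ str 2 ^/ str 3"

; Hypothetical accumulator for the copied lines.
lines: copy []

rules: [
    any [
        ; Stop the loop once the input is exhausted.
        end break
        ; TO leaves the position in front of the newline, so OPT SKIP
        ; steps over it. OPT keeps the rule from failing on the last
        ; line, where there is nothing left to skip.
        | copy value [to newline | to end] (append lines value) opt skip
    ]
]

probe parse/all s2 rules
probe lines
```

Without the `opt skip`, `to newline` would keep matching at the same position and the `any` loop would never advance, which is the endless loop Joe ran into.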
Ladislav 28-Dec-2006 [1590] | another possibility (but gives more maintenance problems) is to split the copy rule into two, one for newline and one for end. - I am curious whether this isn't actually better when the maintenance is taken into account - suppose e.g. that we want to add yet another alternative... |
Gabriele 28-Dec-2006 [1591x2] | lad, maybe, but if you change the name of the variable to copy to you have then to change it twice in the rule. |
generally, i'd prefer [copy value [rule1 | rule2]] to [copy value rule1 | copy value rule2], however it is not always that easy, so many times you have to do the latter. | |
Anton 28-Dec-2006 [1593] | I agree. |
Maxim 28-Dec-2006 [1594x2] | hi, yesterday I realized I have a 1400 line single parse ruleset which amounts to ~40kb of code ! :-) I was wondering what are your largest Parse rulesets, I'm just curious at how people here are pushing REBOL into extremes. I might also say that parse is wildly efficient in this case, allowing my server to decipher 600 bytes of binary data through all of this huge parse rule within 0.01 - 0.02 seconds (spitting out a nested rebol block tree of rebxml2xml ready data). |
to anyone not yet accustomed to 'PARSE, really do take the time to look it through and use it. | |
Pekr 28-Dec-2006 [1596] | when working so extensively with Parse, you might try to write down your enhancement/fix proposals and submit them to the R3 team :-) |
Maxim 28-Dec-2006 [1597x4] | The one real limitation I always saw which makes some rules harder to write for nothing is the first-of-any search (to and thru) |
and THE MOST PROFOUND limitation... why the hell isn't 'NOT within the dialect? | |
50% of the time it's easier to match something which is not and then within that choice select something which is. | |
like bounds checking, making sure some items are not within a specific area, etc. | |
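Until the dialect gains such a keyword, a negative match can be approximated with ordinary rules. The sketch below shows one workaround, assuming REBOL 2 parse semantics; it is not a built-in feature, and the names `not-digit`, `next-rule` and `token` are hypothetical.

```rebol
REBOL []

digit: charset "0123456789"
alnum: charset [#"a" - #"z" #"0" - #"9"]

; [end skip] can never match, so it acts as an always-failing rule.
; An empty block always matches without consuming input.
; If DIGIT matches, the follow-up rule is set to the failing rule and
; the whole NOT-DIGIT rule fails. Otherwise the follow-up rule is the
; empty block and parsing continues from the original position.
not-digit: [[digit (next-rule: [end skip]) | (next-rule: [])] next-rule]

; A token made of letters and digits that must not start with a digit.
token: [not-digit some alnum]

probe parse/all "abc1" token
probe parse/all "1abc" token
```

Under these assumptions the first input should be accepted and the second rejected, which is the "match what is not, then select what is" pattern described above.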