World: r3wp
[Parse] Discussion of PARSE dialect
Geomol 1-May-2011 [5769x2] | I think, I'll make those errors in my function version of parse. Almost done (in the first version similar to R2 version) |
Kinda same problem, when combining them this way (still in R2): >> parse [] [thru opt 'a] == false >> parse [a] [thru opt 'a] == false | |
BrianH 1-May-2011 [5771] | In R3 that would be an improperly untriggered error, since TO and THRU are defined to not take the full gamut of rules, only a subset. Probably the same for R2, but a different subset. |
Geomol 1-May-2011 [5772x3] | Ok, version 1.0.0 of BPARSE is found here: http://www.fys.ku.dk/~niclasen/rebol/libs/bparse.r It's a function version of PARSE, and can only parse blocks for now, not strings. It can do more or less everything PARSE in R2 can do when parsing blocks. I've tried to trigger errors where R2 PARSE doesn't. The purpose is to play around with parsing to maybe make a better version than the native version, and without bugs. |
It's not as fast as the timings I gave here earlier with a very early version. | |
I've thought some more about [thru end], which returns false in the R2 version but true in R3. My version returns false, like R2, but I understand the R3 way better now that I've programmed it. It comes down to how THRU should be understood (Ladislav also said something about this): do we think of [thru rule] as [to rule rule] or as [to rule skip]? If the TO keyword can handle complex rules like: parse [a b] [to ['a 'b] ['a 'b]] then the first might make better sense, and [thru end] should return true. But we can't parse like that in R2, so maybe we think of it more as the second, and then [thru end] should return false. But if you look in my version, I have to catch this special case, where END follows THRU, so it takes more code, which isn't good. In any case, Ladislav's suggestion to use [end skip] as a fail rule is much better. If you're not at the end, the first word (end) will give false; otherwise the next will fail, as you can't skip past end. | |
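The two readings of THRU above can be compared in a small model. This is only a sketch in Python, not BPARSE or the native PARSE: a rule is assumed to be a function from (items, position) to a new position or None, and every name here (lit, end, skip, to, seq, parse, thru_as_to_rule_rule, thru_as_to_rule_skip) is made up for illustration.

```python
from typing import Callable, Optional, Sequence

# Assumed model: a rule returns the new position, or None on failure.
Rule = Callable[[Sequence, int], Optional[int]]

def lit(value) -> Rule:
    """Match one element equal to value."""
    def rule(items, pos):
        return pos + 1 if pos < len(items) and items[pos] == value else None
    return rule

def end(items, pos):
    """Match only at the tail; consumes nothing."""
    return pos if pos == len(items) else None

def skip(items, pos):
    """Consume exactly one element; fails at the tail."""
    return pos + 1 if pos < len(items) else None

def to(rule: Rule) -> Rule:
    """Advance to the first position where rule matches, without consuming the match."""
    def scan(items, pos):
        for p in range(pos, len(items) + 1):
            if rule(items, p) is not None:
                return p
        return None
    return scan

def seq(*rules: Rule) -> Rule:
    """Apply the rules one after another; fail as soon as one fails."""
    def run(items, pos):
        for r in rules:
            pos = r(items, pos)
            if pos is None:
                return None
        return pos
    return run

def parse(items, rule) -> bool:
    """True only if the rule matches and the position ends at the tail."""
    return rule(items, 0) == len(items)

def thru_as_to_rule_rule(rule: Rule) -> Rule:
    return seq(to(rule), rule)   # reading 1: [to rule rule]

def thru_as_to_rule_skip(rule: Rule) -> Rule:
    return seq(to(rule), skip)   # reading 2: [to rule skip]

fail = seq(end, skip)            # the [end skip] fail rule

print(parse(['a', 'b'], thru_as_to_rule_rule(end)))  # reading 1 of [thru end]
print(parse(['a', 'b'], thru_as_to_rule_skip(end)))  # reading 2 of [thru end]
```

In this model the two readings agree for a one-element rule such as lit('b') and differ only for END, because END consumes nothing and SKIP cannot move past the tail. That is also why [end skip] can never succeed.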
BrianH 1-May-2011 [5775] | END is a zero-length repeatable rule, like NONE, so TO END and THRU END should be equivalent. |
Geomol 1-May-2011 [5776] | Makes sense, it's just hard to grasp, when used to how R2 parse works. |
BrianH 1-May-2011 [5777] | I'd consider that an error in R2's PARSE, but not a fixable one because it would change the semantics. |
Geomol 1-May-2011 [5778] | Now you have your own function version of parse that you can make work exactly as you wish. :-) And then maybe, when you're satisfied, give it to Carl. It should now also be easier to make C versions of parse for those who make alternatives to REBOL. At least you have a REBOL function to start with. |
Maxim 1-May-2011 [5779] | did you try it with complex rules? |
Geomol 1-May-2011 [5780] | Yes, parsing a dialect I have to produce PDF output. |
Maxim 1-May-2011 [5781] | wow, cool. |
Geomol 1-May-2011 [5782] | Tried it on rebps2pdf.r and the example found here: http://www.fys.ku.dk/~niclasen/postscript/ |
BrianH 1-May-2011 [5783] | Having an R2-compatible PARSE that you can run in R3 would be useful for large sets of parse rules that you haven't had the time to migrate yet. |
Geomol 1-May-2011 [5784] | Ah yes, good idea. Haven't thought about that yet. |
Maxim 1-May-2011 [5785] | did you start work on a string parser? |
Geomol 1-May-2011 [5786] | nope |
Maxim 1-May-2011 [5787] | oki. |
Geomol 1-May-2011 [5788] | Some day probably. Let's see, how it goes with bparse first. |
BrianH 1-May-2011 [5789] | It would also be useful to have an R3-compatible PARSE for R2. And both for Red. |
Maxim 1-May-2011 [5790x2] | bah, I'd just stick with R3 parsing for Red. it'll be a good incentive for some to upgrade. |
(to red or R3 parse, depending on how you see "upgrade" ;-) | |
Geomol 1-May-2011 [5792] | I think about downgrading. :-) You know, keep it simple. Like dropping SKIP as it's the same as any-type! etc. If I want SKIP, I can just define it then: skip: :any-type! |
Maxim 1-May-2011 [5793] | I'd drop any-type! :-) |
Geomol 1-May-2011 [5794] | Having skip as a keyword means you can't use that word as a variable. |
BrianH 1-May-2011 [5795] | That doesn't work with string parsing. |
Geomol 1-May-2011 [5796] | ok |
BrianH 1-May-2011 [5797] | Most people tend to not use 'skip as a variable anyways, because of the SKIP function. |
Geomol 1-May-2011 [5798] | In general I very much like the idea that many REBOL functions can take different datatypes and work anyway. But I was wondering whether parsing blocks and parsing strings are so different that there should be two functions? |
Maxim 1-May-2011 [5799x2] | and I always prefix my rules to have them stand out from keywords. |
nah, it would just use up another word. there is no ambiguity in the case of parse, unlike, let's say, ADD, where the same datatype may mean two things. | |
BrianH 1-May-2011 [5801x2] | For the mezzanine version, two functions might be better, though they can share code in the same module. Maybe just have one exported word for a dispatch function though. |
(or the context equivalent of modules for R2) | |
Geomol 1-May-2011 [5803x2] | yes |
When programming it, I also wondered, why the or keyword is | and not OR. Do you know the reason? | |
BrianH 1-May-2011 [5805] | Parsing tradition. And it's not really OR, it's backtracking alternation. |
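A small model may help show why | is backtracking alternation and not a logical OR. This is a sketch in Python, not PARSE itself: rules are assumed to be functions from (items, position) to a new position or None, and lit, seq, alt and parse are hypothetical helper names.

```python
def lit(value):
    """Match one element equal to value."""
    def rule(items, pos):
        return pos + 1 if pos < len(items) and items[pos] == value else None
    return rule

def seq(*rules):
    """Apply the rules one after another; fail as soon as one fails."""
    def run(items, pos):
        for r in rules:
            pos = r(items, pos)
            if pos is None:
                return None
        return pos
    return run

def alt(*rules):
    """Ordered alternatives: each one restarts from the same position."""
    def run(items, pos):
        for r in rules:
            result = r(items, pos)   # a failed alternative leaves pos untouched
            if result is not None:
                return result        # the first alternative that matches wins
        return None
    return run

def parse(items, rule):
    """True only if the rule matches and the position ends at the tail."""
    return rule(items, 0) == len(items)

# Models ['a 'b | 'a 'c] on [a c]: the first alternative consumes 'a,
# fails on 'b, and the second alternative starts again from the head.
print(parse(['a', 'c'], alt(seq(lit('a'), lit('b')), seq(lit('a'), lit('c')))))

# Order matters, which a logical OR would not allow:
print(parse(['a', 'b'], alt(lit('a'), seq(lit('a'), lit('b')))))
print(parse(['a', 'b'], alt(seq(lit('a'), lit('b')), lit('a'))))
```

In this model the second call fails because the first alternative already matched and left 'b unconsumed, while the third call succeeds with the alternatives swapped.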
Geomol 1-May-2011 [5806] | Right, I just wondered, since REBOL calls e.g. floats decimals etc., with many attempts to make the language more humane. |
BrianH 1-May-2011 [5807] | Considering that the space character is the closest thing to AND if | is OR, we should consider ourselves to have gotten off lucky :) |
Geomol 1-May-2011 [5808] | parse [a b c] ['aAND'bAND'cEND] hmm, yeah, you've got a point. |
BrianH 1-May-2011 [5809] | We used up that luck though when we called the lookahead-match operation AND, and the lookahead-non-match operation NOT. |
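The lookahead-match and lookahead-non-match operations mentioned here can be modelled as rules that test the input without moving the position. This is a sketch in Python under the same assumed model as before (a rule maps (items, position) to a new position or None); ahead and not_ahead are hypothetical names standing in for the AND and NOT keywords.

```python
def lit(value):
    """Match one element equal to value."""
    def rule(items, pos):
        return pos + 1 if pos < len(items) and items[pos] == value else None
    return rule

def skip(items, pos):
    """Consume exactly one element; fails at the tail."""
    return pos + 1 if pos < len(items) else None

def seq(*rules):
    """Apply the rules one after another; fail as soon as one fails."""
    def run(items, pos):
        for r in rules:
            pos = r(items, pos)
            if pos is None:
                return None
        return pos
    return run

def ahead(rule):
    """Lookahead match: succeed if rule matches here, but consume nothing."""
    def run(items, pos):
        return pos if rule(items, pos) is not None else None
    return run

def not_ahead(rule):
    """Lookahead non-match: succeed if rule fails here, and consume nothing."""
    def run(items, pos):
        return pos if rule(items, pos) is None else None
    return run

def parse(items, rule):
    """True only if the rule matches and the position ends at the tail."""
    return rule(items, 0) == len(items)

print(parse(['a'], seq(ahead(lit('a')), skip)))      # peek at 'a, then consume it
print(parse(['b'], seq(not_ahead(lit('a')), skip)))  # anything except 'a
print(parse(['a'], seq(not_ahead(lit('a')), skip)))  # rejected by the lookahead
```

Neither operation combines two results the way a boolean AND or NOT would, which may be part of why the naming is called unlucky above.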
Geomol 1-May-2011 [5810] | & and ! maybe? |
BrianH 1-May-2011 [5811] | We're probably fine with the wording we got. Though strangely enough, | is the ELSE of the IF operation. ELSE is a more descriptive name for | than OR in general. |
Ladislav 1-May-2011 [5812] | Geomol: [to rule skip] does not mean the same as [thru rule], as can be demonstrated by comparing the behaviour of thru rule for rule = "abc". It is quite a surprise to me that you don't see the difference. |
Geomol 2-May-2011 [5813] | In R2 parsing a block: >> parse ["abc"] [to "abc" skip] == true >> parse ["abc"] [thru "abc"] == true I know it's different when parsing a string instead of a block. My comparison of [thru rule] to the alternatives was meant as a loose comparison, not to be taken literally. So it's easy to think of THRU as working this way, because it does in many cases, hence the confusion. |
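The block versus string difference can be illustrated with a small model. This is a Python sketch, not PARSE: it assumes that in a block the target "abc" occupies one position, while in a string it occupies three characters, and find, to_then_skip and thru are hypothetical helper names.

```python
def find(series, target, pos=0):
    """Return (start, stop) of the first match of target at or after pos, else None."""
    is_string = isinstance(series, str)
    width = len(target) if is_string else 1     # a block element is one position wide
    wanted = target if is_string else [target]
    for p in range(pos, len(series) - width + 1):
        if series[p:p + width] == wanted:
            return p, p + width
    return None

def to_then_skip(series, target):
    """Model of [to target skip]: stop before the match, then step one position."""
    hit = find(series, target)
    if hit is None:
        return False
    start, _ = hit
    return start + 1 == len(series)             # did we end at the tail?

def thru(series, target):
    """Model of [thru target]: stop just after the whole match."""
    hit = find(series, target)
    if hit is None:
        return False
    _, stop = hit
    return stop == len(series)                  # did we end at the tail?

print(to_then_skip(["abc"], "abc"), thru(["abc"], "abc"))  # block input
print(to_then_skip("abc", "abc"), thru("abc", "abc"))      # string input
```

For the block both models end at the tail, matching the two R2 results quoted above. For the string, SKIP steps over only one character of the three, so the [to rule skip] model stops short of the tail.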
Ladislav 2-May-2011 [5814] | because it does in many cases - should rather be "because THRU is so limited, that it is unable to handle many cases" |
Geomol 2-May-2011 [5815] | yeah :) |
Ladislav 2-May-2011 [5816] | But, the recursive description: a: [b | skip a] is quite natural. |
Geomol 2-May-2011 [5817] | Yes, and that should work in all cases, if the b rule is found, complex or not. And this will return true if b is END, because END is a repeatable rule (you can't go past it with SKIP). NONE is also repeatable, and if you look in the code, I have to take care of this separately too. This means we can't parse a none value (datatype none!) by using the NONE keyword, but we can by using a datatype: >> parse reduce [none] [none] == false >> parse reduce [none] [none!] == true So it raises the question of whether the NONE keyword should be there. What are the consequences if we drop NONE as a keyword? And are there other repeatable rules besides END and NONE, in R2 or R3? |
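The recursive description a: [b | skip a] and the behaviour of the zero-length rules can be sketched in a small model. This is Python, not BPARSE: a rule is assumed to map (items, position) to a new position or None, and thru, none_keyword and none_type are hypothetical names, with Python's None standing in for a REBOL none value.

```python
def lit(value):
    """Match one element equal to value."""
    def rule(items, pos):
        return pos + 1 if pos < len(items) and items[pos] == value else None
    return rule

def seq(*rules):
    """Apply the rules one after another; fail as soon as one fails."""
    def run(items, pos):
        for r in rules:
            pos = r(items, pos)
            if pos is None:
                return None
        return pos
    return run

def end(items, pos):
    """Zero-length rule: matches only at the tail."""
    return pos if pos == len(items) else None

def none_keyword(items, pos):
    """Zero-length rule: always matches and consumes nothing."""
    return pos

def none_type(items, pos):
    """Datatype-style rule: consumes one element if it is a none value."""
    return pos + 1 if pos < len(items) and items[pos] is None else None

def thru(b):
    """Model of a: [b | skip a]."""
    def a(items, pos):
        hit = b(items, pos)           # first alternative: b
        if hit is not None:
            return hit
        if pos < len(items):          # second alternative: skip a
            return a(items, pos + 1)
        return None                   # skip fails at the tail
    return a

def parse(items, rule):
    """True only if the rule matches and the position ends at the tail."""
    return rule(items, 0) == len(items)

print(parse(['p', 'q'], thru(end)))                          # b is END
print(parse(['p', 'a', 'b'], thru(seq(lit('a'), lit('b')))))  # b is a complex rule
print(parse([None], none_keyword))                           # keyword consumes nothing
print(parse([None], none_type))                              # datatype consumes the value
```

In this model the recursive form needs no special case for END: the recursion walks to the tail and END matches there. The last two calls mirror the two console results quoted above.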
Ladislav 2-May-2011 [5818] | The "empty string rule" (represented by the NONE keyword in REBOL) is absolutely necessary to have. All other members of the Top Down Parsing Language family have it as well. |