World: r3wp
[Parse] Discussion of PARSE dialect
Maxim 12-Dec-2009 [4701] | unfortunately what you say isn't feasible, even if you can technically do it. who is going to program a parser to colorise code which is useful for only one application? it's actually going to take more time to write your color parser for each piece of code than to write the code itself :-P so bottom line, Graham doesn't like this syntax. any others care to comment? |
Graham 12-Dec-2009 [4702] | Max, just do what ever suits you. |
Maxim 12-Dec-2009 [4703] | I'm just trying to get a feel for what others think about the idea, and sharing a bit of a discovery at the same time, in case it helps others. the goal isn't to be popular or convince others... and sorry if my last line looked harsh, it wasn't meant to be. :-) I was just summarizing your reaction plainly and relaunching the question to be sure others realize I want a few opinions. |
Graham 12-Dec-2009 [4704] | it's not a syntax but a convention ... |
Maxim 12-Dec-2009 [4705] | true :-) |
PeterWood 12-Dec-2009 [4706] | any others care to comment? I'm afraid it looks very messy to me and reminded me of Perl for some reason. |
Maxim 12-Dec-2009 [4707x2] | yay, I've got the BNF grammar done... it's ripping through a C language BNF grammar definition... :-) now I've just got to make a parse rule emitter ... easy enough. |
(all in R3, but not using the newer parse stuff, because it's not required) | |
Maxim 13-Dec-2009 [4709] | the new parse rejection system is VERY cool. It can simplify the structure of some rules a lot :-) |
Gregg 13-Dec-2009 [4710] | For a long time I've added = to the end of my parse rules, and = to the beginning of parse variables. I think it matches the production rule grammar well, and also emulates set-word/get-word syntax. |
Maxim 13-Dec-2009 [4711x3] | I'll try that, it's a good variant, even better since then we clearly identify the 3 different parse constructs separately. |
I've used word= for other things before and I liked it. | |
finished the rewrite of the BNF parser... funny... there is more documentation & comments than code. | |
Maxim 14-Dec-2009 [4714] | one strange thing I realised is that most people who write BNF will order the alternatives in exactly the opposite way from what parse needs: they'll put the smallest pattern first, so that if it is applied in parse directly, it always short-circuits the rules following it. |
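The short-circuit effect Maxim describes can be sketched outside REBOL. The Python below is only a toy model of committed ordered choice (the first alternative that matches wins and the later ones are never tried); the function names and the `int`/`integer` alternatives are invented for illustration and are not taken from his grammar.

```python
def match_ordered(alternatives, text):
    """Return the first alternative that is a prefix of text, or None.

    Models ordered choice: later alternatives are only tried
    when every earlier one has failed.
    """
    for alt in alternatives:
        if text.startswith(alt):
            return alt
    return None


def parse_all(alternatives, text):
    """True if the chosen alternative consumes the whole input."""
    hit = match_ordered(alternatives, text)
    return hit is not None and len(hit) == len(text)


# Typical BNF ordering: shortest pattern first.
bnf_order = ["int", "integer"]
# Ordering suited to ordered choice: longest pattern first.
parse_order = ["integer", "int"]

print(parse_all(bnf_order, "integer"))    # "int" wins, "eger" is left over
print(parse_all(parse_order, "integer"))  # "integer" is tried first
```

With the shortest-first ordering the longer alternative is unreachable for any input that starts with the shorter one, which is why such rules have to be reordered by hand or by a smarter generator.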
Gregg 14-Dec-2009 [4715] | Yup. Different mindset. I just looked at your BNF compiler earlier. Good stuff. I did an ABNF-to-parse generator some time back. ABNF is used in a lot of IETF RFCs and such. |
Maxim 14-Dec-2009 [4716x2] | what is the difference? |
is ABNF == EBNF ? | |
Gregg 14-Dec-2009 [4718] | There are a lot of differences, unfortunately. It's not terrible, just different. It's not EBNF. http://en.wikipedia.org/wiki/Augmented_Backus%E2%80%93Naur_Form |
Maxim 14-Dec-2009 [4719] | that is nice, is your ABNF parser still accessible somewhere? it could improve the quality and ease of integrating the protocols into R3 IMO. ABNF also seems much more aligned to parse |
Gregg 14-Dec-2009 [4720] | Generating PARSE rules wasn't too hard. It is a nice fit. Same issue with existing grammars though, in that you have to fix some things up manually, or we have to make the generator smarter. I'll zap you what I have. Can't remember where I've posted it elsewhere. |
Maxim 14-Dec-2009 [4721] | sure. |
Maxim 15-Dec-2009 [4722] | I've been rewriting bnf generated parse rules (and often a bit cryptically) into proper parse ordered rules for 3 days now... <sigh> C is sooo complex for what it really does. I've discovered a few quite mind-boggling language capabilities... stuff like: char *( *(*var)() )[10]; it takes 7 steps to define what that really is and there are other "fun" examples which end up being interpretation nightmares, but look really simple. one thing is certain at this point... although I will be able to build a C to rebol converter with relative precision under specific goals, some of the crazy stuff will just have to be finished manually by humans. at least I rarely see such twisted C code in most of what I've been reading so far. |
BrianH 16-Dec-2009 [4723x3] | BNF is just a syntax form, with a *lot* of variation. The real difference that matters between Yacc and PARSE is the parsing model. Yacc implements an LR parser (or some variant thereof), and PARSE implements a variant of TDPL parsing (related to PEG), though more powerful and with a completely different syntax. How you structure the parse rules depends on the parsing model, not the syntax. For instance, LR parsers tend to do recursion rather than iteration, and when they recurse the recursive call tends to be on the left, with the distinguishing clause on the right. For PEG parsers, recursion goes the other way. This is not an error, this is a difference in parsing model. If you are translating from Yacc to PARSE, it's not just a syntax change. You have to reorganize the rules to match the new model. And watch out: Certain patterns are easier to express in some parsing models than in others. Some patterns aren't supported at all in some models, and so no amount of translation will help you. We chose the TDPL model for PARSE because it is more expressive than the LR model, so in theory you should be able to translate LR rules to PARSE with some topological twists (redoing the structure of the rules). However, there are patterns that you can express in PARSE that can't be translated to LR, even with topological changes. |
Unfortunately, the C grammar was designed with LR parsers in mind. | |
You might be better off translating a C grammar for a PEG or TDPL parser generator into PARSE - fewer topological shifts needed. | |
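The kind of reorganisation BrianH describes can be shown on a small case. A Yacc-style grammar would typically write a comma-separated list left-recursively, `list: list ',' item | item`. A top-down parser cannot call itself on the left without looping forever, so the same language is usually restated as `list <- item (',' item)*`. The Python below is a hand-written sketch of that restated rule; the token format and function names are assumptions made for the example, not anything from the C grammar under discussion.

```python
def parse_item(tokens, pos):
    """item: a single alphanumeric token. Returns the new position or None."""
    if pos < len(tokens) and tokens[pos].isalnum():
        return pos + 1
    return None


def parse_list(tokens, pos=0):
    """list <- item (',' item)*

    Top-down restatement of the left-recursive rule
        list: list ',' item | item
    using iteration instead of a recursive call on the left.
    Returns the position after the list, or None if no item is found.
    """
    pos = parse_item(tokens, pos)
    if pos is None:
        return None
    while pos < len(tokens) and tokens[pos] == ",":
        nxt = parse_item(tokens, pos + 1)
        if nxt is None:
            break  # a trailing comma is left unconsumed
        pos = nxt
    return pos


print(parse_list(["a", ",", "b", ",", "c"]))
```

Both forms accept the same token sequences; what changes is the shape of the rule, and with it the shape of any tree or action attached to it.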
Maxim 16-Dec-2009 [4726] | well, considering that I just finished the basic rule re-organisation... eheheh I think I'll apply the unit testing phase right now to test if all the rules perform as they should using input text. there is probably going to be about 100kb of unit test code for what is now about 12kb of parse rules. |
BrianH 16-Dec-2009 [4727x2] | Sounds about right. |
Are you sure you have enough test code/data? | |
Maxim 16-Dec-2009 [4729x2] | there are all in all only two or three rules whose transformation I'm unsure of, as some aspects of the C syntax are a bit obscure to represent. |
you are being sarcastic right? :-) | |
BrianH 16-Dec-2009 [4731x2] | No, really. The syntax of C is so complex that you would need a lot of data to test all of the common variations. |
data in this case being C source. | |
Maxim 16-Dec-2009 [4733] | my goal is to get the host code and OpenGL headers past the parsing phase. once that is done, I'll start work on adding the production phase. I still have to write the pre-processor, but that in fact is pretty straightforward. there are few rules and they are much more static and well defined on the MS web site. |
BrianH 16-Dec-2009 [4734] | Well, good luck! :) |
Maxim 16-Dec-2009 [4735] | the funny thing is that the C language reference on the MSDN is actually pretty well done... there are a lot of evil C examples for some of the more obscure parts of the language like pointers, structs and unions. funny thing is that some of the most complex things to express were the literal constants! integers, with octal, hex notation... not as simple as some [digits] ;-) |
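To give a sense of why integer literals are more than `some [digits]`, here is a rough Python sketch of a recogniser for standard C integer constants (decimal, octal, hexadecimal, with optional `u`/`l`/`ll` suffixes). It is an illustration only, not Maxim's rule set, and it deliberately ignores compiler-specific extensions; the names `C_INT` and `is_c_int` are made up for the example.

```python
import re

# Standard C integer constant: a prefix-dependent digit run plus an
# optional suffix. Mixed-case "lL" / "Ll" is not a valid suffix in C.
C_INT = re.compile(
    r"""
    (?: 0[xX][0-9a-fA-F]+      # hexadecimal: 0x / 0X then hex digits
      | 0[0-7]*                # octal, which also covers a plain 0
      | [1-9][0-9]*            # decimal: must not start with 0
    )
    (?: [uU](?:ll|LL|[lL])?    # u, ul, ull ...
      | (?:ll|LL|[lL])[uU]?    # l, ll, lu, llu ...
    )?
    """,
    re.VERBOSE,
)


def is_c_int(text):
    """True if text is, in its entirety, a C integer constant."""
    return C_INT.fullmatch(text) is not None


for sample in ["0x1F", "017", "42UL", "089", "0x", "7lL"]:
    print(sample, is_c_int(sample))
```

Note that the hexadecimal alternative is listed before the octal one. A regex engine backtracks, so the order is not critical here, but in an ordered-choice parser a leading `0` would otherwise be taken as a complete octal constant and the `x1F` would be left behind.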
Gabriele 16-Dec-2009 [4736] | Maxim, maybe you thought I was kidding the other day... ;) |
Maxim 16-Dec-2009 [4737] | hehe |
Henrik 24-Dec-2009 [4738] | Looking at the new WHILE keyword and I was quite baffled by Carl's use of it in his latest blog example. Then I read the docs and it didn't get much better: - WHILE is a variant of ANY - ANY stops, if input does not change - WHILE doesn't stop, even if input does not change What does "input does not change" mean? Is it about changing the parse series length during parse? Is it actively moving the parse index back or forth using special commands? Is it normal progression of parse index with each cycle of WHILE or ANY? Is it alteration of the parse series content while maintaining length during parse? |
Pekr 24-Dec-2009 [4739x4] | Henrik - according to docs explanation, 'parse contains some internal protection for the case, when input stream does not advance its position. In R2, following code causes infinite loop, in R3, it returns false: parse str [some [to "abc"]] (I am not sure I like that it returns false - normally I expect it to cause infinite loop. This is imo overprotecting programmer, and you have to think, why your code returns false anyway, which for me is the same, as if it would cause an infinite loop) Further from docs: To avoid infinite looping, a special internal rule is triggered based on the fact that the rule did not change the input position. However, this shows a problem with this rule: parse str [some [to "a" remove thru "b"]] Here the input did not appear to advance, but something useful happened. In such cases, the some word should not be used, and the while word is better: parse str [while [to "a" remove thru "b"]] |
I probably don't understand the usefulness of 'while at all. Because now I have to think about whether my code would cause an infinite loop or not, and use 'some or 'while accordingly ... | |
Running the above examples, my opinion is that adding 'while was probably not a good decision. I can understand that we now have more power - our code will not easily cause infinite loops, but otoh you now have to think about whether it can happen or not, and 'some becomes your enemy ... | |
I probably need more examples .. | |
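The difference Pekr quotes from the docs can be modelled in a few lines of Python. This is a toy simulation of the behaviour as described in this thread (a loop that stops when the position did not advance, versus one that keeps going), not R3's actual implementation; `strip_a_to_b`, `repeat`, and the `limit` safety cap are all invented for the demo.

```python
def strip_a_to_b(buf, pos):
    """Model of the rule [to "a" remove thru "b"] on a mutable list of chars.

    Returns the new position, or None if the rule fails.
    """
    text = "".join(buf)
    start = text.find("a", pos)
    if start < 0:
        return None
    end = text.find("b", start)
    if end < 0:
        return None
    del buf[start:end + 1]  # the input shrinks, the position stays at start
    return start


def repeat(rule, buf, stop_if_stuck, limit=100):
    """Loop a rule from position 0 and return the resulting text.

    stop_if_stuck=True models the SOME/ANY guard described above;
    stop_if_stuck=False models WHILE. limit only protects the demo.
    """
    pos = 0
    for _ in range(limit):
        new_pos = rule(buf, pos)
        if new_pos is None:
            break
        if stop_if_stuck and new_pos == pos:
            break  # position did not advance, so the guard ends the loop
        pos = new_pos
    return "".join(buf)


print(repr(repeat(strip_a_to_b, list("abab"), stop_if_stuck=True)))
print(repr(repeat(strip_a_to_b, list("abab"), stop_if_stuck=False)))
```

On the input `abab` the guarded loop removes the first `ab` and then stops, because the position is still 0 even though the input changed. The unguarded loop removes both and ends only when the rule itself fails. That is the case where the position stays put but useful work was done.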
Ladislav 25-Dec-2009 [4743x3] | The WHILE keyword is the simplest possible cycle. The rule: a: [while b] is equivalent to recursive: a: [b a] |
sorry, I meant a: [b a |] | |
More complicated rules can be easily simulated using the While keyword, the opposite isn't true. Carl's example just proves, why While is useful. | |
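Ladislav's equivalence between `a: [while b]` and the recursive `a: [b a |]` can be checked with a small Python model. Rules are modelled as functions that return a new position or `None`; the `digit` rule and both function names are assumptions made for the illustration.

```python
def digit(text, pos):
    """Example rule b: match one decimal digit."""
    if pos < len(text) and text[pos].isdigit():
        return pos + 1
    return None


def while_loop(b, text, pos=0):
    """Model of a: [while b], which applies b until it fails."""
    while True:
        nxt = b(text, pos)
        if nxt is None:
            return pos
        pos = nxt


def while_recursive(b, text, pos=0):
    """Model of a: [b a |], which tries b then a, else matches nothing."""
    nxt = b(text, pos)
    if nxt is None:
        return pos  # the empty alternative after the bar
    return while_recursive(b, text, nxt)


print(while_loop(digit, "123abc"), while_recursive(digit, "123abc"))
```

Both versions end at the same position and neither checks whether `b` advanced, so a rule that succeeds without consuming input would loop (or recurse) without end in this model. The guarded SOME/ANY behaviour is the extra check layered on top.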
Fork 28-Dec-2009 [4746x3] | >> parse [1 2 3] [?? thru [integer! string!] ?? integer!] thte: [1 2 3] integer!: [2 3] == false |
What's that "thte" thing? | |
?? not initialized after first match? And secondly, how do I match thru a series of things (e.g. integer! integer!, but just wondering about the thte. ?? problem before the first match?) | |
Pekr 28-Dec-2009 [4749x2] | >> parse [1 2 3][?? thru [integer! string!] ?? integer!] thru: [1 2 3] integer!: [2 3] == false |
what do you mean by "match thru a series of things"? | |