World: r3wp
[Parse] Discussion of PARSE dialect
Volker 27-Jun-2006 [1020] | Late, but got it. It would enclose "ABCDEF" but should ignore it because of the small letters. |
BrianH 27-Jun-2006 [1021x2] | Yup. Parse is fun. |
You can drop one charset by changing [non-alpha | end] to [alpha end skip | end | none] . | |
Volker 27-Jun-2006 [1023] | would alpha break work? |
BrianH 27-Jun-2006 [1024] | No, that would break out of the enclosing all loop. The end skip will always fail and proceed to the next alternate. |
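Why `end skip` can never succeed can be sketched with a few toy matchers. This is Python rather than REBOL, and the helper names (`end`, `skip`, `none`, `seq`, `alt`) are illustrative stand-ins for the PARSE keywords, not an actual REBOL API. Each rule takes a string and a position and returns the new position, or `None` on failure.

```python
from typing import Callable, Optional

# A rule maps (text, position) to a new position, or None when it fails.
Rule = Callable[[str, int], Optional[int]]

def end(s: str, i: int) -> Optional[int]:
    """Succeeds only at the end of the input; consumes nothing."""
    return i if i == len(s) else None

def skip(s: str, i: int) -> Optional[int]:
    """Consumes one character; fails when there is none left."""
    return i + 1 if i < len(s) else None

def none(s: str, i: int) -> Optional[int]:
    """Always succeeds without consuming anything."""
    return i

def seq(*rules: Rule) -> Rule:
    """All rules must match one after another."""
    def run(s: str, i: int) -> Optional[int]:
        pos: Optional[int] = i
        for rule in rules:
            pos = rule(s, pos)
            if pos is None:
                return None
        return pos
    return run

def alt(*rules: Rule) -> Rule:
    """Ordered choice: try each alternate from the same starting position."""
    def run(s: str, i: int) -> Optional[int]:
        for rule in rules:
            pos = rule(s, i)
            if pos is not None:
                return pos
        return None
    return run

# `end` needs i == len(s), `skip` needs i < len(s): both can never hold.
fail = seq(end, skip)
```

With these definitions `fail` returns `None` at every position, so `alt(fail, none)` always falls through to its second alternate instead of leaving any enclosing loop.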
Tomc 27-Jun-2006 [1025] |
    capital: charset {ABCDEFGHIJKLMNOPQRSTUVWXYZ}
    latipac: complement capital
    rule: [
        any latipac here: copy token some capital there: (
            all [
                4 < length? token
                insert :there "</strong>"
                insert :here "<strong>"
                there: skip :there 16
            ]
        )
        :there
    ]
    parse/all/case txt [some rule]
|
Volker 27-Jun-2006 [1026] | The problem is "Aa": that's a word, but not an all-uppercase word, so it should be ignored. |
Tomc 27-Jun-2006 [1027] |
    capital: charset {ABCDEFGHIJKLMNOPQRSTUVWXYZ}
    latipac: complement capital
    ws: charset { ^/^-}
    rule: [
        any latipac here: copy token some capital there:
        opt [
            some ws (
                all [
                    4 < length? token
                    insert :there "</strong>"
                    insert :here "<strong>"
                    there: skip :there 16
                ]
            )
        ]
        :there
    ]
    parse/all/case txt [some rule]
|
BrianH 27-Jun-2006 [1028x2] | Fails on "aA". |
The inserts are a nice touch though. | |
Tomc 28-Jun-2006 [1030] |
    capital: charset {ABCDEFGHIJKLMNOPQRSTUVWXYZ}
    ws: charset { ^/^-}
    latipac: difference complement capital ws
    sub-rule: [
        some capital there: [ws | end] (
            all [
                4 < length? copy/part :here :there
                insert :there "</strong>"
                insert :here "<strong>"
                there: skip :there 17
            ]
        )
    ]
    rule: [
        any latipac [some ws here: sub-rule]
        | [skip there:] :there
    ]
    parse/all/case txt [here: opt sub-rule some rule]
|
BrianH 28-Jun-2006 [1031] | Doesn't take into account punctuation in the ws charset. This would fail on "HELLO, WORLD!" |
Tomc 28-Jun-2006 [1032] | left as an exercise for the reader |
BrianH 28-Jun-2006 [1033x2] | :-) |
Of course mine doesn't handle words with apostrophes or hyphens in them either. Easy fix though, just add ' and - to the capitals charset. | |
Graham 28-Jun-2006 [1035] | Actually my further spec for this requires the parser to detect spaces between capitalised words :) |
BrianH 28-Jun-2006 [1036] | And do what? |
Graham 28-Jun-2006 [1037] | treat the two capitalised words as one so <strong>HELLO DOLLY</strong> |
BrianH 28-Jun-2006 [1038] | What about "HELLO, DOLLY!" or such? |
Graham 28-Jun-2006 [1039] | I think that punctuation is part of a word |
BrianH 28-Jun-2006 [1040] | For that matter, what about words in quotes? |
Graham 28-Jun-2006 [1041] | only if capitalised |
BrianH 28-Jun-2006 [1042] | So, no difference. |
Graham 28-Jun-2006 [1043x6] | I'll explain the purpose of all this. |
A person is writing a text file. It has headings which are denoted by caps, and terminating in ":". | |
But some headings are two or more words ... with the last terminating in ":" only. | |
Words inside the text, even in caps should not normally be highlighted. | |
That's the more complete spec. | |
Anyway, I have a working version now :) | |
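Graham's working version is not shown in the log. As a rough illustration of the spec only, here is a Python sketch using a regular expression instead of PARSE. It assumes, beyond what the log states, that a heading starts at the beginning of a line, that its words are separated by spaces or tabs, that apostrophes and hyphens may appear inside a capitalised word, and that the closing ":" goes inside the `<strong>` tag. The function name `highlight_headings` is made up for this example.

```python
import re

# One or more all-caps words at the start of a line, the last one followed by ":".
# Apostrophes and hyphens inside a word are allowed (an assumption, see above).
HEADING = re.compile(
    r"^[A-Z][A-Z'\-]*(?:[ \t]+[A-Z][A-Z'\-]*)*:",
    re.MULTILINE,
)

def highlight_headings(text: str) -> str:
    """Wrap each heading, including its colon, in <strong> tags."""
    return HEADING.sub(lambda m: "<strong>" + m.group(0) + "</strong>", text)

sample = (
    "PATIENT HISTORY: seen on MONDAY by DR SMITH\n"
    "PLAN: rest\n"
    "Note: none"
)
print(highlight_headings(sample))
```

Capitalised words inside the running text ("MONDAY", "DR SMITH") are left alone because they are not followed by ":", and "Note:" is skipped because it contains lowercase letters.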
BrianH 28-Jun-2006 [1049] | Well, I hope I helped :) |
Graham 28-Jun-2006 [1050] | Yep .. thanks all. |
Tomc 28-Jun-2006 [1051] | replace/all "</strong> <strong>" "" |
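Tomc's one-liner, applied to text that has already been tagged word by word, can be sketched in Python as follows. The function name `merge_adjacent` is hypothetical. Replacing with a single space rather than the empty string is an assumption about the intent: with an empty replacement the two words would be run together as "HELLODOLLY".

```python
def merge_adjacent(tagged: str) -> str:
    """Join <strong> runs that are separated by exactly one space."""
    # Assumption: keep the space between the words when dropping the inner tags.
    return tagged.replace("</strong> <strong>", " ")

print(merge_adjacent("<strong>HELLO</strong> <strong>DOLLY</strong>"))
```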
[unknown: 9] 28-Jun-2006 [1052] | What is the best description of Parse? I would like to point some people to Parse as an example of the power of Rebol |
Henrik 28-Jun-2006 [1053] | reichart, I wrote one in the wikibook, don't know if it's useful. |
[unknown: 9] 28-Jun-2006 [1054] | Since you wrote one, do you know of a better one? This is not a reflection on yours, but it is a great way to know what you considered the next best thing. |
Tomc 28-Jun-2006 [1055x2] | salvation from regular expressions |
I may have added some to the REBOL wikibook | |
BrianH 29-Jun-2006 [1057x3] | To use the simpler of the CS terms: Parse is a rule-based, recursive-descent string and structure parser with backtracking. It is not a parser generator (like Lex/Yacc) or compiler (like most regex engines) - the engine follows the rules directly. Since Parse is recursive-descent it can handle patterns that regular expressions wouldn't be able to. Since Parse backtracks it can handle patterns that ordinary recursive-descent parsers can't. Basically, it puts the text and structure processing abilities of Perl 5 to shame, let alone those of the lesser regex engines. In theory, Perl 6 has caught up with REBOL, but Perl 6 only exists in theory for now. By the time it becomes actual REBOL should surpass it (especially if I have anything to say about it). |
It's pretty easy to demonstrate patterns that regular expressions can't handle. It's only somewhat difficult to demonstrate patterns that can't be handled by a recursive descent parser without backtracking or unlimited lookahead. I have never run into a pattern that can't be handled by Parse in theory - its only limits are in implementation (available memory and recursion depth). I am not qualified to describe its limits. Still, you have to be careful about how you write the rules or they will trip you up. | |
A little dry as explanations go, I suppose. You may have better luck by showing some magic parse code tricks :) | |
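One standard example of a pattern that classical regular expressions cannot handle is arbitrarily nested brackets. The Python sketch below is a hand-written recursive-descent matcher that gives up on a group when its closing bracket is missing. It is roughly what a recursive PARSE rule along the lines of `group: ["(" any group ")"]` expresses, though that REBOL rule is quoted from memory here and the function names are illustrative.

```python
def match_groups(s: str, i: int = 0) -> int:
    """Return the position after the longest run of balanced groups starting at i."""
    while i < len(s) and s[i] == "(":
        # Recurse to consume any groups nested inside this one.
        j = match_groups(s, i + 1)
        if j >= len(s) or s[j] != ")":
            # No closing bracket: back up to before this "(" and stop.
            return i
        i = j + 1
    return i

def is_balanced(s: str) -> bool:
    """True when the whole string consists of balanced bracket groups."""
    return match_groups(s) == len(s)

print(is_balanced("(()())"), is_balanced("(()"))
```

The recursion depth grows with the nesting depth of the input, which mirrors the implementation limits mentioned above (available memory and recursion depth).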
Volker 29-Jun-2006 [1060] | Somewhat buzzy: It's a simplified compiler-compiler. It could be used to build a Java compiler (i.e. for syntax that complex), but it's also as easy as regex for simpler things, and still readable. (Less buzzy: not always that easy, due to the poorer lookahead.) |
BrianH 29-Jun-2006 [1061] | Volker, it's more like it can do what a compiler-compiler can do without needing to compile :) And backtracking is about the same as unlimited lookahead, but more powerful. |
[unknown: 9] 29-Jun-2006 [1062] | Thanks Brian, but as is the theme with questions I ask, I don't ask for myself, but rather so that the "world" can learn what "we" know. So perhaps you should add your 2 cents to Henrik's and Tom's in a public forum of the Wikibook. |
Volker 29-Jun-2006 [1063] | The compiling is no big argument, as compiler-compilers are for compiled languages anyway ;) The point is, you can easily mix a grammar and actions for semantics. |
BrianH 29-Jun-2006 [1064] | Reichart, I figured as much (hence the "dry" comment). I'll look over the Wikibook and see if I can help. |
Volker 29-Jun-2006 [1065] | Your points are OK, I only wanted to try something somewhat shorter. |
BrianH 29-Jun-2006 [1066] | Volker, it still might be a good point that you can skip a step with parse, depending on the listener. Parse is more of a compiler-interpreter really. The real point I was making was about the lookahead. |
Volker 29-Jun-2006 [1067] | I can plug in handcrafted parsers with some cocos too. |
JaimeVargas 29-Jun-2006 [1068] | I agree, Brian: parse allows you to write an interpreter easily. Regarding compilation, I guess it does that too, but the problem is more difficult. |
Volker 29-Jun-2006 [1069] | aah. a compiler-compiler produces sourcecode to be compiled, but you can interpret data with it. |