World: r3wp
[Parse] Discussion of PARSE dialect
Graham 27-Jun-2006 [989] | Yeah ... it was a way to mark up text wherever a sequence of CAPS occurs |
JaimeVargas 27-Jun-2006 [990] | Anyhow, you get the idea. This type of problem could actually use the rewrite rules engine from Gabriele. The principle is the same. |
BrianH 27-Jun-2006 [991] | ; Sorry, more fixes
capitals: charset ["#"A" - #"Z"]
alpha: charset ["#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha
parse/all/case [any [
    any non-alpha
    a: 5 capitals any capitals b: [non-alpha | end] (
        b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b
    ) :b |
    some alpha
] to end] |
Graham 27-Jun-2006 [992x2] | Brian ... your rules are incorrect. |
You have extra " characters in the charset definitions | |
BrianH 27-Jun-2006 [994] | Right. I was running this in my head, as I don't have test data. REBOL usually catches syntax errors :) |
Graham 27-Jun-2006 [995] | Actually I would like to add a parse problem to the weeklyblog and get people to submit answers :) |
BrianH 27-Jun-2006 [996] | I use parse quite a bit. It's funny, I've never needed the GUI of View, but I use parse daily. |
Graham 27-Jun-2006 [997x2] | And give a prize for the shortest answer |
say a copy of Microsoft VB :) | |
BrianH 27-Jun-2006 [999] | By shortest, go for most efficient. Otherwise variable naming becomes an issue. |
JaimeVargas 27-Jun-2006 [1000] | Shorter may not be clearer or abstract enough. I prefer something that can become an API and be reused. But we will need to exclude Gabriele's rewrite rules. ;-) |
BrianH 27-Jun-2006 [1001] | Hey, I worked on those rules, they're pretty good :) |
Graham 27-Jun-2006 [1002] | By shortest I mean the fewest words and operators, not the length in characters |
JaimeVargas 27-Jun-2006 [1003] | Hahaha, no offense Brian. ;-) |
Graham 27-Jun-2006 [1004] | Can't use this problem though .. this group is web public! |
BrianH 27-Jun-2006 [1005] | It's too simple anyways. |
Graham 27-Jun-2006 [1006x2] | I am hoping that it will be instructive ... as well. |
so simple things are good to start with ... we can have harder ones once we see how people respond | |
BrianH 27-Jun-2006 [1008] | Seriously though, three charsets and two temporary variables, there's got to be a more efficient way. |
Volker 27-Jun-2006 [1009] | Should non-alpha be non-capitals? (Running this in my head too.) |
BrianH 27-Jun-2006 [1010x2] | ; Sorry, more fixes
capitals: charset [#"A" - #"Z"]
alpha: charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha
parse/all/case [any [to alpha [
    a: 5 capitals any capitals b: [non-alpha | end] (
        b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b
    ) :b |
    some alpha
]] to end] |
No, because that would allow words like this: ABCDEFghij | |
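The rule above can be written out as a self-contained script along the following lines. This is only a sketch for REBOL 2 under some assumptions: `txt` is made-up test data, the input argument that the posted `parse` call leaves out is supplied, and `any non-alpha` is used in place of `to alpha`, since REBOL 2's `to` may not accept a charset.

```rebol
REBOL [Title: "Wrap runs of 5+ capitals in <strong> (sketch)"]

capitals:  charset [#"A" - #"Z"]
alpha:     charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha

txt: copy "say HELLO to ABCDEFghij and WORLD"   ; hypothetical test data

parse/all/case txt [
    any [
        any non-alpha                ; skip to the start of the next word
        [
            ; first alternative: an all-caps word of five or more letters
            ; that is followed by a non-letter or the end of input
            a: 5 capitals any capitals b: [non-alpha | end]
            (b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b)
            :b                       ; resume just past the inserted markup
          | some alpha               ; second alternative: skip any other word
        ]
    ]
    to end
]
print txt
```

The intent is that HELLO and WORLD get wrapped while ABCDEFghij is left alone: at the "g" the first alternative fails as a whole, parse backtracks to the start of the word, and `some alpha` consumes it unmarked. Using non-capitals as the boundary would let the first alternative succeed after ABCDEF.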
Volker 27-Jun-2006 [1012x2] | it would collect as long as there are capitals? |
and "g" is none | |
BrianH 27-Jun-2006 [1014] | Well, that would be up to Graham. His original description would seem to exclude such words. |
Volker 27-Jun-2006 [1015] | means all uppercase? |
BrianH 27-Jun-2006 [1016] | As far as I can tell. |
Graham 27-Jun-2006 [1017] | Yeah .. all uppercase .. |
Volker 27-Jun-2006 [1018] | because " a: 5 capitals any capitals b:" stops at "g" and friends. |
BrianH 27-Jun-2006 [1019] | More importantly, it fails at "g" and friends, backtracks and proceeds to the next alternate action, some alpha. |
Volker 27-Jun-2006 [1020] | Late, but got it. It would enclose "ABCDEF" but should ignore it because of the lowercase letters. |
BrianH 27-Jun-2006 [1021x2] | Yup. Parse is fun. |
You can drop one charset by changing [non-alpha | end] to [alpha end skip | end | none] . | |
Volker 27-Jun-2006 [1023] | would alpha break work? |
BrianH 27-Jun-2006 [1024] | No, that would break out of the enclosing any loop. The end skip will always fail and proceed to the next alternate. |
Tomc 27-Jun-2006 [1025] | capital: charset {ABCDEFGHIJKLMNOPQRSTUVWXYZ}
latipac: complement capital
rule: [
    any latipac here: copy token some capital there: (all [
        4 < length? token
        insert :there "</strong>"
        insert :here "<strong>"
        there: skip :there 16
    ]) :there
]
parse/all/case txt [some rule] |
Volker 27-Jun-2006 [1026] | The problem is "Aa": that's a word, but not an all-uppercase word, so it should be ignored. |
Tomc 27-Jun-2006 [1027] | capital: charset {ABCDEFGHIJKLMNOPQRSTUVWXYZ}
latipac: complement capital
ws: charset { ^/^-}
rule: [
    any latipac here: copy token some capital there: opt [some ws (all [
        4 < length? token
        insert :there "</strong>"
        insert :here "<strong>"
        there: skip :there 16
    ])] :there
]
parse/all/case txt [some rule] |
BrianH 27-Jun-2006 [1028x2] | Fails on "aA". |
The inserts are a nice touch though. | |
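A minimal sketch of the insert-based approach, combined with the word-boundary test from the earlier rule so that "aA" and "Aa" are skipped. It is one possible arrangement rather than either poster's final code; `txt` is made-up test data and the variable names `a` and `b` are borrowed from the earlier snippet.

```rebol
REBOL [Title: "Tag runs of capitals using insert (sketch)"]

capitals:  charset [#"A" - #"Z"]
alpha:     charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha

txt: copy "aA HELLO, WORLD! Aa"    ; hypothetical test data

parse/all/case txt [
    any [
        any non-alpha
        [
            a: 5 capitals any capitals b: [non-alpha | end]
            (
                insert b "</strong>"    ; closing tag first, so a still marks the word start
                insert a "<strong>"
                b: skip b 17            ; 8 + 9 characters now sit between the old b and the text after the tag
            )
            :b
          | some alpha                  ; mixed-case and short words are skipped unmarked
        ]
    ]
    to end
]
print txt
```

Inserting the closing tag before the opening one means neither saved position has to be recomputed between the two inserts. The skip of 17 is the combined length of "<strong>" (8) and "</strong>" (9).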
Tomc 28-Jun-2006 [1030] | capital: charset {ABCDEFGHIJKLMNOPQRSTUVWXYZ}
ws: charset { ^/^-}
latipac: difference complement capital ws
sub-rule: [
    some capital there: [ws | end] (all [
        4 < length? copy/part :here :there
        insert :there "</strong>"
        insert :here "<strong>"
        there: skip :there 17
    ])
]
rule: [
    any latipac [some ws here: sub-rule] | [skip there:] :there
]
parse/all/case txt [here: opt sub-rule some rule] |
BrianH 28-Jun-2006 [1031] | Doesn't take into account punctuation in the ws charset. This would fail on "HELLO, WORLD!" |
Tomc 28-Jun-2006 [1032] | left as an exercise for the reader |
BrianH 28-Jun-2006 [1033x2] | :-) |
Of course mine doesn't handle words with apostrophes or hyphens in them either. Easy fix though, just add ' and - to the capitals charset. | |
Graham 28-Jun-2006 [1035] | Actually my further spec for this requires the parser to detect spaces between capitalised words :) |
BrianH 28-Jun-2006 [1036] | And do what? |
Graham 28-Jun-2006 [1037] | treat the two capitalised words as one so <strong>HELLO DOLLY</strong> |
BrianH 28-Jun-2006 [1038] | What about "HELLO, DOLLY!" or such? |
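One possible reading of the extended spec, sketched for REBOL 2. The assumptions are loud ones: every word in a run needs five or more capitals, only spaces and tabs join words, and punctuation (as in "HELLO, DOLLY!") ends a run, so those two words would be wrapped separately. The names `caps-word`, `boundary`, `sp`, `c` and `txt` are hypothetical.

```rebol
REBOL [Title: "Wrap space-separated runs of all-caps words (sketch)"]

capitals:  charset [#"A" - #"Z"]
alpha:     charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha
sp:        charset " ^-"                ; assumption: only space and tab join words

caps-word: [5 capitals any capitals]    ; assumption: every word needs 5+ capitals
boundary:  [non-alpha | end]

txt: copy "well HELLO DOLLY said HELLO, DOLLY!"   ; hypothetical test data

parse/all/case txt [
    any [
        any non-alpha
        [
            a: caps-word b: boundary
            ; try to extend the run: return to the end of the last word,
            ; skip whitespace, and require another complete all-caps word
            any [:b some sp caps-word c: boundary (b: c)]
            (b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b)
            :b
          | some alpha
        ]
    ]
    to end
]
print txt
```

The extension loop only moves `b` forward once a whole following word has matched, so a trailing mixed-case word such as "DOLLy" leaves the run ending at the previous word.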