World: r3wp

[Parse] Discussion of PARSE dialect

BrianH
27-Jun-2006
[1001]
Hey, I worked on those rules, they're pretty good :)
Graham
27-Jun-2006
[1002]
shortest .. I mean the least number of words, and operators - not 
in length
JaimeVargas
27-Jun-2006
[1003]
Hahaha, no offense Brian. ;-)
Graham
27-Jun-2006
[1004]
Can't use this problem though .. this group is web public!
BrianH
27-Jun-2006
[1005]
It's too simple anyways.
Graham
27-Jun-2006
[1006x2]
I am hoping that it will be instructive ... as well.
so simple things are good to start with ... we can have harder ones 
once we see how people respond
BrianH
27-Jun-2006
[1008]
Seriously though, three charsets and two temporary variables, there's 
got to be a more efficient way.
Volker
27-Jun-2006
[1009]
should non-alpha be non-capitals? (running in head too)
BrianH
27-Jun-2006
[1010x2]
; Sorry, more fixes
capitals: charset [#"A" - #"Z"]
alpha: charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha
; txt is the string being marked up (the posted call left out the input; the name is assumed)
parse/all/case txt [any [to alpha [
    a: 5 capitals any capitals b: [non-alpha | end] (
        b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b
    ) :b |
    some alpha
]] to end]
No, because that would allow words like this: ABCDEFghij
Volker
27-Jun-2006
[1012x2]
it would collect as long as there are capitals?
and "g" is none
BrianH
27-Jun-2006
[1014]
Well, that would be up to Graham. His original description would 
seem to exclude such words.
Volker
27-Jun-2006
[1015]
means all uppercase?
BrianH
27-Jun-2006
[1016]
As far as I can tell.
Graham
27-Jun-2006
[1017]
Yeah .. all uppercase ..
Volker
27-Jun-2006
[1018]
because " a: 5 capitals any capitals b:" stops at "g" and friends.
BrianH
27-Jun-2006
[1019]
More importantly, it fails at "g" and friends, backtracks and proceeds 
to the next alternate action, some alpha.
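The backtracking described here can be checked in isolation. A minimal sketch, assuming REBOL 2 parse semantics and reusing the charsets from the rule above; `wraps?` is a hypothetical helper name that does not appear in the thread:

```rebol
capitals: charset [#"A" - #"Z"]
alpha: charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha

; Hypothetical helper: true when the first alternate of the rule would match
; at the start of the given text (five or more capitals, then a non-letter or the end).
wraps?: func [text [string!]] [
    parse/all/case text [5 capitals any capitals [non-alpha | end] to end]
]

print wraps? "ABCDEF"       ; all capitals, long enough
print wraps? "ABCDEFghij"   ; fails at "g"; the full rule would fall back to `some alpha`
print wraps? "ABCD"         ; too short for `5 capitals`
```

In the full rule the failure is not fatal: parse resets to where the alternate started and `some alpha` consumes the whole mixed-case word, so it is skipped rather than wrapped.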
Volker
27-Jun-2006
[1020]
Late, but got it. It would enclose "ABCDEF" but should ignore it 
because of the lowercase letters.
BrianH
27-Jun-2006
[1021x2]
Yup. Parse is fun.
You can drop one charset by changing [non-alpha | end] to [alpha 
end skip | end | none] .
Volker
27-Jun-2006
[1023]
would alpha break work?
BrianH
27-Jun-2006
[1024]
No, that would break out of the enclosing any loop. The end skip 
will always fail and proceed to the next alternate.
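The `end skip` idiom is worth trying on its own before relying on it. A small sketch, assuming REBOL 2 parse semantics; `must-fail` is a hypothetical word introduced only for this illustration. One thing to check is which alternate runs after the failure: as far as I can tell it is the next one in the same block, so a group written as `[alpha end skip | end | none]` may still succeed through `none`.

```rebol
alpha: charset [#"A" - #"Z" #"a" - #"z"]

; `end skip` can never succeed: `end` only matches at the tail,
; and at the tail `skip` has nothing left to consume.
r1: parse/all "abc" [end skip]

; After the failure, the alternates left in the same block are tried,
; so this group succeeds through `none` even in front of a letter.
r2: parse/all/case "g" [[alpha end skip | end | none] skip]

; One possible way to push the failure out to the enclosing rule:
; choose the rule to run next inside the group, then run it outside.
r3: parse/all/case "g" [[alpha (must-fail: [end skip]) | (must-fail: [])] must-fail skip]

print [r1 r2 r3]
```

If the reasoning holds, only the second call returns true. The third form costs an extra word, so it does not reduce the count of temporaries, only the count of charsets.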
Tomc
27-Jun-2006
[1025]
capital: charset {ABCDEFGHIJKLMNOPQRSTUVWXYZ}
latipac: complement capital
rule: [
    any latipac here:
    copy token some capital there:
    (all[ 4 < length? token
        insert :there "</strong>"
        insert :here "<strong>"
        there: skip :there 16
    ])
    :there
]
parse/all/case txt [some rule]
Volker
27-Jun-2006
[1026]
problem is "Aa", that's a word, but not an all-uppercase word. so it should 
be ignored.
Tomc
27-Jun-2006
[1027]
capital: charset {ABCDEFGHIJKLMNOPQRSTUVWXYZ}
latipac:  complement capital
ws: charset { ^/^-}
rule: [
    any latipac here:
    copy token some capital there:
    opt [some ws
        (all[ 4 < length? token
            insert :there "</strong>"
            insert :here  "<strong>"
            there: skip :there 16]
        )
    ]
    :there
]
parse/all/case txt [some rule]
BrianH
27-Jun-2006
[1028x2]
Fails on "aA".
The inserts are a nice touch though.
Tomc
28-Jun-2006
[1030]
capital: charset {ABCDEFGHIJKLMNOPQRSTUVWXYZ}
ws: charset { ^/^-}
latipac: difference complement capital ws


sub-rule: [
	some capital there:
	[ws | end]
	(all[ 4 < length? copy/part :here :there
		insert :there "</strong>"
		insert :here  "<strong>"
		there: skip :there 17]
	)
]
rule: [
	any latipac 
	[	some ws here:
		sub-rule
	]|[skip there:]
	:there
]
parse/all/case txt [here: opt sub-rule some rule]
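The offset of 17 in this version is the combined length of the two inserted tags. A short sketch of the index arithmetic outside of parse, assuming REBOL 2 series semantics; the sample text is made up for the illustration:

```rebol
txt: copy "say HELLO now"
here: find txt "HELLO"      ; start of the word
there: skip here 5          ; just past the word

insert there "</strong>"    ; `there` keeps its index, now in front of the closing tag
insert here "<strong>"      ; shifts everything after `here` right by 8

; 8 characters for "<strong>" plus 9 for "</strong>" gives 17.
there: skip there (length? "<strong>") + (length? "</strong>")

print txt
print there
```

After both inserts `there` still holds its old index, which now points 8 characters before the end of the word, so skipping 17 lands just past the closing tag. Inserting the closing tag first is what keeps `here` valid for the second insert.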
BrianH
28-Jun-2006
[1031]
Doesn't take into account punctuation in the ws charset. This would 
fail on "HELLO, WORLD!"
Tomc
28-Jun-2006
[1032]
left as an exercise for the reader
BrianH
28-Jun-2006
[1033x2]
:-)
Of course mine doesn't handle words with apostrophes or hyphens in 
them either. Easy fix though, just add ' and - to the capitals charset.
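A sketch of that fix, assuming the apostrophe and hyphen are added as plain characters to the charset block; `word-rule` is a hypothetical name for the first alternate of the rule, anchored to the whole input for testing:

```rebol
capitals: charset [#"A" - #"Z" #"'" #"-"]   ; apostrophe and hyphen added
alpha: charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha

word-rule: [5 capitals any capitals [non-alpha | end] to end]

print parse/all/case "DON'T" word-rule
print parse/all/case "WELL-KNOWN" word-rule
print parse/all/case "Don't" word-rule
```

Because the full rule starts each word with `to alpha`, a word still has to begin with a letter, but a run such as "A----" would now count as five capitals, which may or may not be acceptable.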
Graham
28-Jun-2006
[1035]
Actually my further spec for this requires the parser to detect spaces 
between capitalised words :)
BrianH
28-Jun-2006
[1036]
And do what?
Graham
28-Jun-2006
[1037]
treat the two capitalised words as one so <strong>HELLO DOLLY</strong>
BrianH
28-Jun-2006
[1038]
What about "HELLO, DOLLY!" or such?
Graham
28-Jun-2006
[1039]
I think that punctuation is part of a word
BrianH
28-Jun-2006
[1040]
For that matter, what about words in quotes?
Graham
28-Jun-2006
[1041]
only if capitalised
BrianH
28-Jun-2006
[1042]
So, no difference.
Graham
28-Jun-2006
[1043x6]
I'll explain the purpose of all this.
A person is writing a text file.  It has headings which are denoted 
by caps, and terminating in ":".
But some headings are two or more words ... with the last terminating 
in ":" only.
Words inside the text, even in caps should not normally be highlighted.
that's the more complete spec.
Anyway, i have a working version now :)
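Graham's working version is not shown in the thread. The following is only a sketch of one way the fuller spec could be approached, assuming REBOL 2, single spaces between heading words, and that the colon belongs inside the highlight; the sample text is made up, and the `change/part` step is borrowed from BrianH's rule:

```rebol
capitals: charset [#"A" - #"Z"]
alpha: charset [#"A" - #"Z" #"a" - #"z"]

txt: copy "PAST MEDICAL HISTORY: none of NOTE"

parse/all/case txt [any [
    ; one or more all-caps words separated by single spaces, ending in ":"
    a: some capitals any [" " some capitals] ":" b: (
        b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b
    ) :b
    | some alpha    ; any other word is consumed whole, caps or not
    | skip          ; spaces, punctuation, everything else
]]

print txt
```

Capitalised words without a trailing colon, such as the final "NOTE" here, fall through to `some alpha` and are left alone, which matches the requirement that caps inside the text are not normally highlighted.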
BrianH
28-Jun-2006
[1049]
Well, I hope I helped :)
Graham
28-Jun-2006
[1050]
Yep .. thanks all.