World: r3wp

[Parse] Discussion of PARSE dialect

Graham
28-Apr-2006
[940]
So, if Rebol gets the datatype wrong (and real-world data is dirty), 
you're screwed.
Gregg
28-Apr-2006
[941]
That's the tradeoff. :\
Graham
28-Apr-2006
[942x3]
real world data is dirty ..
Maybe there should be no invalid datatypes .... everything can be 
converted to a datatype
if the parser thinks a datatype is invalid, well, let's call it an 
invalid! datatype!!
Gregg
28-Apr-2006
[945]
I think that's where string parsing comes in, and where having rules 
for REBOL datatypes would ease the pain.
Graham
28-Apr-2006
[946x3]
I do screen validation by datatypes ( for data input ).  If the user 
enters an invalid datatype ... ..
anyway, I think rebol should recognise all data ..
have a catchall for stuff it thinks is wrong
Oldes
30-Apr-2006
[949x2]
I agree with you Graham. I have mentioned many times that there 
could be something to handle datatype exceptions.
About the spaces charset - most people do not know that we have one 
more space char - the non-breaking space:  >> to-char 160 <<
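A minimal sketch of a whitespace charset that also covers the non-breaking space. The names spaces and nbsp-text are made up for illustration, and the sample input is invented:

```rebol
; whitespace set that also includes the non-breaking space (char 160)
spaces: charset reduce [#" " #"^-" #"^/" to-char 160]

; sample input built by hand: "a", a non-breaking space, "b"
nbsp-text: rejoin ["a" to-char 160 "b"]

; /all stops PARSE from skipping ordinary whitespace on its own,
; so the rule itself has to account for the char 160
parse/all nbsp-text ["a" some spaces "b"]
```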
Volker
1-May-2006
[951x2]
How about another way: integrate datatypes into the string parser. Basically 
a load/next and a check for the type.
Then we can write (note I parse a string): 
parse "1 a , #2" [ integer! word! "," issue! ]
'invalid! has a problem: it's easy to recognize where the wrong part 
starts, but harder to recognize where the wrong part ends.
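The load/next idea could be sketched roughly like this. match-types is a hypothetical helper, not a built-in, and it only approximates what a datatype-aware string PARSE might do. It assumes the pattern block is left unreduced, so integer! and friends arrive as words:

```rebol
; Hypothetical helper: check a string against datatype words and literal strings
match-types: func [
    str [string!] pattern [block!]
    /local item res rest
][
    foreach item pattern [
        str: trim/head copy str            ; fresh copy, leading whitespace skipped
        either string? item [
            ; literal text must come next in the input
            if none? rest: find/match str item [return false]
            str: rest
        ][
            ; load exactly one REBOL value; dirty data raises an error here
            if error? try [res: load/next str] [return false]
            ; compare type names as words (pattern is not reduced)
            if item <> type?/word first res [return false]
            str: second res
        ]
    ]
    empty? trim copy str                   ; nothing but whitespace may remain
]

match-types "1 a , #2" [integer! word! "," issue!]
```

This is intended to accept Volker's example and to reject input where a value fails to load or has the wrong type. It also shows the problem he mentions: when load/next fails, the sketch knows where the bad part starts but not where it ends.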
Oldes
1-May-2006
[953x2]
Is there any RTF (Rich Text Format) parser  for Rebol?
hm, maybe this one: http://www.codeconscious.com/rebol/scripts/rtf-tools.r
:-)
Ashley
24-May-2006
[955]
Quick question for the parse experts. If I have a parse rule that 
looks like this:

parse spec [
	any [
		set arg string! (...) | set arg tuple! (...) | ...
	]
]

How would I add a rule like:

	set arg paren! (reduce arg)


that always occurred prior to the any rule and evaluated parenthesized 
expressions (i.e. I want parenthesized expressions to be reduced 
to a REBOL value that can be handled by the remainder of the parse 
rule).
Tomc
25-May-2006
[956]
I only parse strings, not blocks, so this may be completely off, but 
I would try
parse spec [
	any[
		opt [here: set arg paren! (change :here reduce arg) :here]
		[	set arg string! (...) | 
			set arg tuple! (...) | ...
		]
	]
]
Anton
25-May-2006
[957]
(here/1: do arg)
Ashley
25-May-2006
[958]
Thanks both, works a treat.
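Putting Tomc's rule and Anton's correction together, a runnable sketch might look like this. The spec contents and the ints/strs collectors are made up for illustration:

```rebol
spec: [(1 + 2) "text" (join "a" "b")]     ; sample spec, invented for this sketch
ints: copy []
strs: copy []

parse spec [
    any [
        ; if the next value is a paren, evaluate it, store the result
        ; in its place, then rewind so the rules below see the result
        opt [here: set arg paren! (here/1: do arg) :here]
        [
            set arg integer! (append ints arg)
            | set arg string! (append strs arg)
        ]
    ]
]
```

Note that spec is modified in place, so the parens are gone afterwards. Here ints should end up holding 3 and strs holding "text" and "ab".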
Graham
27-Jun-2006
[959]
My brain is still asleep.  How to go thru a document and add <strong> 
</strong> around every word that is in capitals and is more than 
a few characters long?
Pekr
27-Jun-2006
[960x3]
hmm, quite a challenge ...
somehow look up words, mark: before, find the end (the next space), 
check whether the first char is a capital or not, change at position, :mark at 
end ...
but don't ask me for code, it would take me a few hours to get somewhere, 
if at all :-)
Graham
27-Jun-2006
[963]
pattern search on capitals, mark, copy to space, mark, count length 
of copy, if long, insert at mark2, and then at mark1, continue ??
Gordon
27-Jun-2006
[964]
I agree - a bit much to ask.  A more specific question would get 
a more specific answer :)

Something like:

file: read filename2parse
newfile: copy ""
foreach word file [
   if Is-Capitals word [
      newfile: join newfile ["<strong> " word " </strong> "]
   ]
]

The Is-Capitals function would have to be defined:
Is-Capitals: func [Word2Check] [
   ; some code here
]
Graham
27-Jun-2006
[965x2]
that won't work because file is just text and not a block.
but my brain is gradually waking up now ... all I need to do is get 
dressed!
Pekr
27-Jun-2006
[967]
:-)
Volker
27-Jun-2006
[968]
; thinking out loud:
capitals: charset [#"A" - #"Z"]
capital: [5 capitals any capitals]
Henrik
27-Jun-2006
[969]
can you do this in one pass?
Gordon
27-Jun-2006
[970]
Yes. "Newfile" would have to be "parsed" into words

something like:

Newfile: parse file

or 

file: parse/with file {separator character}
Graham
27-Jun-2006
[971x3]
trouble is, parse doesn't split only on " " even if that's what you specify ...
so, you might lose other characters.
I think this can be done in one pass.
Pekr
27-Jun-2006
[974]
I would not rely on the parse helpers, such as parse string delimiter, but 
use full parse/all if you need a precise result ...
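A small illustration of the difference, using a made-up sample string. Without /all the simple splitting form also breaks on whitespace, which is how other characters get lost:

```rebol
sample: "red,green blue"          ; made-up input

probe parse sample ","            ; ["red" "green" "blue"]  (the space also splits)
probe parse/all sample ","        ; ["red" "green blue"]    (only the comma splits)
```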
BrianH
27-Jun-2006
[975]
Yes, give me a minute...
JaimeVargas
27-Jun-2006
[976x2]
capitalize-word: func [
    s [string!]
    /local len
][
    either 5 < len: length? s [
        s: rejoin ["<strong>" uppercase s/1 next s </strong>]
    ][
        s
    ]
]

capitalize-text: func [
    s [string!]
    /local result word-rule alpha non-alpha w c
][
    result: copy {}
    alpha: charset [#"A" - #"Z" #"a" - #"z"]
    non-alpha: complement alpha

    word-rule: [copy w [some alpha] (insert tail result capitalize-word w)]
    other-rule: [copy c non-alpha (insert tail result c)]

    parse/all s [some [word-rule | other-rule] end]
    result
]

>> capitalize-text {The result changes according to formating.}
== {The <strong>Result</strong> <strong>Changes</strong> <strong>According</strong> to <strong>Formating</strong>.}
Graham
27-Jun-2006
[978x2]
Not quite the problem I was stating!
search for a series of capitalised words and strong them
JaimeVargas
27-Jun-2006
[980]
Ah. Very easy modification.
Graham
27-Jun-2006
[981x2]
bolden-word: func [
    s [string!]
    /local len
][
    either 5 < len: length? s [
        s: rejoin ["<strong>" s </strong>]
    ][
        s
    ]
 ]

enhance-text: func [
    s [string!]
    /local result word-rule alpha non-alpha w c
][
    result: copy {}
    alpha: charset [#"A" - #"Z"]
    non-alpha: complement alpha

    word-rule: [copy w [some alpha] (insert tail result bolden-word w)]
    other-rule: [copy c non-alpha (insert tail result c)]
    parse/all s [some [word-rule | other-rule] end]
    result
]
Thanks Jaime.
BrianH
27-Jun-2006
[983x2]
capitals: charset [#"A" - #"Z"]
alpha: charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha
; text: the string to mark up, modified in place (name assumed)
parse/all/case text [any non-alpha any [
    a: 5 capitals any capitals b: non-alpha (
        b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b
    ) :b any non-alpha |
    some alpha any non-alpha
] to end]
This is the Parse group after all.
Graham
27-Jun-2006
[985]
ohhhh... shorter :)
BrianH
27-Jun-2006
[986]
; A few fixes
capitals: charset [#"A" - #"Z"]
alpha: charset [#"A" - #"Z" #"a" - #"z"]
non-alpha: complement alpha
; text: the string to mark up, modified in place (name assumed)
parse/all/case text [any non-alpha any [
    a: 5 capitals any capitals b: [non-alpha | end] (
        b: change/part a rejoin ["<strong>" copy/part a b "</strong>"] b
    ) :b any non-alpha |
    some alpha any non-alpha
] to end]
Graham
27-Jun-2006
[987]
capitals: charset [#"A" - #"Z"] ... no leading " inside the block
JaimeVargas
27-Jun-2006
[988]
I think I misunderstood the problem. I thought the emphasis 
was not for words all in uppercase, just the first char.
Graham
27-Jun-2006
[989]
Yeah ... it was a way to mark up text wherever a sequence of CAPS 
occurs