r3wp [groups: 83 posts: 189283]
World: r3wp

[Parse] Discussion of PARSE dialect

swall
27-Mar-2009
[3622]
Steeve: that seems to have done it. thanks for clarifying.
Gabriele
28-Mar-2009
[3623]
or use #[none] instead
Pavel
29-Mar-2009
[3624]
Gabriele, what does #[none] really do/mean? I've seen it a few times 
without having a clue about what it does.
Henrik
29-Mar-2009
[3625x2]
Pavel, try:

mold/all none
it's just a serialized version of none!, so you can load it as a 
real none value instead of a word.
[unknown: 5]
29-Mar-2009
[3627]
Pavel, this also works with datatypes.  For example:

>> mold/all string!
== "#[datatype! string!]"


This is useful if you're loading values from a file.  This way you're 
sure to set a value to the string! datatype when desired.
Gabriele
31-Mar-2009
[3628]
#[none] is the value of the word 'none. It is the literal representation 
of the value of type none!.
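
A minimal sketch of the difference described above, assuming REBOL 2 LOAD semantics; the names word-form and value-form are only illustrative:

```rebol
; Plain text "none" loads as a word; the serialized form loads as the value.
word-form:  first load "[none]"      ; the word none, not yet evaluated
value-form: first load "[#[none]]"   ; an actual value of type none!
probe type? word-form                ; word!
probe type? value-form               ; none!
```

This matters when data is loaded but never evaluated: the word form would still have to be reduced before it behaves like none.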
Pavel
31-Mar-2009
[3629]
Thanks to all for the explanation.
Janko
15-Apr-2009
[3630]
Hi, I have one question: can you somehow break out of a SOME loop 
from REBOL code? For example:

parse [ aa zzz cc ]  [ some [ set W word! ( ?? W if equal? W 'zzz [ break ] ) ] ]

That break doesn't work that way, but is there some way to do this? 
I need to compare W with a runtime value.
Graham
15-Apr-2009
[3631]
throw an error?
Janko
15-Apr-2009
[3632]
I solved it in a way that I can just return out of the whole function 
(with return) at that point, so it's OK. At first I had thought it 
out in a way that I would need to exit the some [ ] loop but continue 
parsing; an error probably wouldn't work that way either? This is 
now my code:

match: func [ data rules ] [
	parse rules [ 
		SOME [
			set L lit-word! ( either equal? L reduce first data [ data: next data ] [ return false ] ) | 
			set W word! ( set :W first data  data: next data ) 
		] 
	]
]
Ammon
16-Apr-2009
[3633]
; Here's one way to do it...

>> digit: charset "1234567890"
== make bitset! #{
000000000000FF03000000000000000000000000000000000000000000000000
}

>> rule: [s: some digit e: (print copy/part s e) | h: #"a" (h: tail h) :h | skip ]
== [s: some digit e: (print copy/part s e) | h: #"a" (h: tail h) :h | skip]
>> parse "12b34c56a78" [any rule]
12
34
56
== true
Dockimbel
16-Apr-2009
[3634]
Another possible way is to set a [break] rule at runtime:

branch-rule: [ ]

parse [ aa zzz cc ]  [ 
	some [ 
		set W word! ( 
			?? W
			if equal? W 'zzz [ branch-rule: [ break ] ]
		)
		branch-rule
	]
]
Janko
16-Apr-2009
[3635]
Ah, thanks Ammon and Dockimbel! I hadn't thought of these two ways 
(well, I don't yet fully understand Ammon's)
shadwolf
16-Apr-2009
[3636x5]
charset creates a "mask" in bitset form to be compared to the current 
item read from the string
some digit: since digit is a bitset containing the binary image of 
 what you are looking for (numbers char from 1 to
that means each character of the string will be compared to the mask, 
and if it matches then you proceed to the calculation
the lame equivalent would be something like foreach a string [ either 
find "1234567890" a [ append e a ][probe e clear e ] ]
so Ammon's solution using charset / bitset and parse is the totally 
REBOLish way
[unknown: 5]
16-Apr-2009
[3641]
parse [aa zzz cc][some [set w word! (?? w cont: if w = 'zzz [[end 
skip]]) cont]]
Ammon
17-Apr-2009
[3642x2]
Essentially, what I'm doing with the above code is simply skipping 
to the end of the parse input when a given rule is matched. This 
works because a get-word in the parse rules sets the current parse 
input position.  The get-word can refer to any value of the same type 
as the original parse input.  You can't set the parse input to a 
string! if a block! was provided to parse to start with.
Using your code to do the same thing...

match: func [ data rules ] [
	parse rules [ 
		SOME [
			set L lit-word! blk: ( either equal? L reduce first data [ data: next data ] [ blk: tail blk ] ) :blk | 
			set W word! ( set :W first data  data: next data ) 
		] 
	]
]
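
The get-word jump used above can be seen in isolation in a smaller case. This is only a sketch, assuming REBOL 2 PARSE semantics; pos is an illustrative name:

```rebol
; pos: records the current input position, the paren moves the
; recorded position to the tail, and :pos makes PARSE resume there.
probe parse "abc" [pos: (pos: tail pos) :pos]   ; true, input is now at its end
probe parse "abc" [pos: skip]                   ; false, "bc" was never consumed
```

Because PARSE returns true only when the whole input has been consumed, jumping to the tail both stops the loop and makes the overall parse succeed.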
Graham
23-Apr-2009
[3644]
I'd like to take an English sentence and tidy it up.  I want to automatically 
apply English grammar to it ... so capitalize the first letter after 
a period, and remove extraneous spaces, e.g. a space before a comma. 
 Has anyone done anything like this with 'parse?
Ammon
24-Apr-2009
[3645]
Not yet but I've been thinking about it for quite a while now... 
 I think I have a pretty good idea what the parse rules should look 
like but I haven't written any code for it yet.
Steeve
24-Apr-2009
[3646]
Good start...

letter: charset [#"a" - #"z" #"A" - #"Z"]
dirt: complement letter
word: [some letter]
clean: [here: dirt :here (remove here)]
space: [here: (insert here #" ") skip]
capital: [here: letter (uppercase/part here 1)]
sentence: [
	some [
		  capital opt word break
		| clean
	]
	any [
		  [#";" | #","] any clean space word
		| #"." any clean space capital opt word
		| #" " word
		| clean
	]
]

parse/all text: {test  test . test;; test ..test } sentence
probe text
>>"Test test. Test; test. Test"
Janko
24-Apr-2009
[3647x2]
I have made auto capitalising first words for some bot once .. it 
wasn't anything special , I can find the code and send it to you
ah, Steeve's already works
Steeve
24-Apr-2009
[3649]
It has to be enhanced, indeed
Graham
24-Apr-2009
[3650]
Hey, nice start ...
Steeve
24-Apr-2009
[3651]
indeed, i'm nice
Graham
24-Apr-2009
[3652x2]
:)
have to add #"'" ie. ' to the letter charset
Steeve
24-Apr-2009
[3654x2]
#"-" too, and what about the numbers?
for #"'" you should add a rule to remove spaces
Janko
24-Apr-2009
[3656]
Mine was meant so I could make pretty texts out of all-upper-case ones 
in some search engine.. maybe it doesn't work that great in all cases..

smart-uc-after: func [ str sep ] [
	parse str [ ANY [ thru sep mark: ( uppercase/part trim mark 1 insert mark " " ) :mark ] ]
	str
] 

smart-case: func [ str ] [
	calc-with X [ 	
		[ lowercase str ]
		[ uppercase/part X 1 ]
		[ smart-uc-after X "." ]
		[ smart-uc-after X "?" ]
		[ smart-uc-after X "!" ]
]]
>> smart-case "HI HOW ARE YOU! we will go. bye!"
== "Hi how are you! We will go. Bye! "
Graham
24-Apr-2009
[3657]
numbers aren't usually part of words.  Unless it's trademark like 
3M
Janko
24-Apr-2009
[3658x2]
but mine is also worse because it does 3 parses instead of one like 
Steeve's
calc-with: func [ 'wrd bs ] [  foreach b bs [ set wrd do b ] ] ; it uses this func also
Graham
24-Apr-2009
[3660]
Steeve's looks faster :)
Janko
24-Apr-2009
[3661]
yes, I agree :)
Steeve
24-Apr-2009
[3662x4]
this is the rule for #"'" 
| #"'" any clean word
with that you suppress unwanted spaces.
it'  s a good day
 --> "it's a good day"
so don't add #"'" as a valid letter
Graham
24-Apr-2009
[3666]
ahh ...
Steeve
24-Apr-2009
[3667]
do as you want... :-)
Graham
24-Apr-2009
[3668x2]
trailing "." or "," gets lost
Also, I think I have to add ' to the letter charset because words ending 
in s can have a trailing ' for possession ...
Steeve
24-Apr-2009
[3670]
but what if they have inserted a space after or before '
Graham
24-Apr-2009
[3671]
so, Miles' wallet and not Miles's wallet