World: r3wp

[Parse] Discussion of PARSE dialect

swall 27-Mar-2009 [3622]	Steeve: that seems to have done it. thanks for clarifying.
Gabriele 28-Mar-2009 [3623]	or use #[none] instead
Pavel 29-Mar-2009 [3624]	Gabriele, what does #[none] really do/mean? I've seen it a few times without having a clue about what it does.
Henrik 29-Mar-2009 [3625x2]	Pavel, try: mold/all none
Henrik 29-Mar-2009 [3625x2]	it's just a serialized version of none!, so you can load it as a real none value instead of a word.
[unknown: 5] 29-Mar-2009 [3627]	Pavel, this also works with datatypes. For example: >> mold/all string! == "#[datatype! string!]" This is useful if you're loading values from a file. This way you're sure to set a value to a string datatype! when desired.
Gabriele 31-Mar-2009 [3628]	#[none] is the value of the word 'none. It is the literal representation of the value of type none!.
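The difference between the word none and the serialized value #[none] described above can be seen in a short console-style sketch. The block name `values` is only illustrative and does not come from the discussion.

```rebol
; Plain MOLD gives the word form, MOLD/ALL gives the serialized form.
print mold none          ; prints: none
print mold/all none      ; prints: #[none]

; Inside an unevaluated block, NONE is just a word,
; while #[none] is already a value of type none!.
values: [none #[none]]   ; hypothetical example block
print type? first values     ; prints: word
print type? second values    ; prints: none
```

This matters mostly for data that is loaded but never evaluated, since the word none only becomes the none! value once it is evaluated.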
Pavel 31-Mar-2009 [3629]	THX for description to all
Janko 15-Apr-2009 [3630]	Hi, I have one question .. can you somehow break out of some loop by rebol code .. for example parse [ aa zzz cc ] [ some [ set W word! ( ?? W if equal? W 'zzz [ break ] ) ] ] ... that break doesn't work that way, but is there some way to do this? I need to compare W with a runtime value
Graham 15-Apr-2009 [3631]	throw an error?
Janko 15-Apr-2009 [3632]	I solved it by just returning out of the whole function (with return) at that point, so it's OK. At first I had it thought out in a way that I would need to exit the some [ ] loop but continue parsing; an error probably wouldn't work that way either? This is my code now:
	match: func [ data rules ] [ parse rules [ SOME [ set L lit-word! ( either equal? L reduce first data [ data: next data ] [ return false ] ) | set W word! ( set :W first data data: next data ) ] ] ]
Ammon 16-Apr-2009 [3633]	; Here's one way to do it...
	>> digit: charset "1234567890"
	== make bitset! #{
	000000000000FF03000000000000000000000000000000000000000000000000
	}
	>> rule: [s: some digit e: (print copy/part s e) | h: #"a" (h: tail h) :h | skip ]
	== [s: some digit e: (print copy/part s e) | h: #"a" (h: tail h) :h | skip]
	>> parse "12b34c56a78" [any rule]
	12
	34
	56
	== true
Dockimbel 16-Apr-2009 [3634]	Another possible way is to set a [break] rule at runtime:
	branch-rule: [ ]
	parse [ aa zzz cc ] [ some [ set W word! ( ?? W if equal? W 'zzz [ branch-rule: [ break ] ] ) branch-rule ] ]
Janko 16-Apr-2009 [3635]	Ah, thanks Ammon and Dockimbel! I hadn't thought of these two ways (well, I don't yet fully understand Ammon's)
shadwolf 16-Apr-2009 [3636x5]	charset creates a "mask" in bitset form, to be compared with the current item read from the string
	some digit: since digit is a bitset containing the binary image of what you're looking for (number chars from 1 to
	that means each character of the string is compared to the mask, and if it matches, you proceed to the calculation
	the lame equivalent would be something like foreach a string [ either find "1234567890" a [ append e a ][probe e clear e ] ]
	so Ammon's solution using charset / bitset and parse is the totally rebolish way
[unknown: 5] 16-Apr-2009 [3641]	parse [aa zzz cc][some [set w word! (?? w cont: if w = 'zzz [[end skip]]) cont]]
Ammon 17-Apr-2009 [3642x2]	Essentially what I'm doing with the above code is simply skipping to the end of the parse input when a given rule is matched. This works because a get-word in the parse rules sets the current parse input. The get-word can be any value of the same type as the original parse input. You can't set the parse input to a string! if a block! was provided to parse to start with.
Ammon 17-Apr-2009 [3642x2]	Using your code to do the same thing...
	match: func [ data rules ] [ parse rules [ SOME [ set L lit-word! blk: ( either equal? L reduce first data [ data: next data ] [ blk: tail blk ] ) :blk | set W word! ( set :W first data data: next data ) ] ] ]
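Applied to the original question (stop a some loop when a word equals a runtime value), the get-word technique could be sketched as follows. The names `stop-word`, `seen` and `pos` are made up for this illustration and are not from the thread.

```rebol
stop-word: 'zzz        ; hypothetical runtime value to compare against
seen: copy []          ; collects the words visited before stopping

parse [aa zzz cc] [
    some [
        set w word!
        pos:                           ; remember the position after the matched word
        (
            append seen w
            if equal? w stop-word [pos: tail pos]   ; jump target: end of input
        )
        :pos                           ; continue parsing from POS
    ]
]

probe seen             ; == [aa zzz]
```

Once the input position is at the tail, the next `set w word!` fails and the some loop ends; because the input is then fully consumed, parse itself still returns true.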
Graham 23-Apr-2009 [3644]	I'd like to take an english sentence and tidy it up. I want to automatically apply english grammar to it ... so capitalize the first letter after a period, and remove extraneous spaces eg. a comma after a space. Anyone done anything like this with 'parse?
Ammon 24-Apr-2009 [3645]	Not yet but I've been thinking about it for quite a while now... I think I have a pretty good idea what the parse rules should look like but I haven't written any code for it yet.
Steeve 24-Apr-2009 [3646]	Good start...
	letter: charset [#"a" - #"z" #"A" - #"Z"]
	dirt: complement letter
	word: [some letter]
	clean: [here: dirt :here (remove here)]
	space: [here: (insert here #" ") skip]
	capital: [here: letter (uppercase/part here 1)]
	sentence: [
	    some [ capital opt word break | clean ]
	    any [
	        [#";" | #","] any clean space word
	        | #"." any clean space capital opt word
	        | #" " word
	        | clean
	    ]
	]
	parse/all text: {test test . test;; test ..test } sentence
	probe text
	>>"Test test. Test; test. Test"
Janko 24-Apr-2009 [3647x2]	I have made auto capitalising first words for some bot once .. it wasn't anything special , I can find the code and send it to you
Janko 24-Apr-2009 [3647x2]	ah, Steeve's already works
Steeve 24-Apr-2009 [3649]	It has to be enhanced, indeed
Graham 24-Apr-2009 [3650]	Hey, nice start ...
Steeve 24-Apr-2009 [3651]	indeed, i'm nice
Graham 24-Apr-2009 [3652x2]	:)
Graham 24-Apr-2009 [3652x2]	have to add #"'" ie. ' to the letter charset
Steeve 24-Apr-2009 [3654x2]	#"-" too, and what about the numbers?
Steeve 24-Apr-2009 [3654x2]	for #"'" you should add a rule to remove spaces
Janko 24-Apr-2009 [3656]	Mine was meant for making pretty text out of all-upper-case text in some search engine.. maybe it doesn't work that great in all cases..
	smart-uc-after: func [ str sep ] [ parse str [ ANY [ thru sep mark: ( uppercase/part trim mark 1 insert mark " " ) :mark ] ] str ]
	smart-case: func [ str ] [ calc-with X [ [ lowercase str ] [ uppercase/part X 1 ] [ smart-uc-after X "." ] [ smart-uc-after X "?" ] [ smart-uc-after X "!" ] ]]
	>> smart-case "HI HOW ARE YOU! we will go. bye!"
	== "Hi how are you! We will go. Bye! "
Graham 24-Apr-2009 [3657]	numbers aren't usually part of words. Unless it's trademark like 3M
Janko 24-Apr-2009 [3658x2]	but mine is also worse because it does 3 parses instead of one like Steeve
Janko 24-Apr-2009 [3658x2]	calc-with: func [ 'wrd bs ] [ foreach b bs [ set wrd do b ] ] ; it uses this func also
Graham 24-Apr-2009 [3660]	Steeve's looks faster :)
Janko 24-Apr-2009 [3661]	yes, I agree :)
Steeve 24-Apr-2009 [3662x4]	this is the rule for #"-" | #"'" any clean word
	with that you suppress unwanted spaces. it' s a good day --> "it's a good day"
	so don't add #"'" as a valid letter
Graham 24-Apr-2009 [3666]	ahh ...
Steeve 24-Apr-2009 [3667]	do as you want... :-)
Graham 24-Apr-2009 [3668x2]	trailing "." or "," gets lost
Graham 24-Apr-2009 [3668x2]	Also, I think I have to add ' to the letter charset because words ending in s can have a trailing ' for possession ...
Steeve 24-Apr-2009 [3670]	but what if they have inserted a space after or before '
Graham 24-Apr-2009 [3671]	so, Miles' wallet and not Miles's wallet