World: r3wp

[Parse] Discussion of PARSE dialect

Geomol 26-Apr-2011 [5583]	I would use string parsing in this case.
onetom 26-Apr-2011 [5584]	i would rather use a separate word as a modifier then. makes things a lot simpler. maybe i would do a pre-processing 1st to break up these words into separate ones
Maxim 26-Apr-2011 [5585]	is this in R2 or R3?
onetom 26-Apr-2011 [5586]	doesn't matter, since im just curious.
Maxim 26-Apr-2011 [5587x3]	in R2, the best way is to set a word to the value you're evaluating and, depending on whether that value passes or fails, switch the following rule so that it either continues or backtracks to match another rule. here is a rather verbose example. note that this can be tweaked into a shorter rule, but then it becomes hard to map out how each part of the rules relates. here, each part is clearly laid out.
	rebol []
	pass-rule: [none]
	fail-rule: [thru end]
	condition-rule: pass-rule
	parse [
	    word-a "ere" 835 word-b 15 word 86 bullshit #doglieru3 word-c
	][
	    any [
	        [
	            ; this rule only matches words ending with "-?"
	            set val word!
	            [
	                [
	                    (
	                        val: to-string val
	                        either #"-" = pick (back back tail val) 1 [
	                            condition-rule: pass-rule
	                        ][
	                            condition-rule: fail-rule
	                        ]
	                    )
	                    condition-rule
	                    (print ["PATTERN WORD:" val])
	                ]
	                | [
	                    (print ["Arbitrary word: " val])
	                ]
	            ]
	        ]
	        | skip
	    ]
	]
	ask ""
	in R3 there are some new Parse ops which allow to make this almost a one-liner
	(I just don't have time to build you an example... :-p )
Steeve 26-Apr-2011 [5590]	R3 but obfuscated.
	match: [
	    some [thru #"-"] skip end (print [w "end with -?"])
	    | some [thru #"*"] end (print [w "end with *"])
	]
	parse [
	    word-a "ere" 835 word-b 15 word* w86 bullshit* #doglieru3 word-c
	][
	    some [
	        and change set w word! [(form w)]
	        change [do into match | skip] w
	        | skip
	    ]
	]
onetom 26-Apr-2011 [5591]	hmm.. thanks a lot guys! so practically i can fail in R2 by trying to match 'none?
Maxim 26-Apr-2011 [5592]	no, by trying to match [thru end]
Maxim 27-Apr-2011 [5593]	none never fails.
onetom 27-Apr-2011 [5594]	oh, so u can't go THRU end only TO end ?
Maxim 27-Apr-2011 [5595x2]	yep.
Maxim 27-Apr-2011 [5595x2]	going thru end would break space-time, so it's not allowed by the interpreter ;-)
Ladislav 27-Apr-2011 [5597]	"going thru end would break space-time" - it is allowed in R3, and in fact there is no reason for it to break anything. It is just a matter of the implementation.
onetom 27-Apr-2011 [5598x2]	Ladislav: any other pass/fail technique in R2?
onetom 27-Apr-2011 [5598x2]	i mean dynamic pass/fail
Ladislav 27-Apr-2011 [5600x3]	Did you check the http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse/Parse_expressions#Parse_idioms article?
	I guess the methods described in the idioms section can be used (but I have not read the above discussion thoroughly).
	[thru end] is not a good rule to use to fail. A much more reasonable rule is [end skip]
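A minimal sketch of the dynamic pass/fail technique from this thread, using [end skip] as the failing rule instead of [thru end]. The input block and the odd? test are illustrative assumptions, not taken from the discussion. The sketch relies only on NONE always succeeding and on [end skip] never matching, which should hold in both R2 and R3.

```rebol
rebol []

pass-rule: [none]        ; NONE always succeeds and consumes no input
fail-rule: [end skip]    ; END succeeds only at the tail, where SKIP has nothing left to consume

condition-rule: pass-rule

parse [1 2 3 4 5] [
    any [
        set val integer!
        ; pick the rule that the next step will try to match
        (condition-rule: either odd? val [pass-rule] [fail-rule])
        condition-rule
        (print ["odd:" val])
        | skip    ; the first alternative failed: backtrack and step over the value
    ]
]
```

Run in the console, this should print a line for 1, 3 and 5 only; the even values fail at condition-rule and fall through to the skip alternative.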
onetom 27-Apr-2011 [5603]	i just read Carl's article with the call for making that idiom page, but when i checked, it didn't have much stuff yet. thx for the reminder!
Maxim 27-Apr-2011 [5604]	Lad [thru end] means exactly the same thing as [end skip]. I don't know why R3 decided to change that, but I find that a regression.
Ladislav 27-Apr-2011 [5605x4]	It does not, Max. [thru end] is supposed to mean [end | skip], i.e. it fails in R2 only because of the faulty implementation
	Err, correcting myself: a: [thru end] is supposed to mean the same as a: [end | skip a]
	And that should never fail
	See the above section.
Maxim 27-Apr-2011 [5609]	thru is supposed to move the cursor past a match, you cannot go past the end, you can only be at the end (the same way you cannot go past the tail).
Ladislav 27-Apr-2011 [5610x3]	Where do you think the cursor is after matching the [end] rule?
	Just try the idiom a: [b | skip a] and you will see that it always means the same as a: [thru b]
	(no matter how many characters the B rule matches)
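The equivalence can be checked with a short sketch. The input string and the names pos-a, pos-thru and a-end are illustrative assumptions, not part of the discussion; a literal "c" stands in for the B rule.

```rebol
rebol []

; recursive idiom: match "c" here, or skip one character and try again
a: ["c" | skip a]

parse "abcde" [a pos-a: to end]
parse "abcde" [thru "c" pos-thru: to end]

print index? pos-a       ; 4
print index? pos-thru    ; 4

; the same idiom with END in place of "c": it stops at the tail and never fails
a-end: [end | skip a-end]
print parse "abcde" [a-end]    ; true
```

Both positions land just past the matched "c". According to the discussion, the built-in [thru end] fails in R2 but is accepted in R3, whereas the spelled-out idiom should succeed in either.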
Maxim 27-Apr-2011 [5613]	to doesn't match the end... it moves to it.. it's different from simply putting end in a rule. thru is supposed to move PAST the result of a to.
	>> parse/all "12345" [[to "5"] a: (probe a)]
	5
	== false
	>> parse/all "12345" [[thru "5"] a: (probe a)]
	== true
	>> parse/all "12345" [[to end] a: (probe a)]
	== true
	so if I try to move past the end, it's logical that it raises a failure, since it cannot advance one more character.
Ladislav 27-Apr-2011 [5614]	That "advance one more character" is where you are wrong. The THRU directive has to stop after matching the rule, not "advance one more character".
Maxim 27-Apr-2011 [5615]	In this case there is no right or wrong, its a question of opinion. There is no after the end, as far as I am concerned.
Ladislav 27-Apr-2011 [5616x4]	Sorry, but it is not a question of opinion.
	There may be just a correct implementation or a bug.
	The fact that there is no "advance one character" is quite obvious. Every rule match advances as many characters as the rule being matched prescribes. For example, when matching
	parse "aaaaa" [before: "aaa" after: to end]
	index? before ; == 1
	index? after ; == 4
	the rule matches a three-character string and, therefore, the correct position after the match is three characters past the before position (not one character, as you incorrectly stated)
	While we are at it, the following case reveals a PARSE implementation bug:
	parse "aaaaa" ["" to end] ; == false
	since the empty string should match.
Maxim 27-Apr-2011 [5620x2]	first of all, your before after parse rule has nothing to do with the to/thru handling. yes, this second rule shows an actual bug... "" and none rules are conceptually equivalent.
Maxim 27-Apr-2011 [5620x2]	to/thru are not matching rules, they are skipping rules. matching rules always return past the match. not at the match, like to will do.
Ladislav 27-Apr-2011 [5622x3]	OK, checking the situation of the NONE rule:
	parse "aaaaa" [before: none after: to end]
	index? before ; == 1
	index? after ; == 1
	So, the parse position before and after the match remained the same, not "one character past the match"
	So, generally, the positions before the match and after the match can differ, optionally, but the difference is not prescribed to be exactly one character.
Maxim 27-Apr-2011 [5625x2]	no, the cursor did not advance. there are two concepts at play here... the notion of index, and the notion of "slots". a single slot has two positions, its start and end, but it has only one index.
Maxim 27-Apr-2011 [5625x2]	a better name for slot, probably is segment... just like in video editing.
Ladislav 27-Apr-2011 [5627]	Now, check the PARSE documentation at: http://www.rebol.com/r3/docs/concepts/parsing-summary.html and then we can continue the discussion
Maxim 27-Apr-2011 [5628]	a matching rule, will expand the segment's area, but not its index. rules are stacked based on end-to segments. if a rule has a segment of size 0 (as in the none rule) there is no index change in the next rule segment. i.e. it shares its index since its index is previous index + 0
Ladislav 27-Apr-2011 [5629x3]	"to/thru are not matching rules" - sorry, once again, an incorrect opinion. They can be used to match the input like every other PARSE rule.
	...and they can either succeed or fail
	Exactly like the idiom a: [b | skip a], which you did not even try
Maxim 27-Apr-2011 [5632]	well... to/thru are listed alongside skip under "skipping input" in both r2 and r3 docs... they cannot use sub rules, since they only do a find on the input.