r3wp [groups: 83 posts: 189283]

World: r3wp

[Parse] Discussion of PARSE dialect

BrianH
19-Apr-2011
[5569]
(BREAK) breaks out of the whole parse, as does RETURN; BREAK breaks 
out of an iteration, which all enclosing rules are, so it will break 
out of any block, not just ANY, SOME and WHILE blocks.
onetom
26-Apr-2011
[5570]
what's the best practice for recognizing words with certain syntax? 
(besides using the various word variants)

i was thinking about using the datatype notation or similar pre-/postfix 
characters
Geomol
26-Apr-2011
[5571x3]
Can you give us examples of these words, you want to get recognized 
by parse? Then it may be easier to give opinions.
If you mean things like "word!", then this is a way:

alfa: charset [#"a" - #"z" #"A" - #"Z"]
numeric: charset [#"0" - #"9"]
alfa-numeric: union alfa numeric

parse "word!" [copy word [some alfa any alfa-numeric "!"] (probe word)]
Will also parse "word2!", but not "1word!".
onetom
26-Apr-2011
[5574x2]
i was not talking about matching string input. that's obvious
parse [some word! other* stuff']
Geomol
26-Apr-2011
[5576]
I'm not sure I understand, but maybe you mean something like:

>> parse [a few words of type word! "string" 1] [some word! string! integer!]
== true

Else it would be easier with an actual example.
onetom
26-Apr-2011
[5577x3]
this was an actual example.

i can match it with [some word!] but i would like to differentiate 
between the words based on what is their last character
i know it's suboptimal, just wondering if still possible somehow 
to reject a matching rule later somehow
parse [xxx*] [set w word! (unless #"*" = last to-string w [doesnt-match])]
Geomol
26-Apr-2011
[5580x2]
Ah ok, then you need to convert the word to a series, for example 
a string, and check on last letter:


parse [some word! other* stuff'] [some [set word word! (print last form word)]]
It's tricky to reject a rule later, and PARSE has changed over time, 
so I'm not sure.
onetom
26-Apr-2011
[5582]
ok, this is how i imagine a solution would look like:
>> parse [normal-word  word-with-star*] [word! starred-word!]
== true
Geomol
26-Apr-2011
[5583]
I would use string parsing in this case.
onetom
26-Apr-2011
[5584]
i would rather use a separate word as a modifier then. makes things 
a lot simpler. maybe i would do a pre-processing 1st to break up 
these words into separate ones
Maxim
26-Apr-2011
[5585]
is this in R2 or R3?
onetom
26-Apr-2011
[5586]
doesn't matter, since im just curious.
Maxim
26-Apr-2011
[5587x3]
in R2,  the best way is to set a word to the value you're evaluating, 
and conditional to that value's pass/fail, you switch the following 
rule to allow it to continue or track back, so it matches another 
rule.


here is a rather verbose example.  note that this can be tweaked 
to be a shorter rule, but it becomes hard to map out how the parts 
of the rules relate.. here, each part is clearly laid out.

rebol []


pass-rule: [none]
fail-rule: [thru end]

condition-rule: pass-rule



parse [
	word-a
	"ere"
	835
	word-b
	15
	word
	86
	bullshit
	#doglieru3
	word-c
][
	any [
		
		[
			; this rule only matches words ending with "-?"
			set val word! 
			[
				[
					(
						val: to-string val
						either #"-" = pick (back back tail val) 1  [
							condition-rule: pass-rule
						][
							condition-rule: fail-rule
						]
					)
					condition-rule
					(print ["PATTERN WORD:" val])
				]
				|[
					(print ["Arbitrary word: " val])
				]
			]
		]
		
		| skip
	]
]

ask ""
in R3 there are some new Parse ops which make this almost 
a one-liner
(I just don't have time to build you an example... :-p  )
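A minimal sketch of what such a shorter R3 rule might look like, assuming the R3 build provides the IF parse operation (which fails the current alternative when its paren evaluates to false or none). The input block and the word `starred` are made up for illustration and are not from the discussion.

```rebol
rebol []

starred: copy []   ; hypothetical collector for the words that pass the test

parse [alpha beta* gamma delta*] [
    any [
        ; take a word, then let IF reject it unless it ends with "*"
        set w word! if (#"*" = last form w) (append starred w)
        ; anything rejected above (or not a word at all) is skipped here
        | skip
    ]
]

probe starred   ; should show [beta* delta*]
```

When IF fails, PARSE backtracks to the start of the alternative, so the rejected word is consumed by the `skip` branch instead.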
Steeve
26-Apr-2011
[5590]
R3 but obfuscated.

match: [
	  some [thru #"-"] skip end 	(print [w "end with -?"])
	| some [thru #"*"] end		(print [w "end with *"])
]

parse  [
	word-a 	"ere" 	835 	word-b 	15
	word* 	w86	bullshit*	#doglieru3 	word-c
][
	some [
 		  and change set w word! [(form w)]
		  change [do into match | skip] w
		| skip
	]
]
onetom
26-Apr-2011
[5591]
hmm.. thanks a lot guys!
so practically i can fail in R2 by trying to match 'none?
Maxim
26-Apr-2011
[5592]
no, by trying to match  [thru end]
Maxim
27-Apr-2011
[5593]
none    *never* fails.
onetom
27-Apr-2011
[5594]
oh, so u can't go THRU end only TO end ?
Maxim
27-Apr-2011
[5595x2]
yep.
going thru end would break space-time, so it's not allowed by the 
interpreter    ;-)
Ladislav
27-Apr-2011
[5597]
"going thru end would break space-time" - it is allowed in R3, and 
there is, in fact, no reason for it to break anything. It is just 
about the implementation.
onetom
27-Apr-2011
[5598x2]
Ladislav: any other pass/fail technique in R2?
i mean dynamic pass/fail
Ladislav
27-Apr-2011
[5600x3]
Did you check the


http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse/Parse_expressions#Parse_idioms

article?
I guess, that the methods described in the idioms section can be 
used (but have not read the above discussion thoroughly).
[thru end] is not a good rule to use to fail. A much more reasonable 
rule is [end skip]
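A minimal sketch of a dynamic pass/fail switch built on [end skip] as the failing rule, following the same pattern as the verbose example earlier in the thread. The names `check` and `starred` and the input block are assumptions made for illustration.

```rebol
rebol []

pass-rule: [none]       ; matches nothing and always succeeds
fail-rule: [end skip]   ; never succeeds: once END matches there is nothing left to SKIP

check: pass-rule        ; hypothetical switch, reassigned inside the paren below
starred: copy []        ; hypothetical collector for the accepted words

parse [alpha beta* gamma delta*] [
    any [
        set w word!
        (check: either #"*" = last form w [pass-rule] [fail-rule])
        check                 ; looked up when reached, so it reflects the paren above
        (append starred w)
        | skip                ; a rejected word is skipped here after backtracking
    ]
]

probe starred   ; should show [beta* delta*]
```

Because [end skip] fails at every position, whether or not the input is at its end, it does not depend on how a given interpreter treats THRU END.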
onetom
27-Apr-2011
[5603]
i just read carl's article about the call for making that idiom page. 
but when i checked it didn't have much stuff yet. thx for the reminder!
Maxim
27-Apr-2011
[5604]
Lad   [thru end] means *exactly* the same thing as  [end skip].  
 I don't know why R3 decided to change that, but I find that a regression.
Ladislav
27-Apr-2011
[5605x4]
It does not, Max.

[thru end] is supposed to mean:

[end | skip]

, i.e. it fails in R2 only because of the faulty implementation
Err, correcting myself

a: [thru end] is supposed to mean the same as a: [end | skip a]
And that should never fail
See the above section.
Maxim
27-Apr-2011
[5609]
thru is supposed to move the cursor *past* a match,  you cannot go 
past the end, you can only be at the end (the same way you cannot 
go past the tail).
Ladislav
27-Apr-2011
[5610x3]
Where do you think the cursor is after matching the [end] rule?
Just try the idiom

a: [b | skip a]

and you will see, that it always means the same as

a: [thru b]
(no matter how many characters the B rule matches)
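A short sketch for trying that idiom side by side with THRU. The sub-rule `b` is given a hypothetical two-character value here, and the words `pos-idiom` and `pos-thru` are made up to record where each rule stops.

```rebol
rebol []

b: "cd"              ; hypothetical rule that matches two characters
a: [b | skip a]      ; the recursive idiom from the discussion

; record the position reached by each rule, then consume the rest of the input
parse "abcdef" [a pos-idiom: to end]
parse "abcdef" [thru b pos-thru: to end]

print [index? pos-idiom index? pos-thru]   ; should print 5 5
```

Both positions sit just past the matched "cd", that is, two characters beyond where the match started, not one.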
Maxim
27-Apr-2011
[5613]
TO doesn't match the end... it moves to it..  it's different from 
simply putting END in a rule.  THRU is supposed to move PAST the 
result of a TO.

>> parse/all "12345" [[to "5"] a: (probe a)]
"5"
== false
>> parse/all "12345" [[thru "5"] a: (probe a)]
""
== true

>> parse/all "12345" [[to end] a: (probe a)]
""
== true


so if I try to move past the end, it's logical that it raises a failure, 
since it cannot advance one more character.
Ladislav
27-Apr-2011
[5614]
That "advance one more character" is where you are wrong. The THRU 
directive has to stop after matching the rule, not "advance one more 
character".
Maxim
27-Apr-2011
[5615]
In this case there is no right or wrong, it's a question of opinion. 
 There is no *after* the end, as far as I am concerned.
Ladislav
27-Apr-2011
[5616x3]
Sorry, but it is not a question of opinion.
There may be just a correct implementation or a bug.
The fact, that there is no "advance one character" is quite obvious. 
Every rule matching advances as many characters as the rule being 
matched prescribes. For example, when matching

    parse "aaaaa" [before: "aaa" after: to end]
    index? before ; == 1
    index? after ; == 4


the rule matches a three character string and, therefore, the correct 
position after the match is three characters past the before position 
(not one character, as you incorrectly stated)