World: r3wp

[Parse] Discussion of PARSE dialect

Ladislav
1-May-2011
[5730x2]
This is a simple rule:

set w rule

sets the word 'w to the first value matched. No error.
It is quite obvious what the first value matched is.
onetom
1-May-2011
[5732x2]
so, no way to match a complex rule?
s/match/set
Ladislav
1-May-2011
[5734]
RULE might be complex, but what is so strange about setting 'w to 
the first value matched?
onetom
1-May-2011
[5735x2]
it's not transparent what is the 1st value if 'rule is defined somewhere 
else and not inlined
imagine, i define "my own type", like address!
BrianH
1-May-2011
[5737]
That is not the error I was talking about. This is the error:
>> parse [a b] [set w ['a 'b]]
== true
>> ? w                      
W is a word of value: a


It is the attempt to set the value to a complex rule that is the 
error. It wouldn't be an error to do this: parse [a b] [set w 'a 
'b]

If we keep the current behavior, there needs to be a lot of strongly 
worded warnings about the potential gotcha in the PARSE SET docs.
Ladislav
1-May-2011
[5738x3]
It does not matter where the RULE is defined. The first value matched 
is the value at the current position of the cursor, if the match 
occurs, that is.
As said, it does not matter what the RULE is. The first value is 
the first value.
You cannot find anywhere a formulation like "set the value to a rule". 
That is not what can happen.
onetom
1-May-2011
[5741]
address!: [string! tuple! issue!]
parse ["cat" 1.2.3 #4] [set addr1 address!]

If I'm reading this, address! looks just like a value reference...
BrianH
1-May-2011
[5742]
It could be considered a useful feature, since the whole match needs 
to match before the SET is performed. However, the docs need to be 
*extremely* precise about this because the "set the value to a rule" 
interpretation is a common misconception among newbie PARSE writers. 
It would be good for the docs to give an example of the type of code 
this allows you to do, explaining the difference.
onetom
1-May-2011
[5743x2]
>> parse [1 2] [set x [integer! issue!]] 
== false
>> x                                    
== 1


This is not what I would expect, for sure... what are the brackets for 
if they don't have any effect?...
it gives
** Script error: x has no value
in R3, though
Geomol
1-May-2011
[5745]
But we have COPY to do what you want, if I understand:

>> address!: [string! tuple! issue!]           
== [string! tuple! issue!]
>> parse ["cat" 1.2.3 #4] [copy addr1 address!]
== true
>> addr1
== ["cat" 1.2.3 #4]
BrianH
1-May-2011
[5746]
onetom, It's not "set the variable to a rule", it is instead "match 
a rule then set the variable to the value at the current position 
in the data".
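
A minimal sketch of the difference, for illustration only (the words 
rule, w and c are made up for this example): SET keeps only the first 
value of the matched part, while COPY keeps the whole matched part.

```rebol
rule: ['a 'b]

; SET: the whole rule must match, then w gets the first matched value
parse [a b] [set w rule]
probe w    ; a

; COPY: the whole rule must match, then c gets the entire matched part
parse [a b] [copy c rule]
probe c    ; [a b]
```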
Ladislav
1-May-2011
[5747]
as stated in the "Idioms" section, I think, that

    a: [set b c]

shall be equivalent to:

    f: [(set/any [b] if lesser? index? e index? d [e])]
    a: [and [c d:] e: f :d]
BrianH
1-May-2011
[5748x2]
By "It could be considered a useful feature" I mean that if this 
were not allowed, you would have to write this:
    parse data [set x ['a 'b]]
like this instead:
    parse data [and ['a 'b] set x skip skip]
Sorry, that's what Ladislav said, with different phrasing.
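
A sketch of that equivalence, assuming an R3 build where the AND 
look-ahead keyword is available (it is not part of R2 PARSE); data, x 
and y are illustrative names. Both rules should leave the same value 
in their variable:

```rebol
data: [a b]

; SET with a block rule: match ['a 'b], then set x to the first matched value
parse data [set x ['a 'b]]

; Spelled out: look ahead for ['a 'b] without advancing, set y to the
; value at the current position, then skip the second value
parse data [and ['a 'b] set y skip skip]

probe equal? x y
```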
Ladislav
1-May-2011
[5750]
Yes, Brian, you just wrote it in a simpler way
BrianH
1-May-2011
[5751x2]
I was thinking of phrasing for the examples in the docs to explain 
the SET feature.
Those full equivalences are great for someone who really needs to 
know how things work internally (such as when they need to clone 
PARSE), but you need simple examples first in docs for people who 
just want to use PARSE properly. Btw, has anyone started a set of 
full PARSE docs in DocBase? The parse project page could be raided 
for information, but it really doesn't serve as a full parse manual.
Ladislav
1-May-2011
[5753]
Do I understand correctly, that you did not read the


http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse

article yet?
BrianH
1-May-2011
[5754]
I haven't read the REBOL wikibook yet.
Ladislav
1-May-2011
[5755]
This is the formulation used to document the SET directive:


If the subexpression match succeeds, the set operation sets the given 
variable to the first matched value, while the copy operation copies 
the whole part of the input matched by the given subexpression. For 
a more detailed description see the Parse idioms section.
BrianH
1-May-2011
[5756]
Sounds accurate, if a little intimidating to newbies. I wish we had 
a really good PARSE manual that could turn newbies into experts.
Ladislav
1-May-2011
[5757]
Any newbies feeling intimidated by the formulation?
BrianH
1-May-2011
[5758]
It could be fine, depending on where you put it in the manual. The 
early parts would need to explain the terminology so the latter parts 
can use it.
Ladislav
1-May-2011
[5759]
Well, certainly it should be read, otherwise it is useless.
Geomol
1-May-2011
[5760]
PARSE in R2 seems to have less support for combined keywords than 
R3, as can be seen in this example:

>> parse [] [opt some 'a]
** Script Error: Invalid argument: some

But there is no error, when combining OPT and THRU:

>> parse [] [opt thru 'a]
== false


Should that trigger an error? If no error, it should return true, 
right?
Ladislav
1-May-2011
[5761x2]
No and no.
opt thru 'a

means the same as

[opt thru] 'a
BrianH
1-May-2011
[5763x2]
When R3's TO and THRU should trigger an error, most of the time they 
just don't match instead, for no apparent reason. There's at least 
one ticket for that.
Ladislav, why would [opt thru 'a] not mean [opt [thru 'a]]? Isn't 
that the definition of opt?
Geomol
1-May-2011
[5765x2]
Ladislav, but shouldn't then [opt some 'a] mean [[opt some] 'a] ? 
It gives an error in R2.
And by that I mean [opt some] might mean [opt [some none]]. :-) It's 
hard, this!
Ladislav
1-May-2011
[5767]
aha, sorry, the combinations of keywords like opt thru are undocumented, 
and the behaviour really is unexpected
Geomol
1-May-2011
[5768x3]
Ok, makes sense. Problem is probably too little documentation (design) 
in the details in the first place.
I think I'll make those trigger errors in my function version of PARSE. 
Almost done (in the first version, similar to the R2 version).
Kinda same problem, when combining them this way (still in R2):

>> parse [] [thru opt 'a]
== false
>> parse [a] [thru opt 'a]
== false
BrianH
1-May-2011
[5771]
In R3 that would be an improperly untriggered error, since TO and 
THRU are defined to not take the full gamut of rules, only a subset. 
Probably the same for R2, but a different subset.
Geomol
1-May-2011
[5772x3]
Ok, version 1.0.0 of BPARSE is found here:

http://www.fys.ku.dk/~niclasen/rebol/libs/bparse.r


It's a function version of PARSE, and can only parse blocks for now, 
not strings. It can do more or less everything PARSE in R2 can do when 
parsing blocks. I've tried to trigger errors where R2 PARSE doesn't. The 
purpose is to play around with parsing, to maybe make a better version 
than the native one, and without bugs.
It's not as fast as the timings I gave here earlier with a very 
early version.
I've thought some more about [thru end], which returns false in the 
R2 version but returns true in R3. My version returns false, as R2 does, 
but I better understand the R3 way now that I've programmed it. It comes 
down to how THRU should be understood (Ladislav also said something 
about this). Do we think of

[thru rule]
as
[to rule rule] or
[to rule skip]

? If the TO keyword can handle complex rules like:
parse [a b] [to ['a 'b] ['a 'b]]

then the first might make better sense, and [thru end] should return 
true. But we can't parse like that in R2, so maybe we think more 
of it as the second, and then [thru end] should return false. But 
if you look in my version, I have to catch this special case, where 
END follows THRU, so it takes more code, which isn't good.


In any case, Ladislav's suggestion to use [end skip] as a fail rule 
is much better. If you're not at the end, the first word (end) will 
give false, else the next will fail, as you can't skip past end.
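
A small sketch of the [end skip] idiom; the word never is a 
hypothetical name chosen for this example:

```rebol
never: [end skip]    ; hypothetical name for the always-failing rule

print parse [] never          ; false: END matches, but SKIP cannot pass the end
print parse [a] never         ; false: END does not match before the tail
print parse [a] [never | 'a]  ; true: the alternative after | is tried instead
```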
BrianH
1-May-2011
[5775]
END is a zero-length repeatable rule, like NONE, so TO END and THRU 
END should be equivalent.
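
For illustration, a sketch of the cases discussed, with the R2/R3 
difference taken from the reports above rather than verified here:

```rebol
; TO END moves to the tail and END matches there: true in R2 and R3
print parse [a b] [to end]

; THRU END is where the two differ: false in R2, true in R3, as reported above
print parse [a b] [thru end]

; The [to rule rule] reading of THRU, written out for a simple rule
print parse [x a] [to 'a 'a]    ; true
print parse [x a] [thru 'a]     ; true
```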
Geomol
1-May-2011
[5776]
Makes sense; it's just hard to grasp when you're used to how R2 PARSE works.
BrianH
1-May-2011
[5777]
I'd consider that an error in R2's PARSE, but not a fixable one because 
it would change the semantics.
Geomol
1-May-2011
[5778]
Now you have your own function version of parse, that you can make 
work exactly as you wish. :-) And then maybe, when you're satisfied, 
give it to Carl.


It should now also be easier to make C versions of PARSE for those 
who make alternatives to REBOL. At least you have a REBOL function 
to start with.
Maxim
1-May-2011
[5779]
did you try it with complex rules?