World: r3wp

[Parse] Discussion of PARSE dialect

Pekr
2-Oct-2009
[4371x4]
simply put - you replace the original string, right? You put the new one 
into the string being replaced. If the new one is shorter, 
then only that length is replaced. If the new one is longer, 
then the string is shifted/extended.
But change/part just tells you how many chars you replace, no? One moment ...
>> change/part s: "(1)" "(2222222)"  4 s
== "(2222222)"
hmm ...
Steeve
2-Oct-2009
[4375]
change must replace the complete matched rule, it has nothing to 
do with the quality (length) of the replacement data.
I'm sure it's a bug
Pekr
2-Oct-2009
[4376x3]
then above R2 example is buggy too ...
/part -- Limits the amount to change to a given length or position.
         range -- (Type: number series port pair)
I thought that /part is there to allow us to set limit on number 
of chars being changed. Why in above case the original string got 
extended, when I limited it to 4 chars?
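A minimal sketch of the distinction behind this question, assuming the function-level CHANGE of REBOL 2 (the strings are made up for illustration): /part limits how much of the original series is replaced, not how much new data goes in, which is why a longer replacement still extends the string.

```rebol
REBOL []

; Plain CHANGE overwrites as many elements as the new value has.
s1: copy "abcdef"
change s1 "XYZ"
probe s1    ; "XYZdef"

; /PART limits how much of the ORIGINAL is replaced, not how much is inserted.
s2: copy "abcdef"
change/part s2 "XYZ" 1      ; 1 char out, 3 chars in: the string grows
probe s2    ; "XYZbcdef"

s3: copy "abcdef"
change/part s3 "X" 3        ; 3 chars out, 1 char in: the string shrinks
probe s3    ; "Xdef"
```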
Steeve
2-Oct-2009
[4379]
We are in PARSE here, it has nothing to do with the behavior of CHANGE 
in normal rebol code
Pekr
2-Oct-2009
[4380x3]
Well, I might be confused as well ... not using 'change much ...
why do you repeat it? I try to point out, that it might share internal 
representation, and that might be buggy?
ah, it might be good. I don't know ...
Steeve
2-Oct-2009
[4383]
in parse, CHANGE is a shortcut for REMOVE INSERT.

>> parse s: "(1)" [change "(1)" "()"] ?? s
s: "())"

Should give the same result as:

>> parse s: "(1)" [remove "(1)" insert "()"] ?? s
s: "()"
Pekr
2-Oct-2009
[4384x2]
It clearly behaves as 'change func ...
I have a headache finding out how 'change behaves in REBOL itself. 
Now if the parse version is supposed to behave differently again, 
then I am completely lost without the trial and error approach in the console, 
and it totally sucks ...
Steeve
2-Oct-2009
[4386]
it's not behaving the same way, if it was the same, we would not 
have this difference:

>> parse s: "(1)." [change "(1)" "(11)"] ?? s
s: "(11)."

>> head change "(1)." "(11)"
== "(11)"

In parse it's a change/part that is performed
Pekr
2-Oct-2009
[4387x2]
And as such, is correct, no?
>> head change/part "(1)." "(11)" 3
== "(11)."
Steeve
2-Oct-2009
[4389]
yep a change/part not a simple change
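The equivalence discussed here can be sketched outside of PARSE with ordinary series functions. This is only an illustration of the intended result (replace the whole matched span, whatever the length of the replacement), not a description of how PARSE is implemented.

```rebol
REBOL []

; Replace the whole 3-char match "(1)" with a longer value.
s1: copy "(1)."
change/part s1 "(11)" 3
probe s1    ; "(11)."

; The same edit spelled as REMOVE followed by INSERT.
s2: copy "(1)."
insert remove/part s2 3 "(11)"
probe s2    ; "(11)."

; Shorter replacement: nothing of the old match is left behind.
s3: copy "(1)"
change/part s3 "()" 3
probe s3    ; "()"
```

The third case is the result the PARSE version of CHANGE would be expected to produce for the same input.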
Pekr
2-Oct-2009
[4390x2]
damned, altme playing on my nerves ... another message lost ....
You are right, it is most probably a bug:

>> head change/part "(1)" "()" 3
== "()"

>> parse s: "(1)" [change "(1)" "()"] s
== "())"
BrianH
2-Oct-2009
[4392x2]
Ladislav, you are right, the revised rule passes my tests: [(p: 0) 
any [#"(" (++ p) | #")" if (1 <= -- p)] if (p = 0)]
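A small sketch of how the revised rule could be exercised. The wrapper name `balanced?` is hypothetical, and the code assumes an R3 build in which PARSE supports IF and in which `++` and `--` return the value held before the update.

```rebol
REBOL []

; Hypothetical helper around the rule quoted above (not from the original chat).
balanced?: func [str [string!] /local p] [
    parse str [
        (p: 0)
        any [
            #"(" (++ p)             ; opening paren: count it
            | #")" if (1 <= -- p)   ; closing paren: allowed only if one was open
        ]
        if (p = 0)                  ; every opened paren must have been closed
    ]
]

r1: balanced? "(())"    ; expected under these assumptions: true
r2: balanced? "())"     ; expected: false (closing paren with nothing open)
r3: balanced? "(()"     ; expected: false (one paren still open at the end)
probe reduce [r1 r2 r3]
```

The rule only accepts strings made of parentheses; any other character stops the ANY loop before the end of the input, so PARSE returns false.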
Actually, Steeve, the behavior of parse's change does have to do 
with the behavior of change in normal code. That would be an error 
if the CHANGE/part function had that same behavior. It's a bug.
Steeve
2-Oct-2009
[4394x2]
i don't understand what you mean, anyway change in parse has a bug 
:-)
What a mess...

digit: charset "0123456789"
num: [some digit opt [#"." any digit]] 
term: [num | #"(" any lv1 term #")" | #"-" any lv3 term]
calc: [(expr: do expr) stay insert expr (probe e)]
lv4: [
	remove [copy expr [term change #"%" " // " term]] calc
]
lv3: [
	any lv4 
	remove [copy expr [term change #"^^" " ** " any lv4 term]] calc
]
lv2: [
	any lv3 
	remove [copy expr [term change #"*" " * " any lv3 term]] calc
	| remove [copy expr [term change #"/" " / " any lv3 term]] calc
]
lv1: [
	any lv2 
	remove [copy expr [term change #"+" " + " any lv2 term]] calc
	| remove [copy expr [term change #"-" " - " any lv2 term]] calc
]

>> parse probe e: "2+3*2-(-2^^4/6)/2" [some lv1]

2+3*2-(-2^^4/6)/2
2 + 6-(-2^^4/6)/2
8-(-2^^4/6)/2
8 - (-16.0/6)/2
8 - (-2.66666666666667)/2
8 - -1.33333333333334
9.33333333333334


I think I can make that cleaner if only the AND and CHANGE (bugged) 
commands were available.
shadwolf
2-Oct-2009
[4396x3]
steeve nice way to make a quick and fun use of parse ...
I never thought about it
Steeve's example should be taught to kids in every school :P hihihihihihi
I like this example: it's short and uses many of the parse features. 
Even if I can't follow precisely how it works, I can sense 
overall that he built four levels of parsing depth into his main rule using 
sub-rules. He stores the sub-result of each depth. Each depth is then 
computed and its result is passed to the upper level. That's really 
nice ...
Pekr
3-Oct-2009
[4399x5]
Steeve - you used STAY, which is gonna be removed :-) At least Carl 
said so on R3 Chat, and it is also reflected in the updated Parse 
proposal document ...
INTO is marked as being implemented too ... nice ....
I hope we get rest too ... USE, OF, LIMIT look all interesting.
BrianH: has Carl noticed N BREAK? It is not in the priority list, and 
it could escape Carl's radar, no?
I added it to the priority list too ....
Ladislav
3-Oct-2009
[4404]
Re N Break: I don't think, that even Break is "organic" to Parse, 
N Break is even more of a mess
Steeve
3-Oct-2009
[4405x2]
And you all missed my (N Fail) proposal.
I just rewrote the math expressions resolver.

digit: charset "0123456789"
num: [some digit opt [#"." any digit]] 
term: [num | #"(" any lv1 term #")" | #"-" any lv3 term]
calc: [
	remove [copy num1 term copy op skip copy num2 term]
	(expr: do reform select [
		"+"  [num1 op num2]
		"-"  [num1 op num2]
		"*"  [num1 op num2]
		"/"  [num1 op num2]
		"^^" [num1 "**" num2]
		"%"  [num1 "//" num2]
		
	] op)
	stay insert expr (probe e)
]
lv4: [term #"%" term then fail | break | calc]
lv3: [any lv4 term #"^^" any lv4 term then fail | break | calc]

lv2: [any lv3 term [#"*" | #"/"] any lv3 term then fail | break | calc]
lv1: [any lv2 term [#"+" | #"-"] any lv2 term then fail | break | calc]

I just think it's clearer like that.
Moreover, it's ready for the upcoming AND command.

Because this nasty trick i use:
[rule THEN FAIL | BREAK | calc]
will be replaced by:
[AND rule calc]
Pekr
4-Oct-2009
[4407]
What is your take on simple mode parsing? It is handy for simple 
CSV parsing, and the idiom is common:

parse/all row ";"


The trouble is that if there is no data in the last column, parse mistakenly 
makes the resulting block shorter, so you have to use a common idiom:

rec: parse/all append row  ";" ";"

I always wondered whether that could be regarded as a parse bug?
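A minimal sketch of the workaround, assuming REBOL 2's simple-mode PARSE with a string delimiter. The sample row is made up, and COPY is an addition so that the original row is not modified by APPEND (the idiom as quoted above changes `row` in place).

```rebol
REBOL []

row: "a;b;"                              ; last column is empty

; Append one extra delimiter to a copy, then split on ";".
rec: parse/all append copy row ";" ";"

probe rec                                ; the split fields, as a block
probe row                                ; unchanged, thanks to COPY
```

Per the message above, the extra delimiter is what keeps the trailing empty column in the result; a row whose last column holds data should split the same way as without it.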
Henrik
4-Oct-2009
[4408x2]
I wonder now if PARSE could automatically discern newlines, rather 
than having to deal with that in your parser. It would be cool, if 
strings could be considered line-based without specifically having 
to code for that.
PARSE/LINES ? Maybe not.
Pekr
4-Oct-2009
[4410x4]
what would be the advantage?
btw - remember we have deline/enlice natives in R3 now ...
enline
those should replace read/lines iirc
Henrik
4-Oct-2009
[4414x3]
the advantage would be to avoid skipping newlines. now that I think 
of it, you don't want it if you want to parse across a newline, but 
you wouldn't do that for CSV parsing.
enline and deline will help somewhat.
well, my argument seems to be weak. but now the idea is there for 
further study. :-)
Pekr
4-Oct-2009
[4417]
Ladislav - in comment to ticket #1248, you write:


According to the documentation, that can be found in http://www.rebol.net/wiki/Parse_Project

parse "b" [not #"a"]


yields FALSE correctly. If you want to obtain TRUE, you can try e.g.:

parse "b" [not #"a" to end] 


My question is - what is the advantage of not advancing the 
input when the rule matches? It does not look natural, and I would expect 
it to match the rule and hence move past it:

>> parse "b" [not #"a" ??]
end!: "b"
== false

... as can be seen, it does not advance ...
Steeve
4-Oct-2009
[4418]
I see, but it's impossible to advance, I guess.

NOT (as a pre-rule) is applied to the result of the following rule.
So, #"a" failed (it's not advancing at all).
Then NOT #"a" reverses the result state.
FAIL becomes MATCH. That's all
Ladislav
4-Oct-2009
[4419x2]
What is the advantage?:


1) by not consuming input this would be a direct inversion of the 
rule. Example:

    parse "a" [not end ...]


is a meaningful rule, and it is quite trivial to see, that any rule 
consuming input would not be a direct inversion of this rule.


NOT SOMETHING actually means, that at the current input position 
the SOMETHING rule shall not match. That does not give us any information, 
that NOT should skip any input (how far should it?).

2) This version of NOT is compatible with PEG

3) It is consistent with the AND operation:

   [AND rule] is equivalent to [NOT [NOT rule]]
Yet another example:


    [NOT skip] is equivalent to the [END] rule and is meaningful only, 
    when NOT does not skip any input
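The points above can be tried out with a few one-liners. This is a sketch that assumes an R3 build whose PARSE implements NOT as a non-consuming check, as described in the Parse Project document, and in which AND is available; the first two results are the ones quoted from the ticket comment.

```rebol
REBOL []

; NOT succeeds here but consumes nothing, so the input is not at its end.
r1: parse "b" [not #"a"]            ; false, as stated in the ticket comment
r2: parse "b" [not #"a" to end]     ; true, as stated in the ticket comment

; NOT only checks; a following SKIP does the consuming.
r3: parse "b" [not #"a" skip]       ; true

; [not skip] matches only where SKIP cannot, i.e. at the end of the input.
r4: parse "" [not skip]             ; true, same as [end]

; AND as a lookahead, and the same thing written as a double NOT.
r5: parse "b" [and #"b" skip]
r6: parse "b" [not [not #"b"] skip]

probe reduce [r1 r2 r3 r4 r5 r6]
```

If NOT consumed input, `r4` could not behave like `[end]`, and `r5` and `r6` could not agree, which is the consistency argument made in point 3.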