r3wp [groups: 83 posts: 189283]

World: r3wp

[Parse] Discussion of PARSE dialect

BrianH
30-Sep-2009
[4232]
For those who think PARSE isn't tricky enough to use now, that would 
be great, Maxim.
Steeve
30-Sep-2009
[4233]
i agree
Maxim
30-Sep-2009
[4234]
still, I've never had problems in real-life data so far..., but I 
tend to build very wide & flat rules, instead of using recursive 
rules... I guess that's why.
Steeve
30-Sep-2009
[4235]
It would save me from managing my own stack most of the time
Maxim
30-Sep-2009
[4236]
anyone done any metrics on actual parse depth limits?
Ladislav
30-Sep-2009
[4237]
re the tail-call optimization: some recursive rules, like e.g. the 
above correct-bracket rule are not tail-optimizable; OTOH, I already 
programmed a couple of algorithms, where I used my own stack to make 
sure it is as big as needed
BrianH
30-Sep-2009
[4238x3]
Ladislav, counters would have to be combined with an explicit stack 
in the mixed bracket case.
As you already know, of course :)
It would be like run-length encoding your explicit stack.
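
A minimal sketch of the "run-length encoded explicit stack" idea for the mixed-bracket case, written in Python rather than REBOL so it can be run without a REBOL interpreter. The function name, the bracket table and the [opener, count] layout are assumptions made for illustration; nothing here is taken from PARSE itself.

```python
# Closing bracket -> matching opening bracket (illustrative table).
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def balanced_rle(text):
    """Check mixed brackets with a run-length-encoded explicit stack.

    Each stack entry is [opener, count]: a run of identical openers is
    stored as one entry with a counter instead of one entry per bracket.
    """
    stack = []
    for ch in text:
        if ch in OPENERS:
            if stack and stack[-1][0] == ch:
                stack[-1][1] += 1       # same bracket as the top run: bump counter
            else:
                stack.append([ch, 1])   # different bracket: start a new run
        elif ch in PAIRS:
            if not stack or stack[-1][0] != PAIRS[ch]:
                return False            # closer does not match the top run
            stack[-1][1] -= 1
            if stack[-1][1] == 0:
                stack.pop()             # run exhausted, drop it
        # any other character is ignored
    return not stack                    # leftover runs mean unclosed brackets
```

With a single bracket type the stack never holds more than one entry, so it degenerates into a plain counter; with mixed brackets the counters alone are not enough and the stack of runs is still needed.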
Steeve
30-Sep-2009
[4241]
I remember having requested 3 commands to speed up the management 
of  our own stack in the past [push, pop and rollback]
Maxim
30-Sep-2009
[4242x2]
remark currently uses a stack and stores inner parsing explicitly, 
with offsets & stuff while flattening the inner-most tags first, this 
allows the whole engine to use a single flat 'ANY [ dozens | of | 
rules ].  it loads rebol dialect data within nested < > pairs in 
html ...  works well, but it's pretty complex to set up.  would be 
nice to have something standard to code against.
something user types would be well suited for in R3  :-)
BrianH
30-Sep-2009
[4244]
Yeah, I caught the "read-only access to the stack" request too. Useless 
for incremental parsing without being able to restore, basically 
parsing continuations. Which would require a ground-up rewrite of 
PARSE.
Maxim
30-Sep-2009
[4245]
(if/when they will arrive)
BrianH
30-Sep-2009
[4246]
Maxim, that sounds like something I do now without user types.
Maxim
30-Sep-2009
[4247]
push and pop would be the default accessors for set and get.  index 
being the rollback.
BrianH
30-Sep-2009
[4248]
That sounds like standard OOP. User types only map their operations 
to the action! functions.
Steeve
30-Sep-2009
[4249]
Brian, if a special command allowed returning the stack [a list of 
positioned blocks], it's not really difficult to perform the continuation, 
i don't need a special mode to do that.
BrianH
30-Sep-2009
[4250x3]
Assuming you can continue from the top level, I can see that. Continuing 
from further up the stack seems a little tricky to me though...
Assuming you are using the PARSE stack and not rolling your own.
I wonder if a PARSE optimizer could be written, to reduce use of 
stack and other resources by automatically rewriting the rules.
Maxim
30-Sep-2009
[4253x2]
rollback would be neat as a parse keyword  :-)  it would allow us 
to break several levels at once... something I would have needed 
when I did my XML schema validation engine.
brian, wait a few hours, Gabriele will pop up saying he has one, 
but unfinished and slightly buggy, somewhere on his HD   ;-D
BrianH
30-Sep-2009
[4255x3]
Gabriele always has something like this. His stuff can be difficult 
for mere mortals like me to understand though :(
I swear, it seems like he writes his code using a literate encoding 
engine so that he can remember what it does. Write-only code.
Great code though :)
Steeve
30-Sep-2009
[4258]
With R2

paren: [#"(" any paren #")"]

can be replaced by the rule

p: 0
cont: none
start: [#"(" (p: p + 1)]

rule: [start any [
	start
	| #")" (cont: if zero? p: p - 1 ['break]) cont
]]

But i wonder if it's faster
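
A rough sketch of the trade-off between the two rules, in Python rather than REBOL so it can be run as-is. Both functions are hypothetical helpers, not PARSE internals: each takes a string and a start position and returns the index just past one balanced "( ... )" group, or None if there is none. The recursive one uses one call frame per nesting level, the counter one uses a single integer.

```python
def match_recursive(text, pos=0):
    """Recursive version: one level of recursion per nesting level."""
    if pos >= len(text) or text[pos] != "(":
        return None
    pos += 1
    while True:
        nxt = match_recursive(text, pos)  # try to consume a nested group
        if nxt is None:
            break
        pos = nxt
    if pos < len(text) and text[pos] == ")":
        return pos + 1
    return None


def match_counter(text, pos=0):
    """Counter version: depth is tracked in an integer, no recursion."""
    if pos >= len(text) or text[pos] != "(":
        return None
    depth = 0
    while pos < len(text):
        ch = text[pos]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return pos + 1            # the outermost group just closed
        else:
            return None                   # only parentheses are allowed inside
        pos += 1
    return None                           # ran out of input before closing
```

On shallow input the two agree. On very deep nesting the recursive version is limited by the call stack (in Python, the interpreter's recursion limit), while the counter version keeps going, which is the "whether it runs at all" concern rather than a speed question.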
BrianH
30-Sep-2009
[4259]
Likely not. The question isn't faster, it's whether it runs at all.
Steeve
30-Sep-2009
[4260x2]
i'm a little worried now, i'm afraid we can't bind our own words 
to commands in R3
such trick:
(cont: if zero? p: p - 1 ['break]) cont
doesn't work anymore IIRC
Maxim
30-Sep-2009
[4262]
try without the '
Steeve
30-Sep-2009
[4263x3]
i guess not, but i will try
it doesn't work
it's a problem specifically for the BREAK command, because it can't 
be wrapped in a block, it has to be on the same level as the ANY/SOME 
block to be effective
Maxim
30-Sep-2009
[4266]
yeah, cause it just breaks the block it's in.
Steeve
30-Sep-2009
[4267]
but with the IF command it should not be too much of a worry
Maxim
30-Sep-2009
[4268x2]
some rules will definitely need to be changed with the three hacks 
he now made officially "illegal"
but that should only affect.... hackers...  ;-)
BrianH
30-Sep-2009
[4270x2]
; R3 style
rule: [(p: 0) any [
    #"(" (++ p) |
    #")" if (1 >= -- p) then break | none
] if (p = 0)]
On one line:

rule: [(p: 0) any [#"(" (++ p) | #")" if (1 >= -- p) then break | none] if (p = 0)]
Steeve
30-Sep-2009
[4272]
As i thought, we are saved :-)
BrianH
30-Sep-2009
[4273]
Needs some tweaking though.
Maxim
30-Sep-2009
[4274]
the   [IF () THEN rule | rule ]  reads nicely when used in parse 
rules. I find.
Steeve
30-Sep-2009
[4275]
a smell of old BASIC
Maxim
30-Sep-2009
[4276]
hehe  yeah... that's it  ;-)
Steeve
30-Sep-2009
[4277]
what's that if (p = 0) at the tail ?
BrianH
30-Sep-2009
[4278]
Checks to see if the counter is even. Badly, I'm afraid, needs tweaking 
(which I'm doing now).
Maxim
30-Sep-2009
[4279]
why not use  'EVEN?
Steeve
30-Sep-2009
[4280x2]
EVEN? you mean negative ?
it can't be...