r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[Parse] Discussion of PARSE dialect

Ladislav
30-Sep-2009
[4219x2]
just this: correct-paren: [any [#"(" correct-paren #")"]] - you can 
use it to parse strings, but just "short ones"
or, rather, just the "shallow ones"
BrianH
30-Sep-2009
[4221x2]
Ah yes. For long ones you need a counter in a production.
Which can be checked with IF (condition) now :)
Ladislav
30-Sep-2009
[4223]
hmm, counter does not help you in case like:


correct-bracket: [any [#"(" correct-bracket #")" | #"[" correct-bracket 
#"]"]]
Maxim
30-Sep-2009
[4224]
what is the problem with correct-paren:  ?  you mean it can't go 
beyond  n recursions?  or is it something else?
BrianH
30-Sep-2009
[4225x2]
Not by itself it doesn't. It does kinda make things tricky that recursion 
is implemented with recursion.
Rather than with CPS or tables.
Ladislav
30-Sep-2009
[4227]
yes, Max, the problem is, that the rule does not work when applied 
to deep recursive data
Maxim
30-Sep-2009
[4228]
ok, well that is a general problem of parse, (and of REBOL in general, 
I might add).
BrianH
30-Sep-2009
[4229]
Yup, recursion is bounded and not tail-call optimized :(
Maxim
30-Sep-2009
[4230]
it would be nice to be able to have control over the interpreter 
stack...
Steeve
30-Sep-2009
[4231]
well, i must say i never have such problems, i use rules adapted 
to incremental parsing to avoid too deep recursions.
BrianH
30-Sep-2009
[4232]
For those who think PARSE isn't tricky enough to use now, that would 
be great, Maxim.
Steeve
30-Sep-2009
[4233]
i agree
Maxim
30-Sep-2009
[4234]
still, I've never had problems in real-life data so far..., but I 
tend to build very wide & flat rules, instead of using recursive 
rules... I guess that's why.
Steeve
30-Sep-2009
[4235]
It would avoid that i manage my own stack most of the time
Maxim
30-Sep-2009
[4236]
any one done any metrics on actually parse depth limits?
Ladislav
30-Sep-2009
[4237]
re the tail-call optimization: some recursive rules, like e.g. the 
above correct-bracket rule are not tail-optimizable; OTOH, I already 
programmed a couple of algorithms, where I used my own stack to make 
sure it is as big as needed
BrianH
30-Sep-2009
[4238x3]
Ladislav, counters would have to be combined with an explicit stack 
in the mixed bracket case.
As you already know, of course :)
It would be like run-length encoding your explicit stack.
Steeve
30-Sep-2009
[4241]
I remember having requested 3 commands to speed up the management 
of  our own stack in the past [push, pop and rollback]
Maxim
30-Sep-2009
[4242x2]
remark currently uses a stack and stores inner parsing explicitely, 
with offsets& stuff while flattening the inner-most tags first, this 
allows the whole engine to use a single flat 'ANY [ dozens | of | 
rules ].  it loads rebol dialect data within nested < > pairs in 
html ...  works well, but its pretty complex to setup.  would be 
nice to have something standard to code against.
something user types would be well suited for in R3  :-)
BrianH
30-Sep-2009
[4244]
Yeah, I caught the "read-only access to the stack" request too. Useless 
for incremental parsing without being able to restore, basically 
parsing continuations. Which would require a ground-up rewrite of 
PARSE.
Maxim
30-Sep-2009
[4245]
(if/when they will arrive)
BrianH
30-Sep-2009
[4246]
Maxim, that sounds like something I do now without user types.
Maxim
30-Sep-2009
[4247]
push and pop would be the default accessors for set and get.  index 
being the rollback.
BrianH
30-Sep-2009
[4248]
That sounds like standard OOP. User types only map their operations 
to the action! functions.
Steeve
30-Sep-2009
[4249]
Brian, if a special command allow to return the stack [a list of 
positionned blocks], it's not really difficult to perform the continuation, 
i don't need for a special mode to do that.
BrianH
30-Sep-2009
[4250x3]
Assuming you can continue from the top level, I can see that. Continuing 
from further up the stack seems a little tricky to me though...
Assuming yo are using the PARSE stack and not rolling your own.
I wonder if a PARSE optimizer could be written, to reduce use of 
stack and other resources by automatically rewriting the rules.
Maxim
30-Sep-2009
[4253x2]
rollback would be neet as a parse keyword  :-)  it would allow us 
to break several levels at once... something I would have needed 
when I did my XML schema validation engine.
brian, wait a few hours, Gabriele will pop up saying he has one, 
but unfinished and slightly buggy, somewhere on his HD   ;-D
BrianH
30-Sep-2009
[4255x3]
Gabriele always has something like this. His stuff can be difficult 
for mere mortals like me to understand though :(
I swear, it seems like he writes his code using a literate encoding 
engine so that he can remember what it does. Write-only code.
Great code though :)
Steeve
30-Sep-2009
[4258]
With R2

paren: [#"(" paren #")"]

can be replaced by the rule

p: 0
cont: none
start: [#"(" (p: p + 1)]

rule: [start any [
	start
	| #")" (cont: if zero? p: p - 1 ['break]) cont
]]

But i wonder if it's faster
BrianH
30-Sep-2009
[4259]
Likely not. The question isn't faster, it's whether it runs at all.
Steeve
30-Sep-2009
[4260x2]
i'm a little worried now, i'm afraid we can't bound our own words 
with commands in R3
such trick:
(cont: if zero? p: p - 1 ['break]) cont
doesn't work anymore IIRC
Maxim
30-Sep-2009
[4262]
try without the '
Steeve
30-Sep-2009
[4263x3]
i guess not, but i will try
it doesn't work
it's a broblem specificaly for the BREAK command, because it can't 
be encaped in a block, it has to be on the same level than the ANY/SOME 
block to be effective
Maxim
30-Sep-2009
[4266]
yeah, cause it just breaks the block its in.
Steeve
30-Sep-2009
[4267]
but with the IF command it should not be too much worrying
Maxim
30-Sep-2009
[4268]
some rules will definitely need to be changed with the three hacks 
he now made officially "illegal"