r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[Parse] Discussion of PARSE dialect

BrianH
30-Sep-2009
[4205]
If you say so, Ladislav, then I look forward to what is to come :)
Steeve
30-Sep-2009
[4206x2]
working here

>> parse "abcd" [any [to ["c" | "b"] ?? skip]]
skip: "bcd"
skip: "cd"
== false
what do you mean by "full" implemented ?
Ladislav
30-Sep-2009
[4208x2]
well, but I do not know if Carl intends to implement the full TO/THRU, 
as I said...
what do I mean? the "full"

    a: [thru b]

can be defined as a "shorcut" for

    a: [b | skip a]


you can try, that this works for any rule B as far as you don't need 
to use too deep a recursion. As opposed to that, the THRU keyword 
does not accept any rule B, as documented in Carl''s blog
BrianH
30-Sep-2009
[4210]
Right. I would just be happy if he adds the not modifier, but the 
current capabilities are moree than I was expecting.
Maxim
30-Sep-2009
[4211]
actually, to/thru multi, is like an ANY with an embedded skip... 
I don't think there is any slowness related to it.
coudn't thru be implemented this way?

skip-rule: [skip]

thru:   [any [ ["c" (skip-rule: [fail]) |  "b" (skip-rule: [fail])] 
skip-rule]]
Steeve
30-Sep-2009
[4212]
but the A: [b | skip A] is weird, i never do that to avoid the stack 
limit error.
this instead:
a: [
    any [
       b break
       | skip
    ]
]
Maxim
30-Sep-2009
[4213]
the above has the advantage of not requiring stack.
Steeve
30-Sep-2009
[4214]
you don't know break Maxim ;-)
Ladislav
30-Sep-2009
[4215x3]
Yes, Steeve, I used it just the recursive expression just for demonstration 
purposes (recursion can be even used to demonstrate how ANY works). 
OTOH, it is a problem to e.g. define a non-recursive rule to detect 
correctly parenthesized string containing just #"(" and #")"
So, another problem of Parse is, that recursion does not work as 
well as it should :-(
(or, as well as may be needed)
BrianH
30-Sep-2009
[4218]
Do you mean the USE proposal, or does it suck in other ways?
Ladislav
30-Sep-2009
[4219x2]
just this: correct-paren: [any [#"(" correct-paren #")"]] - you can 
use it to parse strings, but just "short ones"
or, rather, just the "shallow ones"
BrianH
30-Sep-2009
[4221x2]
Ah yes. For long ones you need a counter in a production.
Which can be checked with IF (condition) now :)
Ladislav
30-Sep-2009
[4223]
hmm, counter does not help you in case like:


correct-bracket: [any [#"(" correct-bracket #")" | #"[" correct-bracket 
#"]"]]
Maxim
30-Sep-2009
[4224]
what is the problem with correct-paren:  ?  you mean it can't go 
beyond  n recursions?  or is it something else?
BrianH
30-Sep-2009
[4225x2]
Not by itself it doesn't. It does kinda make things tricky that recursion 
is implemented with recursion.
Rather than with CPS or tables.
Ladislav
30-Sep-2009
[4227]
yes, Max, the problem is, that the rule does not work when applied 
to deep recursive data
Maxim
30-Sep-2009
[4228]
ok, well that is a general problem of parse, (and of REBOL in general, 
I might add).
BrianH
30-Sep-2009
[4229]
Yup, recursion is bounded and not tail-call optimized :(
Maxim
30-Sep-2009
[4230]
it would be nice to be able to have control over the interpreter 
stack...
Steeve
30-Sep-2009
[4231]
well, i must say i never have such problems, i use rules adapted 
to incremental parsing to avoid too deep recursions.
BrianH
30-Sep-2009
[4232]
For those who think PARSE isn't tricky enough to use now, that would 
be great, Maxim.
Steeve
30-Sep-2009
[4233]
i agree
Maxim
30-Sep-2009
[4234]
still, I've never had problems in real-life data so far..., but I 
tend to build very wide & flat rules, instead of using recursive 
rules... I guess that's why.
Steeve
30-Sep-2009
[4235]
It would avoid that i manage my own stack most of the time
Maxim
30-Sep-2009
[4236]
any one done any metrics on actually parse depth limits?
Ladislav
30-Sep-2009
[4237]
re the tail-call optimization: some recursive rules, like e.g. the 
above correct-bracket rule are not tail-optimizable; OTOH, I already 
programmed a couple of algorithms, where I used my own stack to make 
sure it is as big as needed
BrianH
30-Sep-2009
[4238x3]
Ladislav, counters would have to be combined with an explicit stack 
in the mixed bracket case.
As you already know, of course :)
It would be like run-length encoding your explicit stack.
Steeve
30-Sep-2009
[4241]
I remember having requested 3 commands to speed up the management 
of  our own stack in the past [push, pop and rollback]
Maxim
30-Sep-2009
[4242x2]
remark currently uses a stack and stores inner parsing explicitely, 
with offsets& stuff while flattening the inner-most tags first, this 
allows the whole engine to use a single flat 'ANY [ dozens | of | 
rules ].  it loads rebol dialect data within nested < > pairs in 
html ...  works well, but its pretty complex to setup.  would be 
nice to have something standard to code against.
something user types would be well suited for in R3  :-)
BrianH
30-Sep-2009
[4244]
Yeah, I caught the "read-only access to the stack" request too. Useless 
for incremental parsing without being able to restore, basically 
parsing continuations. Which would require a ground-up rewrite of 
PARSE.
Maxim
30-Sep-2009
[4245]
(if/when they will arrive)
BrianH
30-Sep-2009
[4246]
Maxim, that sounds like something I do now without user types.
Maxim
30-Sep-2009
[4247]
push and pop would be the default accessors for set and get.  index 
being the rollback.
BrianH
30-Sep-2009
[4248]
That sounds like standard OOP. User types only map their operations 
to the action! functions.
Steeve
30-Sep-2009
[4249]
Brian, if a special command allow to return the stack [a list of 
positionned blocks], it's not really difficult to perform the continuation, 
i don't need for a special mode to do that.
BrianH
30-Sep-2009
[4250x3]
Assuming you can continue from the top level, I can see that. Continuing 
from further up the stack seems a little tricky to me though...
Assuming yo are using the PARSE stack and not rolling your own.
I wonder if a PARSE optimizer could be written, to reduce use of 
stack and other resources by automatically rewriting the rules.
Maxim
30-Sep-2009
[4253x2]
rollback would be neet as a parse keyword  :-)  it would allow us 
to break several levels at once... something I would have needed 
when I did my XML schema validation engine.
brian, wait a few hours, Gabriele will pop up saying he has one, 
but unfinished and slightly buggy, somewhere on his HD   ;-D