r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[Parse] Discussion of PARSE dialect

Steeve
30-Sep-2009
[4193]
unless Carl implements the find/not refinement
Ladislav
30-Sep-2009
[4194]
I agree that if we were to dump all of the enhancements but one, 
NOT is the one I would choose :)
 - I guess, that it is my favourite too
BrianH
30-Sep-2009
[4195]
Carl *suggested* the find/not option :)
Maxim
30-Sep-2009
[4196]
'NOT  and the 'TO/'THRU multi.  its just soooo much simpler to slice 
and dice text using it, which is a large percentage of what parse 
is used for.


I remember Carl's historical reason that it made us lazy... hehehe... 
his CR.LF stripping example shows that he is quite lazy himself  
;-)
Steeve
30-Sep-2009
[4197]
TO/THRU multi is not so important in text parsing because we can 
use charsets
Ladislav
30-Sep-2009
[4198]
I am just curious, whether Carl intends to implement the full TO/THRU, 
acting on any subrule
Steeve
30-Sep-2009
[4199]
it's working now in a84
Ladislav
30-Sep-2009
[4200]
It is badluck, that recursive rules are not useful in fact, since 
the stack is too small :-(
BrianH
30-Sep-2009
[4201]
There are likely to be limits. I'm a bit shocked that he was able 
to push it as far as he did :)
Maxim
30-Sep-2009
[4202x2]
steeve, it is... you don't have to build a grammar, just find a set 
of words...
I also remember that Carl feared it would lead to people building 
RE-like slow parsers.
Ladislav
30-Sep-2009
[4204]
full TO/THRU is not implemented yet, trust me
BrianH
30-Sep-2009
[4205]
If you say so, Ladislav, then I look forward to what is to come :)
Steeve
30-Sep-2009
[4206x2]
working here

>> parse "abcd" [any [to ["c" | "b"] ?? skip]]
skip: "bcd"
skip: "cd"
== false
what do you mean by "full" implemented ?
Ladislav
30-Sep-2009
[4208x2]
well, but I do not know if Carl intends to implement the full TO/THRU, 
as I said...
what do I mean? the "full"

    a: [thru b]

can be defined as a "shorcut" for

    a: [b | skip a]


you can try, that this works for any rule B as far as you don't need 
to use too deep a recursion. As opposed to that, the THRU keyword 
does not accept any rule B, as documented in Carl''s blog
BrianH
30-Sep-2009
[4210]
Right. I would just be happy if he adds the not modifier, but the 
current capabilities are moree than I was expecting.
Maxim
30-Sep-2009
[4211]
actually, to/thru multi, is like an ANY with an embedded skip... 
I don't think there is any slowness related to it.
coudn't thru be implemented this way?

skip-rule: [skip]

thru:   [any [ ["c" (skip-rule: [fail]) |  "b" (skip-rule: [fail])] 
skip-rule]]
Steeve
30-Sep-2009
[4212]
but the A: [b | skip A] is weird, i never do that to avoid the stack 
limit error.
this instead:
a: [
    any [
       b break
       | skip
    ]
]
Maxim
30-Sep-2009
[4213]
the above has the advantage of not requiring stack.
Steeve
30-Sep-2009
[4214]
you don't know break Maxim ;-)
Ladislav
30-Sep-2009
[4215x3]
Yes, Steeve, I used it just the recursive expression just for demonstration 
purposes (recursion can be even used to demonstrate how ANY works). 
OTOH, it is a problem to e.g. define a non-recursive rule to detect 
correctly parenthesized string containing just #"(" and #")"
So, another problem of Parse is, that recursion does not work as 
well as it should :-(
(or, as well as may be needed)
BrianH
30-Sep-2009
[4218]
Do you mean the USE proposal, or does it suck in other ways?
Ladislav
30-Sep-2009
[4219x2]
just this: correct-paren: [any [#"(" correct-paren #")"]] - you can 
use it to parse strings, but just "short ones"
or, rather, just the "shallow ones"
BrianH
30-Sep-2009
[4221x2]
Ah yes. For long ones you need a counter in a production.
Which can be checked with IF (condition) now :)
Ladislav
30-Sep-2009
[4223]
hmm, counter does not help you in case like:


correct-bracket: [any [#"(" correct-bracket #")" | #"[" correct-bracket 
#"]"]]
Maxim
30-Sep-2009
[4224]
what is the problem with correct-paren:  ?  you mean it can't go 
beyond  n recursions?  or is it something else?
BrianH
30-Sep-2009
[4225x2]
Not by itself it doesn't. It does kinda make things tricky that recursion 
is implemented with recursion.
Rather than with CPS or tables.
Ladislav
30-Sep-2009
[4227]
yes, Max, the problem is, that the rule does not work when applied 
to deep recursive data
Maxim
30-Sep-2009
[4228]
ok, well that is a general problem of parse, (and of REBOL in general, 
I might add).
BrianH
30-Sep-2009
[4229]
Yup, recursion is bounded and not tail-call optimized :(
Maxim
30-Sep-2009
[4230]
it would be nice to be able to have control over the interpreter 
stack...
Steeve
30-Sep-2009
[4231]
well, i must say i never have such problems, i use rules adapted 
to incremental parsing to avoid too deep recursions.
BrianH
30-Sep-2009
[4232]
For those who think PARSE isn't tricky enough to use now, that would 
be great, Maxim.
Steeve
30-Sep-2009
[4233]
i agree
Maxim
30-Sep-2009
[4234]
still, I've never had problems in real-life data so far..., but I 
tend to build very wide & flat rules, instead of using recursive 
rules... I guess that's why.
Steeve
30-Sep-2009
[4235]
It would avoid that i manage my own stack most of the time
Maxim
30-Sep-2009
[4236]
any one done any metrics on actually parse depth limits?
Ladislav
30-Sep-2009
[4237]
re the tail-call optimization: some recursive rules, like e.g. the 
above correct-bracket rule are not tail-optimizable; OTOH, I already 
programmed a couple of algorithms, where I used my own stack to make 
sure it is as big as needed
BrianH
30-Sep-2009
[4238x3]
Ladislav, counters would have to be combined with an explicit stack 
in the mixed bracket case.
As you already know, of course :)
It would be like run-length encoding your explicit stack.
Steeve
30-Sep-2009
[4241]
I remember having requested 3 commands to speed up the management 
of  our own stack in the past [push, pop and rollback]
Maxim
30-Sep-2009
[4242]
remark currently uses a stack and stores inner parsing explicitely, 
with offsets& stuff while flattening the inner-most tags first, this 
allows the whole engine to use a single flat 'ANY [ dozens | of | 
rules ].  it loads rebol dialect data within nested < > pairs in 
html ...  works well, but its pretty complex to setup.  would be 
nice to have something standard to code against.