World: r3wp
[Parse] Discussion of PARSE dialect
Steeve 30-Sep-2009 [4233] | I agree |
Maxim 30-Sep-2009 [4234] | still, I've never had problems in real-life data so far..., but I tend to build very wide & flat rules, instead of using recursive rules... I guess that's why. |
Steeve 30-Sep-2009 [4235] | It would save me from managing my own stack most of the time |
Maxim 30-Sep-2009 [4236] | anyone done any metrics on actual parse depth limits? |
Ladislav 30-Sep-2009 [4237] | re the tail-call optimization: some recursive rules, like e.g. the above correct-bracket rule are not tail-optimizable; OTOH, I already programmed a couple of algorithms, where I used my own stack to make sure it is as big as needed |
BrianH 30-Sep-2009 [4238x3] | Ladislav, counters would have to be combined with an explicit stack in the mixed bracket case. |
As you already know, of course :) | |
It would be like run-length encoding your explicit stack. | |
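A minimal sketch of the "run-length encoded explicit stack" idea from the messages above, written in Python rather than as a PARSE rule so it can be run anywhere. The function name `balanced_mixed` and the choice of three bracket kinds are assumptions made for illustration; nothing here comes from PARSE itself. Each stack entry is a run `[opener, count]`, so a long streak of the same bracket costs one entry instead of one per character.

```python
def balanced_mixed(text: str) -> bool:
    """Check mixed-bracket balance with a run-length encoded stack.

    Hypothetical helper for illustration only; not part of PARSE.
    """
    pairs = {")": "(", "]": "[", "}": "{"}
    openers = set(pairs.values())
    stack = []  # each entry is [opener, count]: one run of identical openers

    for ch in text:
        if ch in openers:
            if stack and stack[-1][0] == ch:
                stack[-1][1] += 1        # extend the current run (the "counter")
            else:
                stack.append([ch, 1])    # a different bracket kind starts a new run
        elif ch in pairs:
            if not stack or stack[-1][0] != pairs[ch]:
                return False             # closer does not match the innermost opener
            stack[-1][1] -= 1
            if stack[-1][1] == 0:
                stack.pop()              # run fully closed
        # any other character is ignored

    return not stack                     # balanced only if nothing is left open
```

With a single bracket kind the stack never holds more than one run, which is why a bare counter is enough in that case and the explicit stack is only needed once bracket kinds are mixed.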
Steeve 30-Sep-2009 [4241] | I remember having requested 3 commands to speed up the management of our own stack in the past [push, pop and rollback] |
Maxim 30-Sep-2009 [4242x2] | remark currently uses a stack and stores inner parsing explicitly, with offsets & stuff while flattening the inner-most tags first, this allows the whole engine to use a single flat 'ANY [ dozens | of | rules ]. it loads rebol dialect data within nested < > pairs in html ... works well, but it's pretty complex to set up. would be nice to have something standard to code against. |
something user types would be well suited for in R3 :-) | |
BrianH 30-Sep-2009 [4244] | Yeah, I caught the "read-only access to the stack" request too. Useless for incremental parsing without being able to restore, basically parsing continuations. Which would require a ground-up rewrite of PARSE. |
Maxim 30-Sep-2009 [4245] | (if/when they will arrive) |
BrianH 30-Sep-2009 [4246] | Maxim, that sounds like something I do now without user types. |
Maxim 30-Sep-2009 [4247] | push and pop would be the default accessors for set and get. index being the rollback. |
BrianH 30-Sep-2009 [4248] | That sounds like standard OOP. User types only map their operations to the action! functions. |
Steeve 30-Sep-2009 [4249] | Brian, if a special command allowed returning the stack [a list of positioned blocks], it wouldn't really be difficult to perform the continuation; I don't need a special mode to do that. |
BrianH 30-Sep-2009 [4250x3] | Assuming you can continue from the top level, I can see that. Continuing from further up the stack seems a little tricky to me though... |
Assuming you are using the PARSE stack and not rolling your own. | |
I wonder if a PARSE optimizer could be written, to reduce use of stack and other resources by automatically rewriting the rules. | |
Maxim 30-Sep-2009 [4253x2] | rollback would be neat as a parse keyword :-) it would allow us to break several levels at once... something I would have needed when I did my XML schema validation engine. |
Brian, wait a few hours, Gabriele will pop up saying he has one, but unfinished and slightly buggy, somewhere on his HD ;-D | |
BrianH 30-Sep-2009 [4255x3] | Gabriele always has something like this. His stuff can be difficult for mere mortals like me to understand though :( |
I swear, it seems like he writes his code using a literate encoding engine so that he can remember what it does. Write-only code. | |
Great code though :) | |
Steeve 30-Sep-2009 [4258] | With R2, paren: [#"(" paren #")"] can be replaced by the rule: p: 0 cont: none start: [#"(" (p: p + 1)] rule: [start any [start | #")" (cont: if zero? p: p - 1 ['break]) cont]] But I wonder if it's faster. |
BrianH 30-Sep-2009 [4259] | Likely not. The question isn't faster, it's whether it runs at all. |
Steeve 30-Sep-2009 [4260x2] | I'm a little worried now, I'm afraid we can't bind our own words to commands in R3 |
a trick such as: (cont: if zero? p: p - 1 ['break]) cont doesn't work anymore, IIRC | |
Maxim 30-Sep-2009 [4262] | try without the ' |
Steeve 30-Sep-2009 [4263x3] | i guess not, but i will try |
it doesn't work | |
it's a problem specifically with the BREAK command: it can't be enclosed in a block, it has to be on the same level as the ANY/SOME block to be effective | |
Maxim 30-Sep-2009 [4266] | yeah, cause it just breaks the block it's in. |
Steeve 30-Sep-2009 [4267] | but with the IF command it shouldn't be too much of a worry |
Maxim 30-Sep-2009 [4268x2] | some rules will definitely need to be changed with the three hacks he now made officially "illegal" |
but that should only affect.... hackers... ;-) | |
BrianH 30-Sep-2009 [4270x2] | ; R3 style rule: [(p: 0) any [ #"(" (++ p) | #")" if (1 >= -- p) then break | none ] if (p = 0)] |
On one line: rule: [(p: 0) any [#"(" (++ p) | #")" if (1 >= -- p) then break | none] if (p = 0)] | |
Steeve 30-Sep-2009 [4272] | As I thought, we are saved :-) |
BrianH 30-Sep-2009 [4273] | Needs some tweaking though. |
Maxim 30-Sep-2009 [4274] | the [IF () THEN rule | rule ] reads nicely when used in parse rules. I find. |
Steeve 30-Sep-2009 [4275] | a smell of old BASIC |
Maxim 30-Sep-2009 [4276] | hehe yeah... that's it ;-) |
Steeve 30-Sep-2009 [4277] | what's that if (p = 0) at the tail? |
BrianH 30-Sep-2009 [4278] | Checks to see if the counter is even. Badly, I'm afraid, needs tweaking (which I'm doing now). |
Maxim 30-Sep-2009 [4279] | why not use 'EVEN? |
Steeve 30-Sep-2009 [4280x2] | EVEN? you mean negative? |
it can't be... | |
BrianH 30-Sep-2009 [4282] | Even, meaning the ( and ) are balanced. p starts as 0, and should end as 0. |
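The balance condition described here (the counter starts at 0 and should end at 0) can be sketched outside PARSE as follows. This is only an illustration in Python under stated assumptions: the function name `balanced_parens` is made up, and the extra check that the counter never drops below zero is an addition that the message above does not state, included so that input like ")(" is rejected.

```python
def balanced_parens(text: str) -> bool:
    """Counter-based check for a single bracket kind.

    Hypothetical helper for illustration; not a translation of the R3 rule.
    """
    depth = 0                    # plays the role of p: starts at 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:        # assumption: a closer with no open paren fails at once
                return False
        # other characters are ignored
    return depth == 0            # balanced only if the counter is back at 0
```

No stack is involved at all, so nesting depth is limited only by the size of the integer, which is the point of replacing the recursive rule with a counter.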