World: r3wp

[Parse] Discussion of PARSE dialect

BrianH 30-Sep-2009 [4195]	Carl suggested the find/not option :)
Maxim 30-Sep-2009 [4196]	'NOT and the 'TO/'THRU multi. It's just soooo much simpler to slice and dice text using it, which is a large percentage of what parse is used for. I remember Carl's historical reason that it made us lazy... hehehe... his CR.LF stripping example shows that he is quite lazy himself ;-)
Steeve 30-Sep-2009 [4197]	TO/THRU multi is not so important in text parsing because we can use charsets
Ladislav 30-Sep-2009 [4198]	I am just curious, whether Carl intends to implement the full TO/THRU, acting on any subrule
Steeve 30-Sep-2009 [4199]	it's working now in a84
Ladislav 30-Sep-2009 [4200]	It is bad luck that recursive rules are not really useful in practice, since the stack is too small :-(
BrianH 30-Sep-2009 [4201]	There are likely to be limits. I'm a bit shocked that he was able to push it as far as he did :)
Maxim 30-Sep-2009 [4202x2]	steeve, it is... you don't have to build a grammar, just find a set of words...
Maxim 30-Sep-2009 [4202x2]	I also remember that Carl feared it would lead to people building RE-like slow parsers.
Ladislav 30-Sep-2009 [4204]	full TO/THRU is not implemented yet, trust me
BrianH 30-Sep-2009 [4205]	If you say so, Ladislav, then I look forward to what is to come :)
Steeve 30-Sep-2009 [4206x2]	working here
	>> parse "abcd" [any [to ["c" | "b"] ?? skip]]
	skip: "bcd"
	skip: "cd"
	== false
Steeve 30-Sep-2009 [4206x2]	what do you mean by "full" implemented ?
Ladislav 30-Sep-2009 [4208x2]	well, but I do not know if Carl intends to implement the full TO/THRU, as I said...
Ladislav 30-Sep-2009 [4208x2]	what do I mean? The "full" a: [thru b] can be defined as a "shortcut" for a: [b | skip a]. You can check that this works for any rule B, as long as you don't need too deep a recursion. As opposed to that, the THRU keyword does not accept any rule B, as documented in Carl's blog.
BrianH 30-Sep-2009 [4210]	Right. I would just be happy if he adds the not modifier, but the current capabilities are more than I was expecting.
Maxim 30-Sep-2009 [4211]	actually, to/thru multi is like an ANY with an embedded skip... I don't think there is any slowness related to it. couldn't thru be implemented this way? skip-rule: [skip] thru: [any [ ["c" (skip-rule: [fail]) | "b" (skip-rule: [fail])] skip-rule]]
Steeve 30-Sep-2009 [4212]	but the A: [b | skip A] is weird, i never do that to avoid the stack limit error. this instead: a: [ any [ b break | skip ] ]
Maxim 30-Sep-2009 [4213]	the above has the advantage of not requiring stack.
Steeve 30-Sep-2009 [4214]	you don't know break Maxim ;-)
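
Ladislav's definition of the "full" THRU can be tried directly at the console. The following is a minimal sketch, assuming an R2 or R3 alpha build where PARSE returns true or false; the names thru-b and b are hypothetical, chosen only for this illustration.

```rebol
; B can be any sub-rule; this one reuses the alternatives from Steeve's test.
b: ["c" | "b"]

; "Full" THRU spelled out as a recursive rule:
; match B here, or skip one character and try again.
thru-b: [b | skip thru-b]

print parse "xxab" [thru-b]   ; true: B is found and the match ends the input
print parse "xxxx" [thru-b]   ; false: B is never found, so the rule fails
```

Each skipped character adds one level of rule recursion, so a long input with no early match is where the stack limit mentioned in this thread would be expected to show up.
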
Ladislav 30-Sep-2009 [4215x3]	Yes, Steeve, I used the recursive expression just for demonstration purposes (recursion can even be used to demonstrate how ANY works). OTOH, it is a problem to e.g. define a non-recursive rule to detect a correctly parenthesized string containing just #"(" and #")"
	So, another problem of Parse is, that recursion does not work as well as it should :-(
	(or, as well as may be needed)
BrianH 30-Sep-2009 [4218]	Do you mean the USE proposal, or does it suck in other ways?
Ladislav 30-Sep-2009 [4219x2]	just this: correct-paren: [any [#"(" correct-paren #")"]] - you can use it to parse strings, but just "short ones"
Ladislav 30-Sep-2009 [4219x2]	or, rather, just the "shallow ones"
BrianH 30-Sep-2009 [4221x2]	Ah yes. For long ones you need a counter in a production.
BrianH 30-Sep-2009 [4221x2]	Which can be checked with IF (condition) now :)
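
The counter idea BrianH describes could look roughly like this. It is only a sketch, assuming an R3 alpha build that has the IF (condition) keyword in PARSE; the function name balanced-parens? is hypothetical and not part of any REBOL release.

```rebol
; Hypothetical helper: checks nesting of ( and ) with a depth counter
; instead of rule recursion, so input depth does not use the PARSE stack.
balanced-parens?: func [
    "True if TEXT contains only correctly nested ( and )."
    text [string!]
    /local depth
][
    depth: 0
    either parse text [
        any [
            #"(" (depth: depth + 1)
            ; IF fails the alternative when there is nothing left to close
            | #")" if (depth > 0) (depth: depth - 1)
        ]
    ][
        depth = 0     ; every opened paren must have been closed
    ][
        false         ; stray ")" or some other character stopped the parse
    ]
]

print balanced-parens? "(()())"   ; true
print balanced-parens? "(()"      ; false, one paren is left open
print balanced-parens? "())"      ; false, ")" arrives at depth 0
```

A single counter is enough here only because there is one kind of bracket.
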
Ladislav 30-Sep-2009 [4223]	hmm, a counter does not help you in a case like: correct-bracket: [any [#"(" correct-bracket #")" | #"[" correct-bracket #"]"]]
Maxim 30-Sep-2009 [4224]	what is the problem with correct-paren: ? you mean it can't go beyond n recursions? or is it something else?
BrianH 30-Sep-2009 [4225x2]	Not by itself it doesn't. It does kinda make things tricky that recursion is implemented with recursion.
BrianH 30-Sep-2009 [4225x2]	Rather than with CPS or tables.
Ladislav 30-Sep-2009 [4227]	yes, Max, the problem is, that the rule does not work when applied to deep recursive data
Maxim 30-Sep-2009 [4228]	ok, well that is a general problem of parse, (and of REBOL in general, I might add).
BrianH 30-Sep-2009 [4229]	Yup, recursion is bounded and not tail-call optimized :(
Maxim 30-Sep-2009 [4230]	it would be nice to be able to have control over the interpreter stack...
Steeve 30-Sep-2009 [4231]	well, i must say i never have such problems, i use rules adapted to incremental parsing to avoid too deep recursions.
BrianH 30-Sep-2009 [4232]	For those who think PARSE isn't tricky enough to use now, that would be great, Maxim.
Steeve 30-Sep-2009 [4233]	i agree
Maxim 30-Sep-2009 [4234]	still, I've never had problems in real-life data so far..., but I tend to build very wide & flat rules, instead of using recursive rules... I guess that's why.
Steeve 30-Sep-2009 [4235]	It would save me from managing my own stack most of the time
Maxim 30-Sep-2009 [4236]	anyone done any metrics on actual parse depth limits?
Ladislav 30-Sep-2009 [4237]	re the tail-call optimization: some recursive rules, like e.g. the above correct-bracket rule are not tail-optimizable; OTOH, I already programmed a couple of algorithms, where I used my own stack to make sure it is as big as needed
BrianH 30-Sep-2009 [4238x3]	Ladislav, counters would have to be combined with an explicit stack in the mixed bracket case.
	As you already know, of course :)
	It would be like run-length encoding your explicit stack.
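
For the mixed bracket case, an explicit stack kept in a plain string might be sketched as follows. This again assumes the R3 alpha IF (condition) keyword; balanced-brackets? is a hypothetical name, and the run-length encoding BrianH mentions is left out to keep the sketch short.

```rebol
; Hypothetical helper: checks nesting of ( ) and [ ] with an explicit stack
; of expected closing characters, so no rule recursion is needed.
balanced-brackets?: func [
    "True if TEXT contains only correctly nested ( ) and [ ]."
    text [string!]
    /local stack
][
    stack: copy ""    ; expected closers, innermost one last
    either parse text [
        any [
            #"(" (append stack #")")
            | #"[" (append stack #"]")
            ; a closer is accepted only if it is the one expected next
            | #")" if (#")" = last stack) (remove back tail stack)
            | #"]" if (#"]" = last stack) (remove back tail stack)
        ]
    ][
        empty? stack  ; nothing may be left open at the end
    ][
        false         ; wrong closer or some other character
    ]
]

print balanced-brackets? "([])[]"   ; true
print balanced-brackets? "([)]"     ; false, ")" arrives while "]" is expected
print balanced-brackets? "(["       ; false, both brackets are left open
```

The stack grows with nesting depth, but it lives in an ordinary series rather than on the interpreter stack.
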
Steeve 30-Sep-2009 [4241]	I remember having requested 3 commands to speed up the management of our own stack in the past [push, pop and rollback]
Maxim 30-Sep-2009 [4242x2]	remark currently uses a stack and stores inner parsing explicitly, with offsets & stuff, while flattening the inner-most tags first. this allows the whole engine to use a single flat 'ANY [ dozens | of | rules ]. it loads rebol dialect data within nested < > pairs in html ... works well, but it's pretty complex to set up. would be nice to have something standard to code against.
Maxim 30-Sep-2009 [4242x2]	something user types would be well suited for in R3 :-)
BrianH 30-Sep-2009 [4244]	Yeah, I caught the "read-only access to the stack" request too. Useless for incremental parsing without being able to restore, basically parsing continuations. Which would require a ground-up rewrite of PARSE.