r3wp [groups: 83 posts: 189283]

World: r3wp

[Parse] Discussion of PARSE dialect

BrianH
1-Dec-2010
[5364]
It has a lot of overhead though (copy overhead).
Steeve
1-Dec-2010
[5365]
I knew you would say that... :-)
BrianH
1-Dec-2010
[5366]
You have to be careful with INTO string though because there is a 
lot of PARSE code out there that depended on INTO failing with non-blocks, 
and triggering an alternation. Learn to like AND type INTO if your 
code depends on that.
Ladislav
1-Dec-2010
[5367]
Hey you don't like my solution ?

 - I guess, that Oldes does not like it, since it does not "stay in 
 the limit" of DOC
Steeve
1-Dec-2010
[5368]
He just has to switch again that. I don't think he even understood 
or read what I proposed
BrianH
1-Dec-2010
[5369]
Ah, I must have misunderstood the solution then. I thought it was 
a two-pass thing with subparsing of a copy.
Ladislav
1-Dec-2010
[5370]
aha, you would like to use the get-word to switch the input
Steeve
1-Dec-2010
[5371]
Brian, it is
Ladislav
1-Dec-2010
[5372]
I have seen that proposed, but it is not available currently (I would 
support such a proposal, though)
Steeve
1-Dec-2010
[5373x2]
Ladislav, it's working in R3 currently
and has been for a while
Ladislav
1-Dec-2010
[5375]
Checking
BrianH
1-Dec-2010
[5376]
That's what I thought. I originally proposed that kind of input switching 
in 2000. It can cause problems with backtracking though, so a sub-parse 
in an IF operation can be safer.
Steeve
1-Dec-2010
[5377]
A lot of backtracking problems may arise in a lot of ways when you 
do parsing.
I'm not sure it's an argument :-)
BrianH
1-Dec-2010
[5378]
There are no backtracking problems with COPY x TO thing IF (parse 
x rule). But that was the original reason we didn't have input switching. 
The reason I requested input switching in the first place was to 
make it easier to implement the continuous parsing that Pekr was 
requesting at the time :)
Steeve
1-Dec-2010
[5379]
I don't see where the problem is; you just have to switch back to the 
original series if the sub-rule fails. No need for the IF (parse ...) thing.
BrianH
1-Dec-2010
[5380]
Originally, only position was reset, not series reference. I would 
welcome tests that show what the current behavior is. Ladislav?
Steeve
1-Dec-2010
[5381x2]
I didn't say the reference was restored automatically; you have to 
do it yourself.
....
restore: [:switch-serie sub-rule | :restore]
BrianH
1-Dec-2010
[5383]
If it is not restored automatically on failure, backtracking and 
alternation, then that is a problem that needs a ticket submitted 
for it.
Steeve
1-Dec-2010
[5384x2]
Agreed, it would be a nice improvement
but it may slow down parse, no?
BrianH
1-Dec-2010
[5386]
Not much, just one more pointer assignment at alternation.
Steeve
1-Dec-2010
[5387]
Ladislav, are you lost in translation?
Or are you crying :-)
BrianH
1-Dec-2010
[5388x4]
It fails. Here is the test code that I will put in the ticket:
>> a: "a" b: "b" parse a [:b "b" (print true) fail | "a"]
true
== false  ; should be true
>> a: "a" b: "b" parse a [:b "b" (print true) fail | "b"]
true
== true  ; should be false
So, half of the request succeeds: You can set the position to another 
series. I wonder if you can change series types from string to block.
Yup, you can.
It is not a simple problem though, as not only would you have to 
add a series reference to the fallback state but you would need to 
make those series references visible to the garbage collector so 
they won't be freed; backtracking to a freed series would be bad.
Steeve
1-Dec-2010
[5392x2]
parse is freeing its own allocated resources currently, what would 
that be a problem to pursue ?
*why would that be...
BrianH
1-Dec-2010
[5394]
What if someone runs RECYCLE in a paren? It would need to know what 
to not collect.
Steeve
1-Dec-2010
[5395]
I mean, Parse must use a sort of stack to keep the backtracking references. 
The series will not be freed until parse destroys its stack
BrianH
1-Dec-2010
[5396]
Right now it is a stack of integers (position) and a single pointer 
(series reference). To do this it would need to be a stack of series 
references too, and the collector would need to be informed of its 
existence so it could scan it for references.
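
The fallback state described here (a stack of positions plus a single series pointer) can be modeled outside REBOL. What follows is a minimal Python sketch of that model, not PARSE's actual implementation: the function `parse`, the op names `switch`, `lit` and `fail`, and the `restore_series` flag are hypothetical names made up for the illustration. With `restore_series=False` it gives the same results as the test code in message 5388; with `restore_series=True` it gives the results that message marks as the expected ones.

```python
def parse(series, alternatives, restore_series):
    """Try each alternative in order; succeed only if one matches to the end of input."""
    pos = 0
    for ops in alternatives:
        # Fallback state recorded when an alternative starts.
        saved_series, saved_pos = series, pos
        ok = True
        for op, arg in ops:
            if op == "switch":      # models a get-word: continue on another series
                series, pos = arg, 0
            elif op == "lit":       # match a literal string at the current position
                if series.startswith(arg, pos):
                    pos += len(arg)
                else:
                    ok = False
            elif op == "fail":      # models FAIL: force backtracking
                ok = False
            if not ok:
                break
        if ok and pos == len(series):
            return True
        # Backtrack: the position is always restored,
        # the series reference only when restore_series is set.
        pos = saved_pos
        if restore_series:
            series = saved_series
    return False


a, b = "a", "b"
rule_a = [[("switch", b), ("lit", "b"), ("fail", None)], [("lit", "a")]]
rule_b = [[("switch", b), ("lit", "b"), ("fail", None)], [("lit", "b")]]

print(parse(a, rule_a, restore_series=False))  # False (position-only restore)
print(parse(a, rule_b, restore_series=False))  # True
print(parse(a, rule_a, restore_series=True))   # True  (series restored as well)
print(parse(a, rule_b, restore_series=True))   # False
```

In the sketch the extra cost of restoring the series is the one additional assignment at alternation; keeping the saved references visible to a garbage collector is not modeled.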
Steeve
1-Dec-2010
[5397]
That's why I said previously, it may slow down the whole process.
BrianH
1-Dec-2010
[5398]
Yup. The ticket needs to be made either way. If it is rejected it 
will serve as documentation of the issue.
Ladislav
1-Dec-2010
[5399x2]
It can cause problems with backtracking though
 - actually, it can't, as can be demonstrated easily
(when implemented properly, of course)
BrianH
1-Dec-2010
[5401]
Submitted as #1787, with the "when implemented properly" workarounds 
that Ladislav was mentioning. Note: Just because there is a solution 
to a problem doesn't make it not a problem - it just makes it a problem 
that can be solved.
Ladislav
1-Dec-2010
[5402]
aha, so, now the get-words can set parse to a different series (INTO 
does that as well!), but, what is restored, is just the index, not 
the series... (except for the return from INTO, when the series is 
restored as well)
BrianH
1-Dec-2010
[5403x2]
Yup. A half-solution, but we have workarounds for the other half 
:)
One interesting thing is that you can switch from string to block 
parsing and back mid-rule using series switching :)
Ladislav
1-Dec-2010
[5405]
Well, since it has been solved for INTO, it should suffice to use 
the already existing INTO solution
BrianH
1-Dec-2010
[5406x2]
Yup, that would be preferred. And please mention that in a ticket 
comment to #1787 :)
Otherwise I will mention this in a comment and attribute the idea 
to you :)
Ladislav
1-Dec-2010
[5408]
so, Oldes, you should try this, which should be the exact equivalent 
of your rule, except for the fact, that it does not call Parse recursively:

some [
    thru {<h2><a} thru ">" copy name to {<}
    ; copy the DOC
    copy doc to {^/ </div>}
    ; remember the DOC-END
    doc-end:
    ; switch to DOC parsing
    :doc
    thru {<pre class="code">} copy code to {</pre} (
        probe name
        probe code
    )
    any [
        thru {<h5>} copy arg to {<}
        thru {<ol><p>} copy arg-desc to {</p></ol>}
        (printf ["  * " 10 " - "] reduce [arg arg-desc])
    ]
    ; switch to original input
    :doc-end
]
BrianH
1-Dec-2010
[5409]
Thanks, Ladislav :)
Steeve
2-Dec-2010
[5410]
"Submitted as #1787, with the "when implemented properly" workarounds 
that Ladislav was mentioning. Note: Just because there is a solution 
to a problem doesn't make it not a problem - it just makes it a problem 
that can be solved."

Geez, I'm not a sissy, but I pointed out the workaround from the beginning. 
Sometimes I just have the weird feeling I'm not trusted enough.
Sorry, I'll stop the whining now :-)
Ladislav
2-Dec-2010
[5411x3]
Yes, Steeve, I know that this was discussed a while ago. Nevertheless, 
it is worth the effort to have it in a comment to the ticket.
(does not matter much to me who puts it in, though)
I just wanted to make sure to point at INTO, since it is already 
implemented, and working fine.