World: r3wp

[Parse] Discussion of PARSE dialect

Steeve 1-Dec-2010 [5385]	but it may slow down parse, no ?
BrianH 1-Dec-2010 [5386]	Not much, just one more pointer assignment at alternation.
Steeve 1-Dec-2010 [5387]	Ladislav, are you lost in translation ? Or are you crying :-)
BrianH 1-Dec-2010 [5388x4]	It fails. Here is the test code that I will put in the ticket:
	    >> a: "a" b: "b" parse a [:b "b" (print true) fail | "a"]
	    true
	    == false ; should be true
	    >> a: "a" b: "b" parse a [:b "b" (print true) fail | "b"]
	    true
	    == true ; should be false
	So, half of the request succeeds: You can set the position to another series. I wonder if you can change series types from string to block.
	Yup, you can.
	It is not a simple problem though, as not only would you have to add a series reference to the fallback state but you would need to make those series references visible to the garbage collector so they won't be freed; backtracking to a freed series would be bad.
Steeve 1-Dec-2010 [5392x2]	parse is freeing its own allocated resources currently, what would that be a problem to pursue ?
Steeve 1-Dec-2010 [5392x2]	*why would that be...
BrianH 1-Dec-2010 [5394]	What if someone runs RECYCLE in a paren? It would need to know what to not collect.
Steeve 1-Dec-2010 [5395]	I mean, Parse must use a sort of stack to keep the backtracking references. The series will not be freed until parse destroys its stack.
BrianH 1-Dec-2010 [5396]	Right now it is a stack of integers (position) and a single pointer (series reference). To do this it would need to be a stack of series references too, and the collector would need to be informed of its existence so it could scan it for references.
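The fallback behaviour described above can be modelled with a small sketch. This is not the PARSE implementation; it is a toy matcher in Python, and every name in it (`State`, `literal`, `switch_to`, `alternatives`, `demo`) is hypothetical. It only illustrates why saving the index alone gives the results from the test code, and why also saving the series reference would give the expected ones.

```python
from dataclasses import dataclass


@dataclass
class State:
    series: str  # the input currently being parsed
    index: int   # current position within that input


def literal(text):
    """Step that matches `text` at the current position."""
    def step(state):
        if state.series.startswith(text, state.index):
            state.index += len(text)
            return True
        return False
    return step


def switch_to(series):
    """Step that redirects parsing to the head of another series (like :b)."""
    def step(state):
        state.series, state.index = series, 0
        return True
    return step


def fail(state):
    """Step that always fails, forcing a fallback to the next alternative."""
    return False


def alternatives(state, branches, restore_series):
    """Try each branch in order; restore the fallback state after a failure."""
    saved_series, saved_index = state.series, state.index
    for branch in branches:
        if all(step(state) for step in branch):
            return True
        state.index = saved_index        # the position is always restored
        if restore_series:
            state.series = saved_series  # the extra reference under discussion
    return False


def demo(second, restore_series):
    """Model of: parse a [:b "b" fail | <second>] with a: "a" and b: "b"."""
    a, b = "a", "b"
    rule = [[switch_to(b), literal("b"), fail], [literal(second)]]
    return alternatives(State(a, 0), rule, restore_series)


print(demo("a", restore_series=False))  # False: "a" is tried against b
print(demo("a", restore_series=True))   # True: "a" is tried against a
```

In this model the saved series is an ordinary Python reference, so it stays alive while the fallback state exists. In the real interpreter that is the part that would have to be made visible to the garbage collector.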
Steeve 1-Dec-2010 [5397]	That's why I said previously, it may slow down the whole process.
BrianH 1-Dec-2010 [5398]	Yup. The ticket needs to be made either way. If it is rejected it will serve as documentation of the issue.
Ladislav 1-Dec-2010 [5399x2]	It can cause problems with backtracking though - actually, it can't, as can be demonstrated easily
Ladislav 1-Dec-2010 [5399x2]	(when implemented properly, of course)
BrianH 1-Dec-2010 [5401]	Submitted as #1787, with the "when implemented properly" workarounds that Ladislav was mentioning. Note: Just because there is a solution to a problem doesn't make it not a problem - it just makes it a problem that can be solved.
Ladislav 1-Dec-2010 [5402]	aha, so now the get-words can set parse to a different series (INTO does that as well!), but what is restored is just the index, not the series... (except for the return from INTO, when the series is restored as well)
BrianH 1-Dec-2010 [5403x2]	Yup. A half-solution, but we have workarounds for the other half :)
BrianH 1-Dec-2010 [5403x2]	One interesting thing is that you can switch from string to block parsing and back mid-rule using series switching :)
Ladislav 1-Dec-2010 [5405]	Well, since it has been solved for INTO, it should suffice to use the already existing INTO solution
BrianH 1-Dec-2010 [5406x2]	Yup, that would be preferred. And please mention that in a ticket comment to #1787 :)
BrianH 1-Dec-2010 [5406x2]	Otherwise I will mention this in a comment and attribute the idea to you :)
Ladislav 1-Dec-2010 [5408]	so, Oldes, you should try this, which should be the exact equivalent of your rule, except for the fact that it does not call Parse recursively:
	    some [
	        thru {<h2><a} thru ">" copy name to {<}
	        ; copy the DOC
	        copy doc to {^/ </div>}
	        ; remember the DOC-END
	        doc-end:
	        ; switch to DOC parsing
	        :doc
	        thru {<pre class="code">} copy code to {</pre} (
	            probe name
	            probe code
	        )
	        any [
	            thru {<h5>} copy arg to {<}
	            thru {<ol><p>} copy arg-desc to {</p></ol>}
	            (printf [" * " 10 " - "] reduce [arg arg-desc])
	        ]
	        ; switch to original input
	        :doc-end
	    ]
BrianH 1-Dec-2010 [5409]	Thanks, Ladislav :)
Steeve 2-Dec-2010 [5410]	"Submitted as #1787, with the "when implemented properly" workarounds that Ladislav was mentioning. Note: Just because there is a solution to a problem doesn't make it not a problem - it just makes it a problem that can be solved." Geez, I'm not a sissy, but I pointed out the workaround from the beginning. Sometimes I just have the weird feeling I'm not trusted enough. Sorry, I'll stop the whining now :-)
Ladislav 2-Dec-2010 [5411x4]	Yes, Steeve, I know, that this has been discussed a while ago. Nevertheless, it is worth the effort to have it in a comment to the ticket.
	(does not matter much to me who puts it in, though)
	I just wanted to make sure to point at INTO, since it is already implemented, and working fine.
	(and doing the same thing, at least in principle)
BrianH 2-Dec-2010 [5415x3]	Yes, and a good point it was too.
	Steeve, I'm sure that the reason it was so easy for me to come up with workarounds off the top of my head on a weak-brain day was because I had seen them before when you pointed them out and didn't remember it directly. In any case, I'm sure your stuff was great.
	The ticket was for documentation purposes, as well as a request. It was to summarize the conversation from before.
Oldes 2-Dec-2010 [5418x2]	Steeve, Ladislav... sorry, but your version is not working. The main SOME rule finds only one match and then stops. Maybe I should give you a simple test string so you could test it first.
Oldes 2-Dec-2010 [5418x2]	hm.. it works on simple test, don't know why it stops for my real data.
Ladislav 2-Dec-2010 [5420]	that is interesting, (my version differs from Steeve's), but should be as similar to your version as possible
Oldes 2-Dec-2010 [5421x2]	My simplified test is:
	    parse test: {[{1}{2}][{3}{4}][]} [
	        some [
	            thru {[}
	            ; copy the DOC
	            copy doc to {]}
	            ; remember the DOC-END
	            doc-end:
	            ; switch to DOC parsing
	            :doc
	            (print "start")
	            any [
	                thru "{" copy n to "}" (probe n)
	            ]
	            ; switch to original input
	            (print "end")
	            :doc-end
	        ]
	    ]
	that's working as expected.
Oldes 2-Dec-2010 [5421x2]	I understand the principle, but as I say, on the real file it stops.
Ladislav 2-Dec-2010 [5423x3]	may be a Parse bug, e.g.
	so, it is worth testing
	What does "stops" mean, BTW?
Oldes 2-Dec-2010 [5426]	you can test real data as well:)
	    print "loading data"
	    data: read/string http://www.imagemagick.org/api/magick-image.php
	    ask "parsing version 1"
	    parse/all data [
	        some [
	            thru {<h2><a} thru ">" copy name to {<}
	            doc-start: to {^/ </div>} doc-end:
	            :doc-start
	            [
	                thru {<pre class="code">} copy code to {</pre} (
	                    probe name
	                    probe code
	                )
	                any [
	                    thru {<h5>} here: if (lesser? index? here index? doc-end)
	                    copy arg to {<}
	                    thru {<ol><p>} copy arg-desc to {</p></ol>}
	                    (printf [" * " 10 " - "] reduce [arg arg-desc])
	                ]
	            ]
	            :doc-end
	        ]
	    ]
	    ask "parsing version 2"
	    parse/all data [
	        some [
	            thru {<h2><a} thru ">" copy name to {<}
	            ; copy the DOC
	            copy doc to {^/ </div>}
	            ; remember the DOC-END
	            doc-end:
	            ; switch to DOC parsing
	            :doc
	            thru {<pre class="code">} copy code to {</pre} (
	                probe name
	                probe code
	            )
	            any [
	                thru {<h5>} copy arg to {<}
	                thru {<ol><p>} copy arg-desc to {</p></ol>}
	                (printf [" * " 10 " - "] reduce [arg arg-desc])
	            ]
	            ; switch to original input
	            :doc-end
	        ]
	    ]
Ladislav 2-Dec-2010 [5427x2]	aha, it needs to be written this way:
	    parse/all data [
	        some [
	            thru {<h2><a} thru ">" copy name to {<}
	            ; copy the DOC
	            copy doc to {^/ </div>}
	            ; remember the DOC-END
	            doc-end:
	            ; switch to DOC parsing
	            ; we need OPT to be able to switch back
	            :doc
	            opt [
	                thru {<pre class="code">} copy code to {</pre} (
	                    probe name
	                    probe code
	                )
	                any [
	                    thru {<h5>} copy arg to {<}
	                    thru {<ol><p>} copy arg-desc to {</p></ol>}
	                    (printf [" * " 10 " - "] reduce [arg arg-desc])
	                ]
	            ]
	            ; switch to original input
	            :doc-end
	        ]
	    ]
Ladislav 2-Dec-2010 [5427x2]	since OPT was needed, it is provable that the "inner parse" fails sometimes, which does not look desirable and may deserve your attention, Oldes
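The reason OPT matters here can be shown with a small model. The following is a sketch in Python, not REBOL and not the PARSE implementation; the helper names (`thru`, `switch_to`, `opt`, `sequence`, `run`) and the sample input are made up for illustration. It assumes only what the discussion states: a failing step aborts the rest of its rule block, so the switch back to the original input is never reached unless the inner part is made optional.

```python
def thru(text):
    """Step that skips past the next occurrence of `text`, or fails."""
    def step(state):
        pos = state["series"].find(text, state["index"])
        if pos < 0:
            return False
        state["index"] = pos + len(text)
        return True
    return step


def switch_to(series, index):
    """Step that moves parsing to `index` in `series` (like :doc or :doc-end)."""
    def step(state):
        state["series"], state["index"] = series, index
        return True
    return step


def opt(step):
    """Wrap a step so that it always succeeds, restoring the index on failure."""
    def wrapped(state):
        saved = state["index"]
        if not step(state):
            state["index"] = saved
        return True
    return wrapped


def sequence(state, steps):
    """Run steps in order and stop at the first failure."""
    return all(step(state) for step in steps)


def run(doc, use_opt):
    """Parse one DOC section; report success and whether we are back on the input."""
    data = "<div>" + doc + "</div> rest of input"  # hypothetical sample input
    doc_end = len("<div>") + len(doc)
    inner = thru("<pre>")
    if use_opt:
        inner = opt(inner)
    state = {"series": data, "index": 0}
    ok = sequence(state, [switch_to(doc, 0), inner, switch_to(data, doc_end)])
    return ok, state["series"] == data


print(run("no code block", use_opt=False))  # (False, False): stuck on the copied DOC
print(run("no code block", use_opt=True))   # (True, True): switched back
```

A section that does contain the expected marker succeeds either way in this model, which is consistent with the rule working on the simplified test and stopping only on real data.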
Oldes 2-Dec-2010 [5429x2]	I've got it.. there is a missing <pre> in the third doc so that's why Steeve's version fails.
Oldes 2-Dec-2010 [5429x2]	(I wonder what they use to document the ImageMagick project.. it does not look like fully automated documentation. There are also some typos in the spec names.)
Steeve 14-Jan-2011 [5431]	I'm working on an incremental lexer able to perform line-by-line analysis of any plain text document. The idea is to allow editing without having to reparse the whole document. The syntactical rules will be regular parse rules, easy to understand and to modify, to facilitate the creation of different document models. Of course, the first target is a REBOL parser, but the make-doc format is also in my short range. If anyone already has deep thoughts about the subject, please share your opinions. I will come with a proto soon enough.
BrianH 14-Jan-2011 [5432x2]	Will the REBOL parser be using the R3 incremental parser, TRANSCODE ?
BrianH 14-Jan-2011 [5432x2]	Btw, if you are using PARSE for incremental parsing, watch out for this: http://issue.cc/r3/1787
shadwolf 14-Jan-2011 [5434]	what is the equivalent in R3 of disarm ?