World: r3wp

[Parse] Discussion of PARSE dialect

Ladislav 22-Sep-2010 [5292x5]	Actually, both variants are implemented, even the one without the overhead (which I implemented first).
	(or, to be more precise, maybe there is a possibility to make a variant not binding the rule at all, which would then deserve to be called "without the overhead" rather than any of my variants)
	But, as you said, one of my motivations was to write it as a mezzanine to have some "inspiration"/experiences with it for Carl.
	, since I guess, that this way, he will not have to just go into an "unknown territory"
	I must say that I was actually surprised how people (including me) have struggled to circumvent this problem while having such an elegant way available to solve it.
GrahamC 18-Oct-2010 [5297]	a regex question ... ([0-9]{4})(-([0-9]{2})(-([0-9]{2})(T([0-9]{2}):([0-9]{2})(:([0-9]{2})(\.([0-9]+))?)?(Z|(([-+])([0-9]{2}):([0-9]{2})))))) is apparently failing this string: 2010-10-18T07:06:25.00Z What tool can I use to check this string against this regex?
Sunanda 18-Oct-2010 [5298]	Regexlib has a different ISO-8601 date matching regex: http://regexlib.com/REDetails.aspx?regexp_id=2092 And the ability to enter any regex and target strings to test what happens: http://regexlib.com/RETester.aspx?
GrahamC 18-Oct-2010 [5299x2]	found this one too http://www.fileformat.info/tool/regex.htm
GrahamC 18-Oct-2010 [5299x2]	and it seems my string is passing ... hmm
Sunanda 18-Oct-2010 [5301]	The problem with regexes is they are impossible to debug.....Best just to rewrite continually until they work :)
GrahamC 18-Oct-2010 [5302]	I'm trying to validate some XML against an online validator and it's rejecting my dates :(
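Besides the online testers mentioned above, a short local script can answer the "does this string match" question directly. The following is a minimal sketch in Python, assuming the pattern as posted with the alternation written as a plain "|"; the names ISO_8601, check and sample are illustrative only and do not come from the discussion.

```python
import re

# Pattern as posted in the discussion, with the alternation written as a plain "|".
ISO_8601 = re.compile(
    r"([0-9]{4})(-([0-9]{2})(-([0-9]{2})"
    r"(T([0-9]{2}):([0-9]{2})(:([0-9]{2})(\.([0-9]+))?)?"
    r"(Z|(([-+])([0-9]{2}):([0-9]{2}))))))"
)


def check(text):
    """Return the match object if the whole string matches, else None."""
    return ISO_8601.fullmatch(text)


sample = "2010-10-18T07:06:25.00Z"
match = check(sample)
print("matches" if match else "no match")
if match:
    # Groups 1, 3 and 5 are year, month and day; group 13 is the zone designator.
    print(match.group(1), match.group(3), match.group(5), match.group(13))
```

With this pattern the sample string matches, which agrees with what the online tester reported. Note that, as posted, the pattern has no "?" after the date and time groups, so a bare date such as 2010-10-18 does not match it.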
Henrik 18-Oct-2010 [5303]	how do you specify an element to be of the type any-type! except none! ?
Ladislav 18-Oct-2010 [5304]	I am afraid that you need to list all types excluding none
Henrik 18-Oct-2010 [5305]	does R3 solve this? if not, maybe that would be a good problem to solve.
Ladislav 18-Oct-2010 [5306]	R3 can let you define that typeset and use it any time you like
Henrik 18-Oct-2010 [5307]	ok, that is possibly good enough for generating specs.
Gregg 18-Oct-2010 [5308]	I don't remember what all we did Henrik, but some of our test generation stuff on another world had some support for typesets IIRC.
Henrik 18-Oct-2010 [5309]	Gregg, ok
Steeve 18-Oct-2010 [5310]	Henrik, with a parse rule ?
Henrik 18-Oct-2010 [5311]	Steeve, yes.
Steeve 18-Oct-2010 [5312]	R3 does it
AdrianS 18-Oct-2010 [5313]	Graham, try http://gskinner.com/RegExr for working out regexes. It has a really nice UI where you can hover over the components of the regex and see exactly what they do.
GrahamC 18-Oct-2010 [5314]	Thanks
Sunanda 4-Nov-2010 [5315]	Question on StackOverflow.....there must be a better answer than mine, and I'd suspect it involves PARSE (better answers usually do:) http://stackoverflow.com/questions/4093714/is-there-finer-granularity-than-load-next-for-reading-structured-data
GrahamC 4-Nov-2010 [5316x3]	Use fixed length records
	Anyone got a parse rule that strips out everything between tags in an "xml" document
	whitespace: charset [ "^/^- " ]
	swsp: [ any whitespace ]
	result: copy ""
	parse/all pqri-xml [
	    some [ copy t thru ">" (append result t) swsp to "<" ]
	]
Ladislav 4-Nov-2010 [5319]	Posted an answer mentioning the test framework, which does almost exactly what Fork asked
Gabriele 5-Nov-2010 [5320x3]	also, Carl's clean-script and script colorizer use parse + load/next to do the same thing. my Wetan uses the same method.
	http://www.colellachiara.com/soft/MD3/emitters/wetan.html#section-4.2
	basically, as long as you skip over [, (, ), and ] you can just use load/next. I'm also skipping over #[ because I want to preserve literal values while formatting (that is, preserve what the user typed)
Oldes 1-Dec-2010 [5323]	How to use the new INTO parse keyword? Could it be used to avoid the temp parse like in this (very simplified example)? parse "<a>123</a>" [thru "<a>" copy tmp to "</a>" (probe tmp probe parse tmp ["123"]) to end] Note that I know that in this example it's easy to use just one parse and avoid the temp.
Ladislav 1-Dec-2010 [5324x3]	INTO is neither new, nor is it meant for string parsing
	You can take advantage of using it when parsing a block and needing to parse a subblock (of any-block! type) or a substring
	(of the said block)
Oldes 1-Dec-2010 [5327]	can you give me a simple example, please?
Ladislav 1-Dec-2010 [5328x2]	>> parse [a b "123" c] [2 word! into [3 skip] word!]
	== true
Ladislav 1-Dec-2010 [5328x2]	>> parse [a b c/d/e] [2 word! into [3 word!]]
	== true
Oldes 1-Dec-2010 [5330x2]	I understand now, thanks.
Oldes 1-Dec-2010 [5330x2]	it's very useful, I wonder why I've not found it earlier :)
Ladislav 1-Dec-2010 [5332]	The substring property is just a recent addition
Oldes 1-Dec-2010 [5333]	And is there any nice solution for my string parsing above? I can live with the temps, just was thinking if it could be done better.. anyway, at least I know how to use INTO:)
Ladislav 1-Dec-2010 [5334x2]	That is normally a "job" for a subrule
Ladislav 1-Dec-2010 [5334x2]	it looks like you could use e.g. the REJECT keyword
Oldes 1-Dec-2010 [5336x2]	I know, but that would require complex rules, I'm lazy parser:) Btw.. my real example looks like:
	some [
	    thru {<h2><a} thru ">" copy name to {<}
	    copy doc to {^/ </div>} (
	        parse doc [
	            thru {<pre class="code">} copy code to {</pre} (
	                probe name
	                probe code
	            )
	            any [
	                thru {<h5>} copy arg to {<}
	                thru {<ol><p>} copy arg-desc to {</p></ol>} (
	                    printf [" * " 10 " - "] reduce [arg arg-desc]
	                )
	            ]
	        ]
	    )
	]
Oldes 1-Dec-2010 [5336x2]	Never mind, I can live with current way anyway.. I was just wondering if the INTO is not intended for such cases. Now I know it isn't.
Ladislav 1-Dec-2010 [5338x3]	For comparison, a similar rule can be written as follows:
	some [
	    thru {<h2><a} thru ">" copy name to {<}
	    copy doc any [
	        and {^/ </div>} break
	        | thru {<pre class="code">} copy code to {</pre} (
	            probe name
	            probe code
	        )
	        any [
	            thru {<h5>} copy arg to {<}
	            thru {<ol><p>} copy arg-desc to {</p></ol>}
	            (printf [" * " 10 " - "] reduce [arg arg-desc])
	        ]
	        | skip
	    ]
	]
	Aha, sorry, that is not similar enough :-( To be similar, it should look as follows, I guess:
	some [
	    thru {<h2><a} thru ">" copy name to {<}
	    copy doc any [
	        thru {<pre class="code">} copy code to {</pre} (
	            probe name
	            probe code
	        )
	        any [
	            thru {<h5>} copy arg to {<}
	            thru {<ol><p>} copy arg-desc to {</p></ol>}
	            (printf [" * " 10 " - "] reduce [arg arg-desc])
	        ]
	        to {^/ </div>}
	    ]
	]
	Still no cigar, third time:
	some [
	    thru {<h2><a} thru ">" copy name to {<}
	    copy doc [
	        thru {<pre class="code">} copy code to {</pre} (
	            probe name
	            probe code
	        )
	        any [
	            thru {<h5>} copy arg to {<}
	            thru {<ol><p>} copy arg-desc to {</p></ol>}
	            (printf [" * " 10 " - "] reduce [arg arg-desc])
	        ]
	        to {^/ </div>}
	    ]
	]
Oldes 1-Dec-2010 [5341]	That's not correct.. there is a reason for the temp parse: without it, thru "<h5" would skip out of the div.