World: r3wp

[Parse] Discussion of PARSE dialect

Henrik 18-Jul-2008 [2643]	help char!
amacleod 18-Jul-2008 [2644]	thanks
btiffin 21-Aug-2008 [2645]	A long time ago, I offered to try a lecture. Don't feel worthy. So I thought I'd throw out a few (mis)understandings and have them corrected, to build up a level of comfort that I wouldn't be leading a group of high-potential rebols down a garden path. So: one of the critical mistakes in PARSE can be remembered as "so many", a butchery of some [ any [ . SOME asks for a truth among alternatives, and ANY says "yep, got zero of the thing I was looking for" but doesn't consume anything. SOME says great, and then asks for a truth again. ANY says "yep, got zero of the thing I was looking for" and still doesn't move, ready to answer yes to every question SOME can ask. An infinite PARSE loop. Aside: to protect against infinite loops, always start a fresh PARSE block with [() since the "immediate block" of the paren! allows a keyboard escape rather than the more drastic Ctrl-C. So, I'd like to ask the audience: what other PARSE command sequences can cause infinite loops? end? And is it only "end" and "to end", while "thru end" alleviates that one? end end end end being true? >> parse "" [some [() end end end]] (escape) >> parse "" [some [() thru end end end]] == false >> parse "" [some [() to end end end]] (escape) >> Ok, but thru end is false. Is there an idiom to avoid looping on end, but still being true on the first hit? Other trip-ups?
Oldes 21-Aug-2008 [2646x3]	>> parse "" [any [()]] (escape)
	It's one of the simplest ways to hang REBOL if you don't include the parens.
	These conditions are already fixed in R3.
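For readers who want to see why the loop described above never ends, here is a minimal sketch in Python rather than REBOL. The names `literal`, `match_any` and `match_some` are hypothetical, invented for this illustration. A rule is modelled as a function that returns the new position or None, and each repetition stops as soon as the inner rule succeeds without advancing. That R3 guards its loops in exactly this way is an assumption, not something stated in the thread.

```python
from typing import Callable, Optional

# A rule takes (text, pos) and returns the new position, or None on failure.
Rule = Callable[[str, int], Optional[int]]

def literal(s: str) -> Rule:
    """Match the exact string s at the current position."""
    def rule(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return rule

def match_any(inner: Rule) -> Rule:
    """Zero or more repetitions; never fails."""
    def rule(text: str, pos: int) -> Optional[int]:
        while True:
            nxt = inner(text, pos)
            # Stop on failure, and also on a zero-width success.
            # Without the second check this loop would spin forever.
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return rule

def match_some(inner: Rule) -> Rule:
    """One or more repetitions; fails if the first attempt fails."""
    def rule(text: str, pos: int) -> Optional[int]:
        first = inner(text, pos)
        if first is None:
            return None
        if first == pos:  # matched, but consumed nothing: do not repeat
            return pos
        return match_any(inner)(text, first)
    return rule

# The "so many" case: some [any "x"] against empty input.
so_many = match_some(match_any(literal("x")))
print(so_many("", 0))
```

With the progress check in place the "so many" combination returns at once; removing the `nxt == pos` test reproduces the hang.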
Louis 20-Sep-2008 [2649]	x: "12---dflksdf+++fhkw---sd+++sad" How can I remove everything from "---" thru "+++" to end up with "12fhkwsad"?
Anton 20-Sep-2008 [2650x2]	parse x [any [to "---" here: thru "+++" there: (remove/part here there) :here]]
	Notice that after the remove I have reset the parse index to the beginning of the removed part, ready to continue parsing the rest of the data.
Louis 20-Sep-2008 [2652]	Anton, thanks. I'll try that now. Sorry to take so long to respond---I've been eating.
Anton 20-Sep-2008 [2653]	No problem, Louis. You're welcome.
Louis 20-Sep-2008 [2654]	Works great! Many, many thanks.
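The same scan-cut-resume idea can be sketched outside PARSE. This is a Python rendering for illustration only. `strip_spans` and its parameters are hypothetical names, and the comments map each step to the corresponding piece of Anton's rule.

```python
def strip_spans(text: str, start: str = "---", stop: str = "+++") -> str:
    """Remove every span that begins at `start` and runs through `stop`."""
    pos = 0
    while True:
        here = text.find(start, pos)              # to "---" here:
        if here == -1:
            break
        end = text.find(stop, here + len(start))  # thru "+++" ...
        if end == -1:
            break                                 # unterminated span: leave it
        there = end + len(stop)                   # ... there:
        text = text[:here] + text[there:]         # (remove/part here there)
        pos = here                                # :here  (resume at the cut)
    return text

print(strip_spans("12---dflksdf+++fhkw---sd+++sad"))  # 12fhkwsad
```

Resuming at `here` rather than at the old `there` matters because the text has shifted left after the cut, so the old index would point past unread data.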
Henrik 28-Sep-2008 [2655x3]	parse [a] ['a] ;== true parse ['a] reduce [to-lit-word 'a] ; == false (why?)
	forget it. I was confused for a second, but is there a way to parse that 'a correctly? The same goes for get-word! and set-word!.
	I should clarify: I would like to parse a specific get-word!, lit-word! or set-word! as opposed to parsing on the type and then checking the value in some kind of action afterwards: parse ['a 'b 'c] ['a 'b 'c] ;== true (I know this is the wrong parser block, but it's something to that effect I would like to see)
Anton 28-Sep-2008 [2658x2]	If I remember correctly, this was a problem of parse (and may still be)...
	You may have to use a workaround.
Henrik 28-Sep-2008 [2660]	thought so :-)
Geomol 28-Sep-2008 [2661]	If you can go with a reduced block, this can work: parse reduce ['a 'b 'c] ['a 'b 'c]
Henrik 28-Sep-2008 [2662]	what if there are set-words in it? I wanted to parse the content of an object, which can be a mixture of word types.
Chris 28-Sep-2008 [2663x2]	Is there any objection to matching type -> checking value other than the inconvenience?
	You could also preprocess the block using an alternative to 'reduce -- parse blk [any [mk: lit-word! (mk: change mk switch mk/1 [...]) :mk | skip]]
BrianH 28-Sep-2008 [2665x4]	In general that restriction of parse is part of an overall pattern in REBOL of encouraging you to use lit-words as lit-words rather than some other kind of datatype. Lit-words in REBOL are generally used to express literal expressions of words, rather than being used as a distinct datatype. In general you convert them to words before use.
	It's usually a bad idea to use lit-words as keywords - they make better values. If you are comparing to a particular lit-word value, that is using it as a keyword. If any lit-word value would do and their meaning is semantic rather than syntactic, that works. In general, PARSE is better for determining syntactic stuff - use the DO dialect code in the parens for semantic stuff.
	Not that I don't want a LIT or LITERAL directive in PARSE that would turn off the PARSE-dialect treatment of the next value in the spec.
	It would only be for block parsing though.
Anton 10-Oct-2008 [2669x5]	term: [word! | into term] parse [a b [c]] [some term] ;== true parse [a b [c d]] [some term] ;== false
	I'm a bit confused by that. I need to parse recursively.
	duh... never mind.
	Solution:
	terms: [some [word! | into terms]] parse [a b [c d]] terms ;== true
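The difference between the two attempts is where the repetition sits: in the working version the `some` is inside the recursive rule, so each nested block is checked all the way to its end. A small Python sketch of that shape, for illustration only. `is_terms` is a hypothetical name, and words are modelled as Python strings, which is an assumption of the sketch.

```python
def is_terms(block) -> bool:
    """Mirror of terms: [some [word! | into terms]].

    True when `block` is a non-empty list whose items are all either
    words (modelled here as str) or nested lists that satisfy the same test.
    """
    if not isinstance(block, list) or not block:  # `some` needs at least one item
        return False
    return all(isinstance(item, str) or is_terms(item) for item in block)

print(is_terms(["a", "b", ["c", "d"]]))  # True
```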
Terry 12-Oct-2008 [2674x2]	blk: [aa "test" bb "two" cc "#block"] rules: [some [cc set cc string! ]] parse blk rules no go? I have a more complicated rule set that chokes on the "#block" string.. does it think it's an issue! ?
	... rules looks like this rather.. rules: [some ['cc set cc string! ]]
Henrik 12-Oct-2008 [2676]	Your parser would stop at 'aa, since you never specify it in the rule block. Perhaps something like: rules: [some [['cc set cc string!] | [word! string!]]]
sqlab 12-Oct-2008 [2677]	rules: [some [set ww word! set ss string! (do reduce [to-set-word ww ss]) ]]
Henrik 30-Oct-2008 [2678]	>> parse/all {2008-10-30|"This is" NOK|http://www.example.com} "|" == ["2008-10-30" "This is" " NOK" "http://www.example.com"] I caught this on the mailing list. Bug?
sqlab 30-Oct-2008 [2679]	Yes, this is an old bug. It does not work, if " is next to your delimiter. Insert a blank, and it works again.
Graham 3-Nov-2008 [2680x3]	This is a result of using parse-xml and some cleanup [document [soapenv:Envelope [soapenv:Body [ns1:getSpellingSuggestionsResponse [getSpellingSuggestionsReturn [getSpellingSuggestionsReturn "Penicillin G"] [getSpellingSuggestionsReturn "Penicillin V"] [getSpellingSuggestionsReturn "Penicillamine"] [getSpellingSuggestionsReturn "Polycillin"] ] ] ] ] ]
	what's the cleanest way to extract the drug names?
	drugs: [set drugblock into [ 'getSpellingSuggestionsReturn set drugname string! ( print drugname) ]] parse a [ 'document set envelope into [ 'soapEnv:envelope set body into [ 'soapEnv:body set response into [ 'ns1:GetSpellingsuggestionsresponse set returns into ['getspellingsuggestionsreturn some drugs to end ]]]]] works but is very long winded
Gregg 4-Nov-2008 [2683]	It's not so bad, Graham. And whether you can shorten things depends on how exact you need to be. rule: [ 'getspellingsuggestionsreturn some drugs | url! into rule ] parse a ['document into rule]
PeterWood 4-Nov-2008 [2684x3]	This is a bit shorter but recursive: pr: [any [ [set b block! (parse b pr)] | ['getSpellingSuggestionsReturn set s string! ( insert drug-names s ) | skip ] ] ]
	Usage: >> drug-names: copy [] >> parse gx pr == true >> drug-names == ["Polycillin" "Penicillamine" "Penicillin V" "Penicillin G"]
	If all you're extracting is the drug names wouldn't it be simpler to just parse the XMLstring directly?
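The recursive rule above amounts to a depth-first walk that collects the string following a given word. A Python sketch of that walk, for illustration only. `Word`, `collect_strings` and `doc` are hypothetical names. `Word` stands in for a REBOL word! so that words and strings stay distinct, and `doc` is a hand-built copy of Graham's cleaned-up block.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Word:
    """Stand-in for a REBOL word!, kept distinct from plain strings."""
    name: str

def collect_strings(block, tag="getSpellingSuggestionsReturn"):
    """Return every string that directly follows the word `tag`, at any depth."""
    found = []
    i = 0
    while i < len(block):
        item = block[i]
        nxt = block[i + 1] if i + 1 < len(block) else None
        if isinstance(item, list):
            found.extend(collect_strings(item, tag))  # descend into nested block
        elif (isinstance(item, Word)
              and item.name.lower() == tag.lower()    # word match ignores case
              and isinstance(nxt, str)):
            found.append(nxt)
            i += 1                                    # also consume the string
        i += 1                                        # otherwise: skip one item
    return found

R = Word("getSpellingSuggestionsReturn")
doc = [Word("document"), [Word("soapenv:Envelope"), [Word("soapenv:Body"), [
    Word("ns1:getSpellingSuggestionsResponse"),
    [R, [R, "Penicillin G"], [R, "Penicillin V"],
        [R, "Penicillamine"], [R, "Polycillin"]]]]]]

print(collect_strings(doc))
```

This sketch appends, so the names come out in document order. The REBOL version uses `insert`, which adds at the head of the series, and that is why its result is reversed.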
Graham 4-Nov-2008 [2687x6]	not sure if it is
	<?xml version="1.0" encoding="utf-8" ?> - <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> - <soapenv:Body> - <ns1:getSpellingSuggestionsResponse soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:ns1="http://db.rxnorm.nlm.nih.gov"> - <getSpellingSuggestionsReturn soapenc:arrayType="soapenc:string[4]" xsi:type="soapenc:Array" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"> <getSpellingSuggestionsReturn xsi:type="soapenc:string">Penicillin G</getSpellingSuggestionsReturn> <getSpellingSuggestionsReturn xsi:type="soapenc:string">Penicillin V</getSpellingSuggestionsReturn> <getSpellingSuggestionsReturn xsi:type="soapenc:string">Penicillamine</getSpellingSuggestionsReturn> <getSpellingSuggestionsReturn xsi:type="soapenc:string">Polycillin</getSpellingSuggestionsReturn> </getSpellingSuggestionsReturn> </ns1:getSpellingSuggestionsResponse> </soapenv:Body> </soapenv:Envelope>
	forget about the " - " present ...
	I always find parsing xmlstrings somewhat fragile ....
	I'm not even sure how your parsing works! But it does :)
	the output I presented looks so close to being a rebol object .. and then I can use paths to access the data