World: r3wp

[Parse] Discussion of PARSE dialect

Graham
18-Jul-2008
[2639x2]
I would think you would have to use parse/all. A space is #" " 
and a tab is #"^-".
or you can use charsets
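
A minimal sketch of the charset approach, assuming REBOL 2 string parsing with parse/all. The names ws, non-ws, words and w are made up for this illustration and are not from the thread.

```rebol
ws: charset " ^-"            ; space and tab (hypothetical name)
non-ws: complement ws        ; every other character
words: copy []

; parse/all stops PARSE from silently skipping whitespace,
; so the rule has to consume it explicitly.
parse/all "one two^-three" [
    any [
        some ws
        | copy w some non-ws (append words w)
    ]
]

probe words                  ; ["one" "two" "three"]
```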
amacleod
18-Jul-2008
[2641]
Parse/all...works
thanks
Henrik
18-Jul-2008
[2642x2]
amacleod, small tip:
help char!
amacleod
18-Jul-2008
[2644]
thanks
btiffin
21-Aug-2008
[2645]
A long time ago, I offered to try a lecture.  Don't feel worthy. 
 So I thought I'd throw out a few (mis)understandings and have them 
corrected to build up a level of comfort that I wouldn't be leading 
a group of high potential rebols down a garden path.


So, one of the critical mistakes in PARSE can be remembered as "so 
many", a butchery of some [ any [ .

SOME asks for a truth among alternatives, and ANY says "yep, got 
zero of the thing I was looking for" without consuming anything. 
SOME says "great" and then asks for a truth again. ANY again says 
"yep, got zero of the thing I was looking for" and still doesn't 
move, ready to answer yes to every question SOME can ask. An 
infinite PARSE loop.


Aside: to protect against infinite loops, always start a fresh PARSE 
block with [() . The "immediate block" of the paren! will allow for 
a keyboard escape instead of the more drastic Ctrl-C.


So, I'd like to ask the audience; what other PARSE command sequences 
can cause infinite loops?


END? Is it only "end" and "to end" that loop, while "thru end" 
alleviates that one? Is end end end end always true?

>> parse "" [some [() end end end]]
(escape)
>> parse "" [some [() thru end end end]]
== false
>> parse "" [some [() to end end end]]
(escape)
>> 


Ok, but thru end is false.  Is there an idiom to avoid looping on 
end, but still being true on the first hit?

Other trip ups?
Oldes
21-Aug-2008
[2646x3]
>> parse "" [any [()]]
(escape)
It's one of the simplest ways to hang REBOL if you don't include 
the parens.
These conditions are already fixed in R3.
Louis
20-Sep-2008
[2649]
x: "12---dflksdf+++fhkw---sd+++sad"


How can I remove everything from "---" thru "+++" to end up with "12fhkwsad"?
Anton
20-Sep-2008
[2650x2]
parse x [any [to "---" here: thru "+++" there: (remove/part here there) :here]]
Notice that after the remove I have reset the parse index to the 
beginning of the removed part, ready to continue parsing the rest 
of the data.
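
For reference, a small worked run of the rule on Louis's sample string, assuming REBOL 2. The /all refinement, the copy of the literal and the trailing to end are additions for this sketch (to end only makes PARSE return true once no more "---" is found); the rule itself is Anton's.

```rebol
x: copy "12---dflksdf+++fhkw---sd+++sad"

parse/all x [
    any [
        to "---" here:               ; mark the start of the unwanted part
        thru "+++" there:            ; mark just past its end
        (remove/part here there)     ; delete the marked span in place
        :here                        ; continue from where the span used to be
    ]
    to end
]

print x                              ; 12fhkwsad
```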
Louis
20-Sep-2008
[2652]
Anton, thanks. I'll try that now. Sorry to take so long to respond---I've 
been eating.
Anton
20-Sep-2008
[2653]
No problem, Louis. You're welcome.
Louis
20-Sep-2008
[2654]
Works great! Many, many thanks.
Henrik
28-Sep-2008
[2655x3]
parse [a] ['a] ;== true

parse ['a] reduce [to-lit-word 'a] ; == false (why?)
forget it. I was confused for a second, but is there a way to parse 
that 'a correctly? The same goes for get-word! and set-word!.
I should clarify: I would like to parse a specific get-word!, lit-word! 
or set-word! as opposed to parsing on the type and then checking 
the value in some kind of action afterwards:


parse ['a 'b 'c] ['a 'b 'c] ;== true (I know this is the wrong parser 
block, but it's something to that effect I would like to see)
Anton
28-Sep-2008
[2658x2]
If I remember correctly, this was a problem of parse (and may still 
be)...
You may have to use a workaround.
Henrik
28-Sep-2008
[2660]
thought so :-)
Geomol
28-Sep-2008
[2661]
If you can go with a reduced block, this can work:

parse reduce ['a 'b 'c] ['a 'b 'c]
Henrik
28-Sep-2008
[2662]
what if there are set-words in it? I wanted to parse the content 
of an object, which can be a mixture of word types.
Chris
28-Sep-2008
[2663x2]
Is there any objection to matching type -> checking value other than 
the inconvenience?
You could also preprocess the block using an alternative to 'reduce 
--


 parse blk [any [mk: lit-word! (mk: change mk switch mk/1 [...]) :mk | skip]]
BrianH
28-Sep-2008
[2665x4]
In general that restriction of parse is part of an overall pattern 
in REBOL of encouraging you to use lit-words as lit-words rather 
than some other kind of datatype. Lit-words in REBOL are generally 
used to express literal expressions of words, rather than being used 
as a distinct datatype. In general you convert them to words before 
use.
It's usually a bad idea to use lit-words as keywords - they make 
better values. If you are comparing to a particular lit-word value, 
that is using it as a keyword. If any lit-word value would do and 
their meaning is semantic rather than syntactic, that works. In general, 
PARSE is better for determining syntactic stuff - use the DO dialect 
code in the parens for semantic stuff.
Not that I don't want a LIT or LITERAL directive in PARSE that would 
turn off the PARSE-dialect treatment of the next value in the spec.
It would only be for block parsing though.
Anton
10-Oct-2008
[2669x5]
term: [word! | into term]
parse [a b [c]] [some term]  ;== true
parse [a b [c d]] [some term]  ;== false
I'm a bit confused by that.  I need to parse recursively.
duh... never mind.
Solution:
terms: [some [word! | into terms]]
parse [a b [c d]] terms  ;== true
Terry
12-Oct-2008
[2674x2]
blk: [aa "test" bb "two"  cc  "#block"]
rules: [some [cc set cc string! ]]
parse blk rules

no go? 

I have a more complicated rule set that chokes on the "#block" string.. 
does it think it's an issue! ?
... rather, the rules look like this:
rules: [some ['cc set cc string! ]]
Henrik
12-Oct-2008
[2676]
Your parser would stop at 'aa, since you never specify it in the 
rule block.

Perhaps something like:

rules: [some [['cc set cc string!] | [word! string!]]]
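
A runnable version of this suggestion against Terry's block, as a sketch assuming REBOL 2 block parsing. The type? check at the end is an addition to show that "#block" is loaded as a string! and not as an issue!.

```rebol
blk: [aa "test" bb "two" cc "#block"]

; Either match the word cc followed by a string (and capture it),
; or step over any other word/string pair.
rules: [some [['cc set cc string!] | [word! string!]]]

print parse blk rules        ; true
probe cc                     ; "#block"
print type? cc               ; string
```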
sqlab
12-Oct-2008
[2677]
rules: [some [set ww word! set ss string! (do reduce [to-set-word ww ss])]]
Henrik
30-Oct-2008
[2678]
>> parse/all {2008-10-30|"This is" NOK|http://www.example.com} "|"
== ["2008-10-30" "This is" " NOK" "http://www.example.com"]

I caught this on the mailing list. Bug?
sqlab
30-Oct-2008
[2679]
Yes, this is an old bug.
It does not work if a " is next to your delimiter.
Insert a blank, and it works again.
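
Another possible workaround, sketched under the assumption that only "|" separates fields: skip the delimiter-string mode and split with an explicit rule, so quotes get no special treatment. The names line, fields, field and keep-field are hypothetical.

```rebol
line: {2008-10-30|"This is" NOK|http://www.example.com}
fields: copy []

; COPY can yield none for an empty field, so fall back to an empty string.
keep-field: does [append fields any [field copy ""]]

parse/all line [
    any [copy field to "|" (keep-field) skip]   ; one field, then step over "|"
    copy field to end (keep-field)              ; whatever follows the last "|"
]

probe fields     ; three fields; the quoted text stays inside the second one
```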
Graham
3-Nov-2008
[2680x3]
This is a result of using parse-xml and some cleanup

[document
	[soapenv:Envelope
		[soapenv:Body
			[ns1:getSpellingSuggestionsResponse
				[getSpellingSuggestionsReturn
					[getSpellingSuggestionsReturn "Penicillin G"]
					[getSpellingSuggestionsReturn "Penicillin V"]
					[getSpellingSuggestionsReturn "Penicillamine"]
					[getSpellingSuggestionsReturn "Polycillin"]
				]
			]
		]
	]
]
what's the cleanest way to extract the drug names?
drugs: [set drugblock into ['getSpellingSuggestionsReturn set drugname string! (print drugname)]]

parse a [
	'document set envelope into [
		'soapEnv:envelope set body into [
			'soapEnv:body set response into [
				'ns1:GetSpellingsuggestionsresponse set returns into [
					'getspellingsuggestionsreturn some drugs to end
				]
			]
		]
	]
]

works but is very long winded
Gregg
4-Nov-2008
[2683]
It's not so bad Graham. And whether you can shorten things depends 
on how exact you need to be.

rule: [
	'getspellingsuggestionsreturn some drugs
	| url! into rule
]
parse a ['document into rule]
PeterWood
4-Nov-2008
[2684x3]
This is a bit shorter but recursive:

pr: [
	any [
		set b block! (parse b pr)
		| 'getSpellingSuggestionsReturn set s string! (insert drug-names s)
		| skip
	]
]
Usage:

>> drug-names: copy []

>> parse gx pr
 
== true

>> drug-names

== ["Polycillin" "Penicillamine" "Penicillin V" "Penicillin G"]
If all you're extracting is the drug names, wouldn't it be simpler 
to just parse the XML string directly?
Graham
4-Nov-2008
[2687x2]
not sure if it is
<?xml version="1.0" encoding="utf-8" ?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Body>
    <ns1:getSpellingSuggestionsResponse soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
        xmlns:ns1="http://db.rxnorm.nlm.nih.gov">
      <getSpellingSuggestionsReturn soapenc:arrayType="soapenc:string[4]"
          xsi:type="soapenc:Array" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/">
        <getSpellingSuggestionsReturn xsi:type="soapenc:string">Penicillin G</getSpellingSuggestionsReturn>
        <getSpellingSuggestionsReturn xsi:type="soapenc:string">Penicillin V</getSpellingSuggestionsReturn>
        <getSpellingSuggestionsReturn xsi:type="soapenc:string">Penicillamine</getSpellingSuggestionsReturn>
        <getSpellingSuggestionsReturn xsi:type="soapenc:string">Polycillin</getSpellingSuggestionsReturn>
      </getSpellingSuggestionsReturn>
    </ns1:getSpellingSuggestionsResponse>
  </soapenv:Body>
</soapenv:Envelope>
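
For comparison, a rough sketch of the direct string parse PeterWood suggested, assuming REBOL 2 and that every drug name sits in an element carrying xsi:type="soapenc:string". The variable xml, its shortened sample content, and the names marker and name are assumptions for the illustration.

```rebol
; Shortened stand-in for the SOAP response text (hypothetical sample).
xml: {<getSpellingSuggestionsReturn xsi:type="soapenc:string">Penicillin G</getSpellingSuggestionsReturn>
<getSpellingSuggestionsReturn xsi:type="soapenc:string">Polycillin</getSpellingSuggestionsReturn>}

marker: {xsi:type="soapenc:string">}
drug-names: copy []

parse/all xml [
    any [
        thru marker                  ; jump past the opening tag of a string item
        copy name to "</"            ; take the text up to the closing tag
        (append drug-names name)
    ]
    to end
]

probe drug-names                     ; ["Penicillin G" "Polycillin"]
```

This avoids building the nested block at all, at the cost of relying on the exact attribute text in the response.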