World: r3wp

[Parse] Discussion of PARSE dialect

Graham 20-Mar-2005 [144]	How ?
Vincent 20-Mar-2005 [145]	header-rule: [
	    "Date:" copy m-date to newline
	    | "From:" copy m-from to newline
	    | "Subject:" copy m-subject to newline
	    | "To:" copy m-to to newline
	    | to newline
	]
	m-subject: m-date: m-from: m-to: none
	parse header [header-rule some [thru "^/" header-rule]]
Graham 20-Mar-2005 [146]	oh, I see ...
Vincent 20-Mar-2005 [147]	else
	Date: ...
	X-Something: ... ; break the rule
	To: ...
	From: ...
Brett 20-Mar-2005 [148]	If you are testing "^/" I would think that you need to use parse/all. You may find my script helpful for visualising the effect of your rules: http://www.rebol.org/cgi-bin/cgiwrap/rebol/documentation.r?script=parse-analysis-view.r
Vincent 20-Mar-2005 [149]	oops - you're right, I missed the big one.
Graham 20-Mar-2005 [150]	so, is PCRE easier to understand ??
Tomc 20-Mar-2005 [151]	$&^&#&%(_&$*@#@
Graham 20-Mar-2005 [152]	looks like perl
Tomc 20-Mar-2005 [153]	that is just random chars, not a PCRE for parsing mail headers
Graham 20-Mar-2005 [154x2]	Oh :)
Graham 20-Mar-2005 [154x2]	I was just attempting to bring the subject back on topic before I interrupted it.
Tomc 20-Mar-2005 [156]	that was not an interruption, more like exactly what this group is for
Graham 20-Mar-2005 [157]	since I have no idea what pcre was ..
Tomc 20-Mar-2005 [158x5]	. match any single char but newline
	* 0 or more of the preceding
	() put in var $n [n = 1, 2, 3 ...]
	/To: (.*)/
	$1 holds whom the email is addressed to
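For readers following along without Perl, the pattern Tomc describes can be tried in Python, whose `re` module uses the same meaning for `.`, `*` and `()`. This is only a sketch: the sample header text and the helper name `find_recipient` are made up for illustration, and capture group 1 plays the role of Perl's `$1`.

```python
import re

# Hypothetical sample header text; not taken from the discussion.
header = (
    "Date: Sun, 20 Mar 2005 10:00:00 +0000\n"
    "To: graham@example.com\n"
    "Subject: parse question\n"
)

# "." matches any single character except newline, "*" repeats the
# preceding item zero or more times, and "(...)" captures what it matched.
to_pattern = re.compile(r"To: (.*)")


def find_recipient(text):
    """Return the text captured after "To: ", or None if no such line exists."""
    match = to_pattern.search(text)
    return match.group(1) if match else None


print(find_recipient(header))
```

Because `.` stops at a newline, the capture ends at the end of the To: line without any explicit terminator in the pattern.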
Graham 20-Mar-2005 [163]	While we're here .. what's this taint thing that Perl has, and is it a concern for Rebol ?
Tomc 20-Mar-2005 [164]	tainting forces you to consider the user's input and explicitly allow it to pass
Anton 20-Mar-2005 [165]	I think only people who miss it want it. :)
BrianW 20-Mar-2005 [166]	Taint mode tells Perl that you aren't sure whether your incoming data is safe. It's just a shortcut for enforcing commonsense programming.
Graham 20-Mar-2005 [167]	so, it's to prevent incoming data being executed ?
Tomc 20-Mar-2005 [168x2]	you can write a well-considered script without taint that is far more secure than a script that passes taint mode by making a simple rule that does not properly catch problems
Tomc 20-Mar-2005 [168x2]	you basically have to write a regular expression to accept user input
Vincent 20-Mar-2005 [170]	Graham: for your header, like Brett said, parse/all is needed when you work on strings with newlines and spaces. last line should be: parse/all header [header-rule some [ thru "^/" header-rule]]
BrianW 20-Mar-2005 [171]	Graham, yes, but it's also used in other situations: force the programmer to escape HTML input before printing it back out, massaging data so that it's friendlier for the database, etc.
Graham 20-Mar-2005 [172]	Yeah, I got that Vincent. Curiously though it has worked without it.
Tomc 20-Mar-2005 [173x2]	in your example, having a rule more like
	header-rule: [ "Date:" copy date-rule | "From:" copy email-rule | "Subject:" copy some alpha-num | "To:" copy email-rule | to newline ]
	where email-rule only matched email addresses would be more taint-like
Tomc 20-Mar-2005 [173x2]	and being very careful to never effectively do [user-input] without being sure user-input could not cause unintended side effects
Chris 31-Mar-2005 [175x3]	Not quite sure what to make of the following:
	>> rule: [set w 'pubDate (print w)]
	== [set w 'pubDate (print w)]
	>> parse [pubdate] rule
	pubdate
	== true
	>> parse/case [pubdate] rule
	pubdate
	== true
	First off, would the last result be a bug?
	Secondly, I'd like to ensure that whether the block is [pubdate] or [pubDate] that 'w stores 'pubDate. I had hoped that as 'pubDate is set in the rule, it might take precedence over pubdate in the block :^(
DideC 1-Apr-2005 [178]	I suppose /case only acts on string!
Gabriele 1-Apr-2005 [179x3]	/case only applies to strings. Chris, you can:
	>> parse [pubdate] ['pubDate (w: 'pubDate print w)]
	pubDate
	== true
	but i'm not sure you'll like it.
Graham 9-Apr-2005 [182x2]	should be something easier than this
	like-i: charset [ #"1" #"l" #"L" #"I" #"i" ]
	like-a: charset [ #"a" #"A" #"@" ]
	like-v: charset [ #"\" #"/" #"v" #"V" ]
	cialis: [ "c" like-i like-a 2 like-i "s" ]
	viagra: [ 1 2 like-v like-i like-a "gr" like-a ]
	parse "\/[1-:-gr]@" [ viagra ]
	parse "[c1-:-Lls]" [ cialis ]
Graham 9-Apr-2005 [182x2]	hmm.. altme converts my double quote to a single quote
Gabriele 9-Apr-2005 [184]	maybe use charset "1lLIi" to avoid that much typing ;)
Anton 9-Apr-2005 [185]	Graham, the link width is slightly incorrect, so it obscures half of the double quote, so it looks like a single.
Tomc 28-Apr-2005 [186x4]	flatten: func [b [block!] /local flat][
	    flat: copy []
	    rule: [
	        some [
	            [x: block! (parse first :x rule)]
	            | [copy token any-type! (append flat token)]
	        ]
	    ]
	    parse b rule
	    flat
	]
	without the recursive call to parse
	flatten: func [b [block!] /local flat rule x][
	    flat: copy []
	    rule: [
	        some [
	            [x: block! :x into rule]
	            | [copy token any-type! (append flat token)]
	        ]
	    ]
	    parse b rule
	    flat
	]
	a flatten that changed its block in place would be useful at times
Gregg 30-Apr-2005 [190]	Something like this? (it's not parse based though)
	flatten: func [block] [
	    head forall block [
	        if block? block/1 [change/part block block/1 1]
	    ]
	]
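The in-place idea Tomc asks for can also be sketched outside Rebol. The Python version below is an illustration only, not a translation of either Rebol function: the name `flatten_in_place` is hypothetical, and it assumes the data is made of ordinary (non-self-referencing) lists.

```python
def flatten_in_place(items):
    """Flatten nested lists inside `items` without building a new outer list."""
    i = 0
    while i < len(items):
        if isinstance(items[i], list):
            # Splice the nested list's elements in place of the nested list.
            # The index is not advanced, so a nested list exposed by the
            # splice is itself flattened on the next pass through the loop.
            items[i:i + 1] = items[i]
        else:
            i += 1
    return items


data = [1, [2, [3, 4]], 5, []]
flatten_in_place(data)
print(data)
```

Staying on the same index after a splice is the design choice that handles arbitrarily deep nesting without recursion.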
Robert 5-Jun-2005 [191x2]	I have a problem with parse not terminating the parsing. Here is my code for parsing CamelCase words:
	rebol []
	; CamelCase Test
	test-text: "FirstWord test. This is a CamelCase test Text. CamelCase2 is the base idea for a WiKi. CamelcasE"
	upper-case: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	delimiters: charset " .,;|^-^/"
	rest-chars: complement union upper-case delimiters
	text: ""
	parse/all/case test-text [
	    some [
	        copy camelcase-word [upper-case some rest-chars upper-case any rest-chars] (
	            if not empty? text [?? text clear text]
	            print ["CamelCase word found:" camelcase-word]
	        )
	        | copy flowtext [any [rest-chars | upper-case] any delimiters] (
	            append text flowtext
	        )
	    ]
	]
	halt
Robert 5-Jun-2005 [191x2]	Any idea why parse doesn't return?
sqlab 5-Jun-2005 [193]	[any [rest-chars | upper-case] any delimiters] is always true, even if there is no char left at the end. But it does not move the cursor.
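sqlab's point, that a rule which can succeed on nothing keeps a repeat loop spinning, can be illustrated with a small Python sketch. The regular expression is an assumed stand-in for Robert's flowtext rule (any run of non-delimiters followed by any run of delimiters), and `repeat_rule` is a hypothetical helper that stops as soon as a match fails to advance the position.

```python
import re

# Stand-in for [any [rest-chars | upper-case] any delimiters]:
# both parts may match nothing, so the pattern also succeeds on "".
flowtext = re.compile(r"[^ .,;|\t\n]*[ .,;|\t\n]*")


def repeat_rule(text):
    """Apply the rule repeatedly; return (final position, consuming matches)."""
    pos = 0
    consumed = 0
    while True:
        match = flowtext.match(text, pos)
        if match is None:
            break
        if match.end() == pos:
            # Empty match: the rule succeeded but the cursor did not move.
            # Without this check the loop would never leave this position.
            break
        pos = match.end()
        consumed += 1
    return pos, consumed


print(repeat_rule("Camel case. Text"))
```

The check on `match.end() == pos` is the guard the loop needs; an equivalent fix in a grammar is to make sure the repeated rule must consume at least one character or to test for end of input explicitly.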