World: r3wp

[Parse] Discussion of PARSE dialect

Graham 20-Mar-2005 [133]	actually I use this: parse msg [copy header thru {^M^/^M^/} copy body to end]
Tomc 20-Mar-2005 [134]	the last line matching rule in header-rule should be | to newline
Graham 20-Mar-2005 [135]	sorry?
Tomc 20-Mar-2005 [136x2]	to not break out of the rules before you reach the end of the header
Tomc 20-Mar-2005 [136x2]	if you came across a novel header line before you came across the To: line you would not get to the To: line
Graham 20-Mar-2005 [138x2]	header-rule: [ thru "Date:" copy m-date to newline | thru "From:" copy m-from to newline | thru "Subject:" copy m-subject to newline | thru "To:" copy m-to to newline ] m-subject: m-date: m-from: m-to: none parse header [header-rule some [ thru "^/" header-rule]]
Graham 20-Mar-2005 [138x2]	that's what I have at present ...
Tomc 20-Mar-2005 [140]	just making it explicit
Graham 20-Mar-2005 [141x2]	I should remove those "thru"s I've got there.
Graham 20-Mar-2005 [141x2]	this header-rule should now be applied each time I get a "^/" ...
Tomc 20-Mar-2005 [143]	and if you had a header with a line that did not begin with Date, From, Subject or To, then you could prematurely break out of header-rule before you got all your bits
Graham 20-Mar-2005 [144]	How ?
Vincent 20-Mar-2005 [145]	header-rule: [ "Date:" copy m-date to newline | "From:" copy m-from to newline | "Subject:" copy m-subject to newline | "To:" copy m-to to newline | to newline ] m-subject: m-date: m-from: m-to: none parse header [header-rule some [ thru "^/" header-rule]]
Graham 20-Mar-2005 [146]	oh, I see ...
Vincent 20-Mar-2005 [147]	else
	Date: ...
	X-Something: ... ; break the rule
	To: ...
	From: ...
Brett 20-Mar-2005 [148]	If you are testing "^/" I would think that you need to use parse/all. You may find my script helpful for visualising the effect of your rules: http://www.rebol.org/cgi-bin/cgiwrap/rebol/documentation.r?script=parse-analysis-view.r
Vincent 20-Mar-2005 [149]	oops - you're right, I missed the big one.
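
Putting Vincent's fallback alternative and Brett's parse/all advice together, a minimal sketch might look like the following. The sample header text is made up for illustration and is assumed to end with a newline, so that every line, including the last, can satisfy "to newline".

```rebol
REBOL []

; Hypothetical sample header (not from the discussion). It ends with a
; newline so that the last line can also match "to newline".
header: {Date: Sun, 20 Mar 2005 10:00:00 +0000
X-Mailer: Example
To: bob@example.com
From: alice@example.com
Subject: Parse question
}

m-subject: m-date: m-from: m-to: none

header-rule: [
    "Date:" copy m-date to newline
    | "From:" copy m-from to newline
    | "Subject:" copy m-subject to newline
    | "To:" copy m-to to newline
    | to newline    ; unrecognised line: skip it rather than fail
]

; /all stops parse from treating spaces and newlines as delimiters
parse/all header [header-rule some [thru "^/" header-rule]]

; Under /all each value keeps the space that followed the colon, so trim it
foreach value reduce [m-date m-from m-subject m-to] [
    if value [print trim value]
]
```

Without the final "| to newline" alternative, the X-Mailer line would make header-rule fail and the loop would stop before reaching To:, From: and Subject:.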
Graham 20-Mar-2005 [150]	so, is PCRE easier to understand ??
Tomc 20-Mar-2005 [151]	$&^&#&%(_&$*@#@
Graham 20-Mar-2005 [152]	looks like perl
Tomc 20-Mar-2005 [153]	that is just random chars, not a PCRE for parsing mail headers
Graham 20-Mar-2005 [154x2]	Oh :)
Graham 20-Mar-2005 [154x2]	I was just attempting to bring the subject back on topic before I interrupted it.
Tomc 20-Mar-2005 [156]	that was not an interruption, more like exactly what this group is for
Graham 20-Mar-2005 [157]	since I have no idea what pcre was ..
Tomc 20-Mar-2005 [158x5]	. match any single char but newline
	* 0 or more of the preceding
	() put in var $n [n = 1, 2, 3 ...]
	/To: (.*)/
	$1 holds whom the email is addressed to
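
For comparison, a rough PARSE counterpart of that pattern could be sketched as below. The sample line and the word whom are assumptions made for illustration; "copy ... to newline" stands in for the group (.*), since . does not match a newline.

```rebol
REBOL []

; Hypothetical sample line, made up for illustration
line: "To: bob@example.com^/"

whom: none

; thru "To: "          corresponds to the literal "To: " in the pattern
; copy whom to newline corresponds to the group (.*)
parse/all line [thru "To: " copy whom to newline to end]

print whom    ; whom plays the role of $1
```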
Graham 20-Mar-2005 [163]	While we're here .. what is this taint thing that Perl has, and is it a concern for Rebol ?
Tomc 20-Mar-2005 [164]	tainting forces you to consider the users input and explicitly allow it to pass
Anton 20-Mar-2005 [165]	I think only people who miss it want it. :)
BrianW 20-Mar-2005 [166]	Taint mode tells Perl that you aren't sure whether your incoming data is safe. It's just a shortcut for enforcing commonsense programming.
Graham 20-Mar-2005 [167]	so, it's to prevent incoming data being executed ?
Tomc 20-Mar-2005 [168x2]	you can write a well considered script without taint that is far more secure than a script that passes taint mode by making a simple rule that does not properly catch problems
Tomc 20-Mar-2005 [168x2]	you basically have to write a regular expression to accept user input
Vincent 20-Mar-2005 [170]	Graham: for your header, like Brett said, parse/all is needed when you work on strings with newlines and spaces. last line should be: parse/all header [header-rule some [ thru "^/" header-rule]]
BrianW 20-Mar-2005 [171]	Graham, yes, but it's also used in other situations: force the programmer to escape HTML input before printing it back out, massaging data so that it's friendlier for the database, etc.
Graham 20-Mar-2005 [172]	Yeah, I got that Vincent. Curiously though it has worked without it.
Tomc 20-Mar-2005 [173x2]	in your example having a rule more like header-rule: [ "Date:" copy m-date date-rule | "From:" copy m-from email-rule | "Subject:" copy m-subject some alpha-num | "To:" copy m-to email-rule | to newline ] where email-rule only matched email addresses would be more taint-like
Tomc 20-Mar-2005 [173x2]	and being very careful to never effectively do [user-input] without being sure user-input could not cause unintended side effects
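
A minimal sketch of the kind of validating rule Tomc describes, where input is accepted only if the whole string matches. The character sets, the deliberately narrow email-rule and the helper accept-email are all assumptions made for illustration; real addresses allow more than this pattern does.

```rebol
REBOL []

alpha-num:   charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
local-char:  union alpha-num charset "._+-"
domain-char: union alpha-num charset ".-"

; Deliberately narrow, illustrative pattern
email-rule: [some local-char "@" some domain-char]

; accept-email is a hypothetical helper: it returns the address only when
; the whole input matches email-rule, otherwise none.
accept-email: func [text [string!] /local addr] [
    either parse/all text [copy addr email-rule] [addr] [none]
]

probe accept-email "alice@example.com"
probe accept-email "alice@example.com; rm -rf /"
```

The first call returns the address; the second returns none, because parse/all only succeeds when the rule consumes the entire input. Checking the parse result matters here: addr is still set by the partial match, so it should not be used when parse returns false.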
Chris 31-Mar-2005 [175x3]	Not quite sure what to make of the following:
	>> rule: [set w 'pubDate (print w)]
	== [set w 'pubDate (print w)]
	>> parse [pubdate] rule
	pubdate
	== true
	>> parse/case [pubdate] rule
	pubdate
	== true
	First off, would the last result be a bug?
	Secondly, I'd like to ensure that whether the block is [pubdate] or [pubDate] that 'w stores 'pubDate. I had hoped that as 'pubDate is set in the rule, it might take precedence over pubdate in the block :^(
DideC 1-Apr-2005 [178]	I suppose /case only acts on string!
Gabriele 1-Apr-2005 [179x3]	/case only applies to strings. Chris, you can:
	>> parse [pubdate] ['pubDate (w: 'pubDate print w)]
	pubDate
	== true
	but i'm not sure you'll like it.
Graham 9-Apr-2005 [182]	should be something easier than this:
	like-i: charset [ #"1" #"l" #"L" #"I" #"i" ]
	like-a: charset [ #"a" #"A" #"@" ]
	like-v: charset [ #"\" #"/" #"v" #"V" ]
	cialis: [ "c" like-i like-a 2 like-i "s" ]
	viagra: [ 1 2 like-v like-i like-a "gr" like-a ]
	parse "\/[1-:-gr]@" [ viagra ]
	parse "[c1-:-Lls]" [ cialis ]