World: r3wp

[Parse] Discussion of PARSE dialect

BrianW 20-Mar-2005 [102]	It wouldn't have to be industrial-strength, but it would be like a security blanket for developers experimenting with the new language. PCRE is found all over the place in languages on Linux machines, and its absence makes some developers uncomfortable, despite the fact that Parse is better.
Tomc 20-Mar-2005 [103]	yeah, but then rebol programs would start getting contaminated with unfriendly gobbledygook, and rebol developers would have to learn pcre
BrianW 20-Mar-2005 [104]	Good point. One article I was thinking would be along the lines of a "phrasebook", translating PCRE concepts to Parse equivalents.
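A rough sketch of what a few entries in such a phrasebook might look like, assuming REBOL 2 and parse/all. The PCRE patterns in the comments are only approximate equivalents chosen for illustration, and the sample strings and the word `value` are made up:

```rebol
REBOL []

digit: charset "0123456789"

; PCRE: ^\d+$          one or more digits
print parse/all "2005" [some digit]

; PCRE: ^ab?c$         optional element
print parse/all "ac" ["a" opt "b" "c"]

; PCRE: ^(cat|dog)s*$  alternation and zero-or-more
print parse/all "dogss" [["cat" | "dog"] any "s"]

; PCRE: ^a{2,4}$       bounded repetition
print parse/all "aaa" [2 4 "a"]

; PCRE: ^.*?: (.*)$    skip to a delimiter, then capture the rest
parse/all "Subject: hello" [thru ": " copy value to end]
print value
```

Each of the first four lines should print true, and the last should print hello.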
Graham 20-Mar-2005 [105]	sometimes it is just too hard to get parse working ... an alternative would be nice
BrianW 20-Mar-2005 [106]	What about a parse rule that takes pcre strings as input and produces a parse rule as output?
Graham 20-Mar-2005 [107]	I've got this rule to parse email headers which only works some of the time. header-rule: [ thru "^/Date:" copy m-date to newline | thru "^/From:" copy m-from to newline | thru "^/Subject:" copy m-subject to newline | thru "^/To:" copy m-to to newline | thru "^/Return-path: " ] m-subject: m-date: m-from: m-to: none parse header [some header-rule]
Tomc 20-Mar-2005 [108x2]	I am not totally against REs. I use them all the time in shells, and having them built in would make writing "work alike" programs easier, but overall it seems to me like a step down
Tomc 20-Mar-2005 [108x2]	(can you guarantee the order in which the header lines come?)
BrianW 20-Mar-2005 [110]	No, order may vary.
Graham 20-Mar-2005 [111]	no, that's why I use "some"
Tomc 20-Mar-2005 [112]	but it will only work when order is the same
Graham 20-Mar-2005 [113]	I was under the impression that it would keep applying the rule ...
Tomc 20-Mar-2005 [114]	thru "^/To:" goes thru to "To:" even if you bypass other valid lines to get there
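A minimal sketch of the failure being described, assuming REBOL 2 and parse/all (so that newlines are not treated as skippable whitespace). The sample header is made up, and the Return-path alternative from the original rule is left out for brevity:

```rebol
REBOL []

; Hypothetical header in which "From:" and "Subject:" come before "Date:".
header: "^/From: alice@example.com^/Subject: hello^/Date: 20-Mar-2005^/To: bob@example.com^/"

header-rule: [
      thru "^/Date:"    copy m-date    to newline
    | thru "^/From:"    copy m-from    to newline
    | thru "^/Subject:" copy m-subject to newline
    | thru "^/To:"      copy m-to      to newline
]

m-subject: m-date: m-from: m-to: none
parse/all header [some header-rule]

; The first alternative jumps straight to "Date:", skipping over the
; "From:" and "Subject:" lines, so those two words are never set.
probe m-date
probe m-to
probe m-from       ; none
probe m-subject    ; none
```

Because each alternative starts with thru, the first one that can match anywhere ahead wins and consumes everything before it. `some` does keep reapplying the rule, but only from the new position onward.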
BrianW 20-Mar-2005 [115]	So how would he say "Any of these in any order?"
Vincent 20-Mar-2005 [116]	you should only go 'thru the common line start. try something like: header-rule: [ "Date:" copy ... to newline | "From:" copy ... to newline | ... ] parse header [some [thru "^/" header-rule]]
Tomc 20-Mar-2005 [117]	It will take me a few to make it concrete, but basically you work with what is common to all lines, in this case colons and newlines
Graham 20-Mar-2005 [118]	Hmm. Works so far :)
Vincent 20-Mar-2005 [119]	sorry, I missed a problem in this expression: the header must start with a newline, so parse header [header-rule some [thru "^/" header-rule]] is better
Tomc 20-Mar-2005 [120x3]	might have to be careful if you want the first line
	ahh
	got it, you caught it
Graham 20-Mar-2005 [123x3]	Ahh... well, invariably the first line of the header is "Return-path:" so it's not a problem.
	invariably, because if it's not there, I add it!
	Thanks, I should have asked much sooner rather than struggling with it.
Tomc 20-Mar-2005 [126]	what about when you get novel headers? do you care?
Graham 20-Mar-2005 [127x3]	no, these are the ones I display when reading an email ...
	If the user requests full header display, I just show them raw.
	I need the "^/... as nowadays, there's email coming thru with authentication signatures that contain the headers in a block
Tomc 20-Mar-2005 [130]	so once you have done some header-lines and got the ones you are interested in you skip the rest with thru "^/^/"
Graham 20-Mar-2005 [131x3]	actually, I copy the header and body out first and process them separately.
	parse msg [ copy header thru {^/^/} copy body to end ]
	actually I use this: parse msg [copy header thru {^M^/^M^/} copy body to end]
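A small sketch of that header/body split, assuming REBOL 2 and a made-up message with CRLF line endings. parse/all is used here so that the whitespace in the delimiter is matched literally:

```rebol
REBOL []

; Hypothetical message: two header lines, a blank line, then the body.
msg: "Return-path: <a@example.com>^M^/Subject: hi^M^/^M^/Body text here"

header: body: none
parse/all msg [copy header thru {^M^/^M^/} copy body to end]

probe header    ; everything up to and including the blank line
probe body      ; the rest of the message
```

Since `thru` includes the delimiter, header ends with the blank line and body starts at the first character after it.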
Tomc 20-Mar-2005 [134]	the last line matching rule in header-rule should be | to newline
Graham 20-Mar-2005 [135]	sorry?
Tomc 20-Mar-2005 [136x2]	to not break out of the rules before you reach the end of the header
Tomc 20-Mar-2005 [136x2]	if you came across a novel header line before you came across the To: line you would not get to the To: line
Graham 20-Mar-2005 [138x2]	header-rule: [ thru "Date:" copy m-date to newline | thru "From:" copy m-from to newline | thru "Subject:" copy m-subject to newline | thru "To:" copy m-to to newline ] m-subject: m-date: m-from: m-to: none parse header [header-rule some [ thru "^/" header-rule]]
Graham 20-Mar-2005 [138x2]	that's what I have at present ...
Tomc 20-Mar-2005 [140]	just making it explicit
Graham 20-Mar-2005 [141x2]	I should remove those "thru"s I've got there.
Graham 20-Mar-2005 [141x2]	this header-rule should now be applied each time I get a "^/" ...
Tomc 20-Mar-2005 [143]	and if you had a header with a line that did not begin with 'Date, From, Subject or To then you could prematurely break out of header-rule before you got all your bits
Graham 20-Mar-2005 [144]	How ?
Vincent 20-Mar-2005 [145]	header-rule: [ "Date:" copy m-date to newline | "From:" copy m-from to newline | "Subject:" copy m-subject to newline | "To:" copy m-to to newline | to newline ] m-subject: m-date: m-from: m-to: none parse header [header-rule some [ thru "^/" header-rule]]
Graham 20-Mar-2005 [146]	oh, I see ...
Vincent 20-Mar-2005 [147]	else, with a header like: Date: ... X-Something: ... ; breaks the rule, so To: ... From: ... are never reached
Brett 20-Mar-2005 [148]	If you are testing "^/" I would think that you need to use parse/all. You may find my script helpful for visualising the effect of your rules: http://www.rebol.org/cgi-bin/cgiwrap/rebol/documentation.r?script=parse-analysis-view.r
Vincent 20-Mar-2005 [149]	oops - you're right, I missed the big one.
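Putting the pieces of the discussion together, a sketch of the rule with the catch-all alternative and parse/all, assuming REBOL 2. The sample header (including the X-Something line) is made up for illustration:

```rebol
REBOL []

; Hypothetical header with an unrecognised line in the middle
; and the interesting lines in no particular order.
header: "Return-path: <a@example.com>^/Date: 20-Mar-2005^/X-Something: foo^/To: bob@example.com^/From: alice@example.com^/"

header-rule: [
      "Date:"    copy m-date    to newline
    | "From:"    copy m-from    to newline
    | "Subject:" copy m-subject to newline
    | "To:"      copy m-to      to newline
    | to newline    ; catch-all: skip any line we are not interested in
]

m-subject: m-date: m-from: m-to: none

; The first header-rule handles the first line (no leading newline);
; after that, each "^/" marks the start of another line to try.
parse/all header [header-rule some [thru "^/" header-rule]]

probe reduce [m-date m-from m-to m-subject]
```

With this input, m-date, m-from and m-to should all be set (each with the leading space that follows the colon, which `trim` would remove), and m-subject stays none because the header has no Subject line. The return value of parse is not checked here, since the rule stops at the final newline rather than at the very end of the string.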
Graham 20-Mar-2005 [150]	so, is PCRE easier to understand ??
Tomc 20-Mar-2005 [151]	$&^&#&%(_&$*@#@