World: r3wp
[Parse] Discussion of PARSE dialect
Tomc 20-Mar-2005 [103] | yeah, but then rebol programs would start getting contaminated with unfriendly gobbledygook and rebol developers would have to learn pcre |
BrianW 20-Mar-2005 [104] | Good point. One article I was thinking would be along the lines of a "phrasebook", translating PCRE concepts to Parse equivalents. |
Graham 20-Mar-2005 [105] | sometimes it is just too hard to get parse working ... an alternative would be nice |
BrianW 20-Mar-2005 [106] | What about a parse rule that takes pcre strings as input and produces a parse rule as output? |
Graham 20-Mar-2005 [107] | I've got this rule to parse email headers which only works some of the time. header-rule: [ thru "^/Date:" copy m-date to newline | thru "^/From:" copy m-from to newline | thru "^/Subject:" copy m-subject to newline | thru "^/To:" copy m-to to newline | thru "^/Return-path: " ] m-subject: m-date: m-from: m-to: none parse header [some header-rule] |
Tomc 20-Mar-2005 [108x2] | I am not totally against REs. I use them all the time in shells, and having them built in would make writing "work alike" programs easier, but overall it seems to me like a step down |
(can you guarantee the order in which the header lines come?) | |
BrianW 20-Mar-2005 [110] | No, order may vary. |
Graham 20-Mar-2005 [111] | no, that's why I use "some" |
Tomc 20-Mar-2005 [112] | but it will only work when order is the same |
Graham 20-Mar-2005 [113] | I was under the impression that it would keep applying the rule ... |
Tomc 20-Mar-2005 [114] | thru "^/To:" goes thru to "To:" even if you bypass other valid lines to get there |
BrianW 20-Mar-2005 [115] | So how would he say "Any of these in any order?" |
Vincent 20-Mar-2005 [116] | you should only go 'thru the common line start. try something like: header-rule: [ "Date:" copy ... to newline | "From:" copy ... to newline | ... ] parse header [some [thru "^/" header-rule]] |
Tomc 20-Mar-2005 [117] | It will take me a few to make it concrete, but basically you work with what is common to all lines, in this case colons and newlines |
Graham 20-Mar-2005 [118] | Hmm. Works so far :) |
Vincent 20-Mar-2005 [119] | sorry, I missed a problem in this expression: the header must start with a newline, so parse header [header-rule some [thru "^/" header-rule]] is better |
Tomc 20-Mar-2005 [120x3] | might have to be careful if you want the first line |
ahh | |
got you caught it | |
Graham 20-Mar-2005 [123x3] | Ahh... well, invariably the first line of the header is "Return-path:" so it's not a problem. |
invariably, because if it's not there, I add it! | |
Thanks, I should have asked much sooner rather than struggling with it. | |
Tomc 20-Mar-2005 [126] | what about when you get novel headers? do you care? |
Graham 20-Mar-2005 [127x3] | no, these are the ones I display when reading an email ... |
If the user requests full header display, I just show them raw. | |
I need the "^/... as nowadays, there's email coming thru with authentication signatures that contain the headers in a block | |
Tomc 20-Mar-2005 [130] | so once you have done some header-lines and got the ones you are interested in you skip the rest with thru "^/^/" |
Graham 20-Mar-2005 [131x3] | actually, I copy the header and body out first and process them separately. |
parse msg [ copy header thru {^/^/} copy body to end ] | |
actually I use this: parse msg [copy header thru {^M^/^M^/} copy body to end] | |
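The header/body split described above can be tried in isolation. This is only a minimal sketch: the sample `msg` string is made up for illustration, and `parse/all` is used here (an assumption, following the advice given later in this thread) so that CR and LF are matched literally.

```rebol
REBOL []

; Hypothetical sample message with CRLF line endings (not from the thread)
msg: "From: a@example.com^M^/Subject: hi^M^/^M^/Body text"

header: body: none

; THRU consumes the blank-line separator, so the copied header
; includes the trailing CRLF CRLF; the body is everything after it.
parse/all msg [copy header thru "^M^/^M^/" copy body to end]

probe header
probe body
```

Note that with CRLF input, `to newline` stops at the LF, so values copied by a line rule would keep a trailing CR that may need to be stripped.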
Tomc 20-Mar-2005 [134] | the last line matching rule in header-rule should be | to newline |
Graham 20-Mar-2005 [135] | sorry? |
Tomc 20-Mar-2005 [136x2] | to not break out of the rules before you reach the end of the header |
if you came across a novel header line before you came across the To: line you would not get to the To: line | |
Graham 20-Mar-2005 [138x2] | header-rule: [ thru "Date:" copy m-date to newline | thru "From:" copy m-from to newline | thru "Subject:" copy m-subject to newline | thru "To:" copy m-to to newline ] m-subject: m-date: m-from: m-to: none parse header [header-rule some [ thru "^/" header-rule]] |
that's what I have at present ... | |
Tomc 20-Mar-2005 [140] | just making it explicit |
Graham 20-Mar-2005 [141x2] | I should remove those "thru"s I've got there. |
this header-rule should now be applied each time I get a "^/" ... | |
Tomc 20-Mar-2005 [143] | and if you had a header with a line that did not begin with 'Date, From, Subject or To then you could prematurely break out of header-rule before you got all your bits |
Graham 20-Mar-2005 [144] | How ? |
Vincent 20-Mar-2005 [145] | header-rule: [ "Date:" copy m-date to newline | "From:" copy m-from to newline | "Subject:" copy m-subject to newline | "To:" copy m-to to newline | to newline ] m-subject: m-date: m-from: m-to: none parse header [header-rule some [ thru "^/" header-rule]] |
Graham 20-Mar-2005 [146] | oh, I see ... |
Vincent 20-Mar-2005 [147] | else Date: ... X-Something: ... ; break the rule To: ... From: ... |
Brett 20-Mar-2005 [148] | If you are testing "^/" I would think that you need to use parse/all. You may find my script helpful for visualising the effect of your rules: http://www.rebol.org/cgi-bin/cgiwrap/rebol/documentation.r?script=parse-analysis-view.r |
Vincent 20-Mar-2005 [149] | oops - you're right, I missed the big one. |
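Putting the suggestions so far together (line-start matching, the `| to newline` fallback, and `parse/all`), the rule might look like the sketch below. The sample `header` string and the `X-Novel:` line are invented for illustration; the rule itself is the one posted above.

```rebol
REBOL []

; Hypothetical sample header, deliberately out of order and with an
; unknown "X-Novel:" line in the middle (not from the thread)
header: "Return-path: <a@example.com>^/Date: Sun, 20 Mar 2005^/X-Novel: x^/To: b@example.com^/From: a@example.com^/Subject: test^/"

m-subject: m-date: m-from: m-to: none

header-rule: [
      "Date:"    copy m-date    to newline
    | "From:"    copy m-from    to newline
    | "Subject:" copy m-subject to newline
    | "To:"      copy m-to      to newline
    | to newline    ; any other header line: skip it without failing
]

; /all makes PARSE treat whitespace, including "^/", as ordinary input.
; The first line has no preceding newline, so header-rule runs once
; before the "thru newline, then header-rule" loop.
parse/all header [header-rule some [thru "^/" header-rule]]

probe reduce [m-date m-from m-subject m-to]
```

The return value of `parse` is not meaningful here, since the rule is used only for its side effect of setting the `m-` words. Each copied value starts with the space that follows the colon.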
Graham 20-Mar-2005 [150] | so, is PCRE easier to understand ?? |
Tomc 20-Mar-2005 [151] | $&^*&#&%(*_&$*@#@ |
Graham 20-Mar-2005 [152] | looks like perl |