World: r3wp
[Parse] Discussion of PARSE dialect
Graham 20-Mar-2005 [133] | actually I use this: parse msg [copy header thru {^M^/^M^/} copy body to end] |
Tomc 20-Mar-2005 [134] | the last line matching rule in header-rule should be | to newline |
Graham 20-Mar-2005 [135] | sorry? |
Tomc 20-Mar-2005 [136x2] | to not break out of the rules before you reach the end of the header |
if you came across a novel header line before you came across the To: line you would not get to the To: line | |
Graham 20-Mar-2005 [138x2] |
    header-rule: [
        thru "Date:" copy m-date to newline |
        thru "From:" copy m-from to newline |
        thru "Subject:" copy m-subject to newline |
        thru "To:" copy m-to to newline
    ]
    m-subject: m-date: m-from: m-to: none
    parse header [header-rule some [ thru "^/" header-rule]] |
that's what I have at present ... | |
Tomc 20-Mar-2005 [140] | just making it explicit |
Graham 20-Mar-2005 [141x2] | I should remove those "thru"s I've got there. |
this header-rule should now be applied each time I get a "^/" ... | |
Tomc 20-Mar-2005 [143] | and if you had a header with a line that did not begin with Date, From, Subject or To, then you could prematurely break out of header-rule before you got all your bits |
Graham 20-Mar-2005 [144] | How ? |
Vincent 20-Mar-2005 [145] |
    header-rule: [
        "Date:" copy m-date to newline |
        "From:" copy m-from to newline |
        "Subject:" copy m-subject to newline |
        "To:" copy m-to to newline |
        to newline
    ]
    m-subject: m-date: m-from: m-to: none
    parse header [header-rule some [ thru "^/" header-rule]] |
Graham 20-Mar-2005 [146] | oh, I see ... |
Vincent 20-Mar-2005 [147] | else
    Date: ...
    X-Something: ... ; break the rule
    To: ...
    From: ... |
Brett 20-Mar-2005 [148] | If you are testing "^/" I would think that you need to use parse/all. You may find my script helpful for visualising the effect of your rules: http://www.rebol.org/cgi-bin/cgiwrap/rebol/documentation.r?script=parse-analysis-view.r |
Vincent 20-Mar-2005 [149] | oops - you're right, I missed the big one. |
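The point about the fallback alternative can be tried out with a small script. This is only a sketch, assuming Rebol 2 (where the /all refinement exists); the sample header text is made up for illustration and is not from the discussion.

```rebol
REBOL [Title: "Header rule with a fallback alternative (sketch)"]

; Hypothetical sample header, not from the discussion. The X-Something
; line stands in for any header that none of the named alternatives match.
header: rejoin [
    "Date: Sun, 20 Mar 2005 10:00:00 +1200" newline
    "X-Something: anything" newline
    "To: graham@example.com" newline
    "From: tomc@example.com" newline
    "Subject: parse question" newline
]

m-subject: m-date: m-from: m-to: none

header-rule: [
      "Date:"    copy m-date    to newline
    | "From:"    copy m-from    to newline
    | "Subject:" copy m-subject to newline
    | "To:"      copy m-to      to newline
    | to newline    ; fallback: step over any other header line
]

; /all makes parse treat spaces and newlines as ordinary characters
parse/all header [header-rule some [thru "^/" header-rule]]

; Each copied value keeps the space that follows the colon
probe m-date
probe m-to
probe m-from
probe m-subject
```

With the last alternative removed, the X-Something line should make header-rule fail on the second line, so m-to, m-from and m-subject would be expected to stay none for this sample.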
Graham 20-Mar-2005 [150] | so, is PCRE easier to understand ?? |
Tomc 20-Mar-2005 [151] | $&^*&#&%(*_&$*@#@ |
Graham 20-Mar-2005 [152] | looks like perl |
Tomc 20-Mar-2005 [153] | that is just random chars, not a pcre for parsing mail headers |
Graham 20-Mar-2005 [154x2] | Oh :) |
I was just attempting to bring the subject back on topic before I interrupted it. | |
Tomc 20-Mar-2005 [156] | that was not an interruption, more like exactly what this group is for |
Graham 20-Mar-2005 [157] | since I have no idea what pcre was .. |
Tomc 20-Mar-2005 [158x5] | . match any single char but newline |
* 0 or more of the preceding | |
() put in var $n [n = 1, 2, 3 ...] | |
/To: (.*)/ | |
$1 holds whom the email is addressed to | |
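For comparison, a rough PARSE counterpart to the To: pattern described above. This is a sketch for Rebol 2; the sample text and the word to-addr are made up for illustration.

```rebol
REBOL [Title: "PARSE counterpart to a To: regex capture (sketch)"]

; Hypothetical sample text, not from the discussion
header: "Subject: hi^/To: graham@example.com^/"

to-addr: none

; thru "To:"  plays the part of the literal "To:" in the regex
; to newline  plays the part of .* (any characters up to the line end)
; copy        plays the part of the capturing ( ), filling to-addr like $1
parse/all header [thru "To:" copy to-addr to newline to end]

; The copied text keeps the space after the colon, so trim it before use
if to-addr [print trim to-addr]
```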
Graham 20-Mar-2005 [163] | While we're here .. what is this taint thing that Perl has, and is it a concern for Rebol ? |
Tomc 20-Mar-2005 [164] | tainting forces you to consider the users input and explicitly allow it to pass |
Anton 20-Mar-2005 [165] | I think only people who miss it want it. :) |
BrianW 20-Mar-2005 [166] | Taint mode tells Perl that you aren't sure whether your incoming data is safe. It's just a shortcut for enforcing commonsense programming. |
Graham 20-Mar-2005 [167] | so, it's to prevent incoming data being executed ? |
Tomc 20-Mar-2005 [168x2] | you can write a well considered script without taint that is far more secure than a script that passes taint mode by making a simple rule that does not properly catch problems |
you basically have to write a regular expression to accept user input | |
Vincent 20-Mar-2005 [170] | Graham: for your header, like Brett said, parse/all is needed when you work on strings with newlines and spaces. last line should be: parse/all header [header-rule some [ thru "^/" header-rule]] |
BrianW 20-Mar-2005 [171] | Graham, yes, but it's also used in other situations: force the programmer to escape HTML input before printing it back out, massaging data so that it's friendlier for the database, etc. |
Graham 20-Mar-2005 [172] | Yeah, I got that Vincent. Curiously though it has worked without it. |
Tomc 20-Mar-2005 [173x2] | in your example, having a rule more like header-rule: [ "Date:" copy date-rule | "From:" copy email-rule | "Subject:" copy some alpha-num | "To:" copy email-rule | to newline ] where email-rule only matched email addresses would be more taint-like |
and being very careful never to effectively do [user-input] without being sure user-input could not cause unintended side effects | |
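A minimal sketch of that taint-like idea in Rebol 2: accept a value only when the whole string matches a strict rule. The character sets, email-rule and valid-email? below are hypothetical and deliberately narrow; real addresses allow more than this.

```rebol
REBOL [Title: "Whitelist-style check on an address field (sketch)"]

; Deliberately narrow, hypothetical character sets
alpha-num:   charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
local-char:  union alpha-num charset "._+-"
domain-char: union alpha-num charset ".-"

; One or more allowed characters, an at-sign, one or more allowed characters
email-rule: [some local-char "@" some domain-char]

; Accept the input only when the whole string matches the rule;
; parse returns true only if the rule consumes the input to its end
valid-email?: func [text [string!]] [
    parse/all text email-rule
]

print valid-email? "graham@example.com"
print valid-email? "graham@example.com; delete %mail.txt"
```

The first call should print true and the second false, since the semicolon is not in either character set and the rule stops before the end of the input.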
Chris 31-Mar-2005 [175x3] | Not quite sure what to make of the following:
    >> rule: [set w 'pubDate (print w)]
    == [set w 'pubDate (print w)]
    >> parse [pubdate] rule
    pubdate
    == true
    >> parse/case [pubdate] rule
    pubdate
    == true |
First off, would the last result be a bug? | |
Secondly, I'd like to ensure that whether the block is [pubdate] or [pubDate] that 'w stores 'pubDate. I had hoped that as 'pubDate is set in the rule, it might take precedence over pubdate in the block :^( | |
DideC 1-Apr-2005 [178] | I suppose /case only acts on string! |
Gabriele 1-Apr-2005 [179x3] | /case only applies to strings. Chris, you can: |
    >> parse [pubdate] ['pubDate (w: 'pubDate print w)]
    pubDate
    == true | |
but i'm not sure you'll like it. | |
Graham 9-Apr-2005 [182] | should be something easier than this
    like-i: charset [ #"1" #"l" #"L" #"I" #"i" ]
    like-a: charset [ #"a" #"A" #"@" ]
    like-v: charset [ #"\" #"/" #"v" #"V" ]
    cialis: [ "c" like-i like-a 2 like-i "s" ]
    viagra: [ 1 2 like-v like-i like-a "gr" like-a ]
    parse "\/[1-:-gr]@" [ viagra ]
    parse "[c1-:-Lls]" [ cialis ] |