World: r3wp
[Parse] Discussion of PARSE dialect
Chris 12-Mar-2005 [83x2] | ;I tend to use charsets as a way of skipping 'thru while looking for multiple possibilities: line-end: charset [#"^/" #"^M"] ; etc line: complement newlines parse m [copy header [some [some line line-end]]] |
a way of skipping -> "an alternative to" | |
Graham 12-Mar-2005 [85] | is it faster? |
Chris 12-Mar-2005 [86] | It's hard to compare when 'thru doesn't fork. |
Graham 12-Mar-2005 [87] | or is it slower ? |
Chris 12-Mar-2005 [88] | For what it's worth, my benchmarks show it to be quick, but my benchmarks tend to be crude... |
Graham 12-Mar-2005 [89] | line: complement line-end |
Chris 12-Mar-2005 [90] | That means any character that isn't a line-ending... |
Graham 12-Mar-2005 [91] | yes, you have complement newlines |
Chris 12-Mar-2005 [92x2] | Sorry, changing names as I go :o) |
line-end: charset [#"^/" #"^M" #"^J"] line: complement line-end parse m [copy header [some [some line line-end]] to end] ; Note that 'line-end in the parse line should be replaced with permutations of what a line-ending can be, without describing any permutation of a double line-ending. | |
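[Editor's note] A minimal sketch of the charset approach Chris describes, laid out on separate lines. The sample message in `m` and the name `line-char` are assumptions for illustration, and the sample uses bare LF line endings only. Note that `#"^J"` and `#"^/"` are the same character (LF), so listing both is harmless but redundant.

```rebol
REBOL []

; Characters that end a line: LF (^/) and CR (^M).
line-end: charset "^/^M"
; Every other character counts as line content.
line-char: complement line-end

; Hypothetical sample: two header lines, a blank line, then a body.
m: "From: a@example.com^/Subject: hi^/^/body text"

; /all keeps whitespace significant while parsing a string.
parse/all m [
    ; One or more non-empty lines, each followed by one line-end character.
    ; The blank line cannot match "some line-char", so the copy stops there.
    copy header [some [some line-char line-end]]
    to end
]

probe header
```

With this input, `header` should hold the two header lines including their trailing newlines, which is what Graham asks for further down. CR+LF pairs would need the `line-end` step replaced by an explicit rule, as Chris's note says.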
Brett 13-Mar-2005 [94] | Graham, I'd probably use parse/all rather than parse. Also don't forget the parse-header function and all the associated bug fixing work related to it in view 1.3 project. May or may not be of use to you. |
Anton 13-Mar-2005 [95x2] | I don't think you need to distribute the thru. eg: |
parse "..abab..." [".." copy header ["aa" | "bb" | "abab"] (?? header) to end] header: "abab" == true | |
Graham 13-Mar-2005 [97x2] | I'll give this one a go as well. |
But I wanted the line endings in the "header" variable as I send it on to the client. | |
Anton 14-Mar-2005 [99] | The second line is the print out from (?? header) |
Tomc 20-Mar-2005 [100] | Joe: What do you need a Perl-compatible regular expression to do? |
BrianW 20-Mar-2005 [101x2] | He could hand it off to developers that are familiar with pcre but not parse, for starters. |
It wouldn't have to be industrial-strength, but it would be like a security blanket for developers experimenting with the new language. PCRE is found all over the place in languages on Linux machines, and the absence makes some developers uncomfortable - despite the fact that Parse is better. | |
Tomc 20-Mar-2005 [103] | yea but then rebol programs would start getting contaminated with unfriendly gobbledygook and rebol developers would have to learn pcre |
BrianW 20-Mar-2005 [104] | Good point. One article I was thinking would be along the lines of a "phrasebook", translating PCRE concepts to Parse equivalents. |
Graham 20-Mar-2005 [105] | sometimes it is just too hard to get parse working ...an alternative would be nice |
BrianW 20-Mar-2005 [106] | What about a parse rule that takes pcre strings as input and produces a parse rule as output? |
Graham 20-Mar-2005 [107] | I've got this rule to parse email headers which only works some of the time. header-rule: [ thru "^/Date:" copy m-date to newline | thru "^/From:" copy m-from to newline | thru "^/Subject:" copy m-subject to newline | thru "^/To:" copy m-to to newline | thru "^/Return-path: " ] m-subject: m-date: m-from: m-to: none parse header [some header-rule] |
Tomc 20-Mar-2005 [108x2] | I am not totally against REs. I use them all the time in shells, and having them built in would make writing "work alike" programs easier, but overall it seems to me like a step down |
(you can guarantee the order in which the header lines come?) | |
BrianW 20-Mar-2005 [110] | No, order may vary. |
Graham 20-Mar-2005 [111] | no, that's why I use "some" |
Tomc 20-Mar-2005 [112] | but it will only work when order is the same |
Graham 20-Mar-2005 [113] | I was under the impression that it would keep applying the rule ... |
Tomc 20-Mar-2005 [114] | thru "^/To:" goes thru to "To:" even if you bypass other valid lines to get there |
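[Editor's note] A small sketch of the behaviour Tomc is pointing at, cut down to two fields. The sample `header` string is an assumption for illustration. Because the first alternative is tried first on every pass, `thru "^/Date:"` jumps over the earlier "From:" line, and no later pass can go back for it.

```rebol
REBOL []

; Hypothetical sample: From: comes BEFORE Date:
header: "^/From: a@example.com^/Date: 20-Mar-2005^/"

m-date: m-from: none

parse/all header [
    some [
          thru "^/Date:" copy m-date to newline
        | thru "^/From:" copy m-from to newline
    ]
]

probe m-date   ; " 20-Mar-2005"
probe m-from   ; none, the From: line was skipped over
```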
BrianW 20-Mar-2005 [115] | So how would he say "Any of these in any order?" |
Vincent 20-Mar-2005 [116] | you should only go 'thru the common line start. try something like: header-rule: [ "Date:" copy ... to newline | "From:" copy ... to newline | ... ] parse header [some [thru "^/" header-rule]] |
Tomc 20-Mar-2005 [117] | It will take me a few minutes to make this concrete, but basically you work with what is common to all lines, in this case colons and newlines |
Graham 20-Mar-2005 [118] | Hmm. Works so far :) |
Vincent 20-Mar-2005 [119] | sorry, I missed a problem in this expression: the header must start with a newline, so parse header [header-rule some [thru "^/" header-rule]] is better |
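[Editor's note] A sketch of Vincent's corrected rule with the elided parts filled in from Graham's original field names. The sample `header` string is an assumption. One addition that is not in the thread: a final `none` alternative, so that an unrecognised line (here "X-Other:") does not make `header-rule` fail and end the `some` loop early.

```rebol
REBOL []

; Hypothetical sample header; field order is deliberately shuffled.
header: "Return-path: a@example.com^/Subject: hello^/X-Other: skip me^/From: a@example.com^/Date: 20-Mar-2005^/To: b@example.com^/"

m-date: m-from: m-subject: m-to: none

; Each alternative is anchored at the start of a line.
header-rule: [
      "Date:"    copy m-date    to newline
    | "From:"    copy m-from    to newline
    | "Subject:" copy m-subject to newline
    | "To:"      copy m-to      to newline
    | "Return-path:"
    | none   ; editor's assumption: accept and ignore any other line
]

; First line has no leading newline, so try the rule once before looping.
parse/all header [header-rule some [thru "^/" header-rule]]

; Copied values keep the space after the colon; trim removes it.
foreach value reduce [m-date m-from m-subject m-to] [
    if value [print trim value]
]
```

Only `thru "^/"` advances between attempts, so every line start is offered to `header-rule` exactly once, whatever order the fields arrive in.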
Tomc 20-Mar-2005 [120x3] | might have to be careful if you want the first line |
ahh | |
got it, you caught it | |
Graham 20-Mar-2005 [123x3] | Ahh...well, invariably the first line of the header is "Return-path:" so it's not a problem. |
invariably, because if it's not there, I add it! | |
Thanks, I should have asked much sooner rather than struggling with it. | |
Tomc 20-Mar-2005 [126] | what about when you get novel headers? do you care? |
Graham 20-Mar-2005 [127x3] | no, these are the ones I display when reading an email ... |
If the user requests full header display, I just show them raw. | |
I need the "^/... as nowadays, there's email coming thru with authentication signatures that contain the headers in a block | |
Tomc 20-Mar-2005 [130] | so once you have done some header-lines and got the ones you are interested in you skip the rest with thru "^/^/" |
Graham 20-Mar-2005 [131x2] | actually, I copy the header and body out first and process them separately. |
parse msg [ copy header thru {^/^/} copy body to end ] | |
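[Editor's note] A sketch of the header/body split Graham describes, using the blank line (two consecutive newlines) as the separator. The sample `msg` is an assumption, and it presumes line endings have already been normalised to LF.

```rebol
REBOL []

; Hypothetical sample message: header lines, blank line, body.
msg: "Return-path: a@example.com^/Subject: hi^/^/Hello there.^/"

parse/all msg [
    copy header thru "^/^/"   ; everything up to and including the blank line
    copy body to end          ; the rest is the body
]

probe header
probe body
```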