World: r3wp
[Parse] Discussion of PARSE dialect
Graham 20-Mar-2005 [144] | How ? |
Vincent 20-Mar-2005 [145] | header-rule: [ "Date:" copy m-date to newline | "From:" copy m-from to newline | "Subject:" copy m-subject to newline | "To:" copy m-to to newline | to newline ] m-subject: m-date: m-from: m-to: none parse header [header-rule some [ thru "^/" header-rule]] |
Graham 20-Mar-2005 [146] | oh, I see ... |
Vincent 20-Mar-2005 [147] | else Date: ... X-Something: ... ; break the rule To: ... From: ... |
Brett 20-Mar-2005 [148] | If you are testing "^/" I would think that you need to use parse/all. You may find my script helpful for visualising the effect of your rules: http://www.rebol.org/cgi-bin/cgiwrap/rebol/documentation.r?script=parse-analysis-view.r |
Vincent 20-Mar-2005 [149] | oops - you're right, I missed the big one. |
Graham 20-Mar-2005 [150] | so, is PCRE easier to understand ?? |
Tomc 20-Mar-2005 [151] | $&^*&#&%(*_&$*@#@ |
Graham 20-Mar-2005 [152] | looks like perl |
Tomc 20-Mar-2005 [153] | that is just random chars, not a PCRE for parsing mail headers |
Graham 20-Mar-2005 [154x2] | Oh :) |
I was just attempting to bring the subject back on topic before I interrupted it. | |
Tomc 20-Mar-2005 [156] | that was not an interruption, more like exactly what this group is for |
Graham 20-Mar-2005 [157] | since I have no idea what pcre was .. |
Tomc 20-Mar-2005 [158x5] | . match any single char but newline |
* 0 or more of the preceding | |
() put in var $n [n = 1, 2, 3 ...] | |
/To: (.*)/ | |
$1 has to whom the email is addressed | |
Graham 20-Mar-2005 [163] | While we're here .. what is this taint thing that Perl has, and is it a concern for Rebol ? |
Tomc 20-Mar-2005 [164] | tainting forces you to consider the users input and explicitly allow it to pass |
Anton 20-Mar-2005 [165] | I think only people who miss it want it. :) |
BrianW 20-Mar-2005 [166] | Taint mode tells Perl that you aren't sure whether your incoming data is safe. It's just a shortcut for enforcing commonsense programming. |
Graham 20-Mar-2005 [167] | so, it's to prevent incoming data being executed ? |
Tomc 20-Mar-2005 [168x2] | you can write a well considered script without taint that is far more secure than a script that passes taint mode by making a simple rule that does not properly catch problems |
you basically have to write a regular expression to accept user input | |
Vincent 20-Mar-2005 [170] | Graham: for your header, like Brett said, parse/all is needed when you work on strings with newlines and spaces. last line should be: parse/all header [header-rule some [ thru "^/" header-rule]] |
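Vincent's corrected call can be tried on a sample header. This is a minimal sketch, assuming REBOL 2 parse semantics. The `header` text and the `ok` word are made up for illustration, and the trailing `to end` is an addition that is not in Vincent's rule: the `some` loop stops at the final newline, so without it parse would report false even though the fields were captured.

```rebol
rebol []

; Hypothetical sample header. Every line, including the last one, ends with
; a newline, because each alternative relies on "to newline".
header: {Date: Sun, 20 Mar 2005 10:00:00 +0000
X-Something: ignored
From: graham@example.com
Subject: parse question
To: list@example.com
}

header-rule: [
      "Date:"    copy m-date    to newline
    | "From:"    copy m-from    to newline
    | "Subject:" copy m-subject to newline
    | "To:"      copy m-to      to newline
    | to newline                          ; any other header line is skipped
]

m-subject: m-date: m-from: m-to: none

; /all keeps spaces and newlines significant instead of treating them as delimiters
ok: parse/all header [header-rule some [thru "^/" header-rule] to end]

print ok
print trim m-from      ; captured values keep the space after the colon, hence trim
print trim m-to
```

If the last header line has no trailing newline, its `to newline` fails and that field stays `none`, so the input may need a newline appended first.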
BrianW 20-Mar-2005 [171] | Graham, yes, but it's also used in other situations: force the programmer to escape HTML input before printing it back out, massaging data so that it's friendlier for the database, etc. |
Graham 20-Mar-2005 [172] | Yeah, I got that Vincent. Curiously though it has worked without it. |
Tomc 20-Mar-2005 [173x2] | in your example, having a rule more like header-rule: [ "Date:" copy m-date date-rule | "From:" copy m-from email-rule | "Subject:" copy m-subject some alpha-num | "To:" copy m-to email-rule | to newline ] where email-rule only matched email addresses would be more taint-like |
and being very careful to never effectively do [ user-input ] without being sure user-input could not cause unintended side effects | |
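What Tomc describes, accepting a field only when it matches an explicit whitelist rule, might look like the sketch below. The character sets, the deliberately narrow `email-rule`, and the helper name `checked-from` are assumptions for illustration, not a complete address grammar.

```rebol
rebol []

alpha-num:   charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
local-char:  union alpha-num charset "._+-"
domain-char: union alpha-num charset ".-"

; Hypothetical, deliberately narrow rule: local part, "@", domain part.
email-rule: [some local-char "@" some domain-char]

; Returns the address only if the whole line matches, otherwise none.
; COPY sets addr as soon as email-rule matches, even when the trailing END
; then fails, so the result of PARSE is checked before addr is used.
checked-from: func [line [string!] /local addr][
    either parse/all line ["From:" any " " copy addr email-rule end][addr][none]
]

probe checked-from "From: graham@example.com"
probe checked-from "From: graham@example.com; rm -rf /"
```

The second call has trailing characters outside the whitelist, so the rule does not reach `end` and the helper yields `none` rather than a partially validated value.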
Chris 31-Mar-2005 [175x3] | Not quite sure what to make of the following: >> rule: [set w 'pubDate (print w)] == [set w 'pubDate (print w)] >> parse [pubdate] rule pubdate == true >> parse/case [pubdate] rule pubdate == true |
First off, would the last result be a bug? | |
Secondly, I'd like to ensure that whether the block is [pubdate] or [pubDate] that 'w stores 'pubDate. I had hoped that as 'pubDate is set in the rule, it might take precedence over pubdate in the block :^( | |
DideC 1-Apr-2005 [178] | I suppose /case only acts on string! |
Gabriele 1-Apr-2005 [179x3] | /case only applies to strings. Chris, you can: |
>> parse [pubdate] ['pubDate (w: 'pubDate print w)] pubDate == true | |
but i'm not sure you'll like it. | |
Graham 9-Apr-2005 [182x2] | should be something easier than this like-i: charset [ #"1" #"l" #"L" #"I" #"i" ] like-a: charset [ #"a" #"A" #"@" ] like-v: charset [ #"\" #"/" #"v" #"V" ] cialis: [ "c" like-i like-a 2 like-i "s" ] viagra: [ 1 2 like-v like-i like-a "gr" like-a ] parse "\/[1-:-gr]@" [ viagra ] parse "[c1-:-Lls]" [ cialis ] |
hmm.. altme converts my double quote to a single quote | |
Gabriele 9-Apr-2005 [184] | maybe use charset "1lLIi" to avoid that much typing ;) |
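Gabriele's suggestion applied to Graham's rules could look like this. It is a sketch assuming REBOL 2; the two test strings are made up, since the ones in Graham's message were altered by the chat client.

```rebol
rebol []

; Same character classes as Graham's, built from strings instead of char lists
like-i: charset "1lLIi"
like-a: charset "aA@"
like-v: charset "\/vV"      ; backslash is not an escape character in REBOL strings

cialis: ["c" like-i like-a 2 like-i "s"]
viagra: [1 2 like-v like-i like-a "gr" like-a]

; Hypothetical obfuscated spellings
print parse/all "\/1@gr@" viagra
print parse/all "c1@Lls" cialis
```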
Anton 9-Apr-2005 [185] | Graham, the link width is slightly incorrect, so it obscures half of the double quote, so it looks like a single. |
Tomc 28-Apr-2005 [186x4] | flatten: func [b [block!] /local flat rule x token][ flat: copy[] rule: [ some[ [x: block! (parse first :x rule)] | [copy token any-type! (append flat token)] ] ] parse b rule flat ] |
without the recursive call to parse | |
flatten: func [b [block!] /local flat rule x token][ flat: copy[] rule: [some[[x: block! :x into rule] | [copy token any-type! (append flat token)]]] parse b rule flat ] | |
a flatten that changed its block in place would be useful at times | |
Gregg 30-Apr-2005 [190] | Something like this? (it's not parse based though) flatten: func [block] [ head forall block [ if block? block/1 [change/part block block/1 1] ] ] |
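The two approaches in this exchange, a parse-based flatten that builds a new block and a flatten that edits its argument in place, can be compared side by side. Both functions below are sketches assuming REBOL 2. They are not Tomc's or Gregg's code; the name `flatten-in-place` and the local words are made up. The parse version uses `any` so that empty nested blocks simply vanish, and the in-place version re-examines the same position after each splice so that deeper nesting is also unwrapped.

```rebol
rebol []

; Parse-based variant: returns a new flat block, leaves the input untouched.
flatten: func [b [block!] /local flat rule pos value][
    flat: copy []
    rule: [
        any [
            pos: block! :pos into rule                    ; descend into nested blocks
            | set value skip (append/only flat :value)    ; collect any other value
        ]
    ]
    parse b rule
    flat
]

; In-place variant: splices nested blocks into the block itself.
flatten-in-place: func [block [block!] /local pos][
    pos: block
    while [not tail? pos][
        either block? first pos [
            change/part pos first pos 1    ; replace the nested block by its contents
        ][
            pos: next pos                  ; advance only past non-block values
        ]
    ]
    block
]

probe flatten [1 [2 [3 4] []] "five"]
probe flatten-in-place copy/deep [1 [2 [3 4] []] "five"]
```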
Robert 5-Jun-2005 [191x2] | I have a problem with parse not terminating the parsing. Here is my code for parsing CamelCase words: rebol [] ; CamelCase Test test-text: "FirstWord test. This is a CamelCase test Text. CamelCase2 is the base idea for a WiKi. CamelcasE" upper-case: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ" delimiters: charset " .,;|^-^/" rest-chars: complement union upper-case delimiters text: "" parse/all/case test-text [ some [ copy camelcase-word [upper-case some rest-chars upper-case any rest-chars] ( if not empty? text [?? text clear text] print ["CamelCase word found:" camelcase-word] ) | copy flowtext [any [rest-chars | upper-case] any delimiters] ( append text flowtext ) ] ] halt |
Any idea why parse doesn't return? | |
sqlab 5-Jun-2005 [193] | [any [rest-chars | upper-case] any delimiters] is always true, even if there is no char left at the end. But it does not move the cursor. |
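Following sqlab's diagnosis, one way to make the loop terminate is to ensure that every alternative inside `some` consumes at least one character. A minimal sketch, assuming REBOL 2; the `found` block and the `done` flag are additions for illustration, and the flow-text handling from Robert's script is left out to keep the example short.

```rebol
rebol []

test-text: "FirstWord test. This is a CamelCase test Text. CamelCase2 is the base idea for a WiKi. CamelcasE"

upper-case: charset [#"A" - #"Z"]
delimiters: charset " .,;|^-^/"
rest-chars: complement union upper-case delimiters

found: copy []

done: parse/all/case test-text [
    some [
        copy camelcase-word [upper-case some rest-chars upper-case any rest-chars]
        (append found camelcase-word)
        ; The two alternatives below use SOME rather than ANY for their first
        ; element, so neither can succeed on an empty match at the end of input.
        | some [rest-chars | upper-case] any delimiters
        | some delimiters
    ]
]

print done
probe found
```

With the original `any [rest-chars | upper-case] any delimiters`, the second alternative also matches the empty string at the tail, so `some` keeps succeeding without moving forward.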