World: r3wp
[Parse] Discussion of PARSE dialect
Graham 9-Apr-2005 [183] | hmm.. altme converts my double quote to a single quote |
Gabriele 9-Apr-2005 [184] | maybe use charset "1lLIi" to avoid that much typing ;) |
Anton 9-Apr-2005 [185] | Graham, the link width is slightly incorrect, so it obscures half of the double quote, so it looks like a single. |
Tomc 28-Apr-2005 [186x4] |
    flatten: func [b [block!] /local flat][
        flat: copy []
        rule: [
            some [
                [x: block! (parse first :x rule)]
                | [copy token any-type! (append flat token)]
            ]
        ]
        parse b rule
        flat
    ] |
without the recursive call to parse | |
    flatten: func [b [block!] /local flat rule x][
        flat: copy []
        rule: [some [[x: block! :x into rule] | [copy token any-type! (append flat token)]]]
        parse b rule
        flat
    ] | |
a flatten that changed its block in place would be useful at times | |
Gregg 30-Apr-2005 [190] | Something like this? (it's not parse based though)
    flatten: func [block] [
        head forall block [
            if block? block/1 [change/part block block/1 1]
        ]
    ] |
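A minimal sketch of the in-place idea for arbitrarily nested blocks, assuming REBOL 2 semantics for change/part. The name flatten-deep and the explicit position variable are illustrative assumptions, not code from the thread. The position only advances when the current value is not a block, so values spliced in are re-checked and empty blocks are simply removed.

```rebol
flatten-deep: func [
    "Flatten nested blocks in place (hypothetical helper, not from the thread)."
    block [block!]
    /local pos
][
    pos: block
    while [not tail? pos] [
        either block? first pos [
            ; Replace the nested block with its contents and stay at the
            ; same position, so the spliced values are examined as well.
            change/part pos first pos 1
        ][
            pos: next pos
        ]
    ]
    block
]

probe flatten-deep [1 [2 [3 []] 4] [[5]]]
```

A block that contains itself would never finish, so this is only suitable for plain tree-shaped data.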
Robert 5-Jun-2005 [191x2] | I have a problem with parse not terminating the parsing. Here is my code for parsing CamelCase words:
    rebol []
    ; CamelCase Test
    test-text: "FirstWord test. This is a CamelCase test Text. CamelCase2 is the base idea for a WiKi. CamelcasE"
    upper-case: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    delimiters: charset " .,;|^-^/"
    rest-chars: complement union upper-case delimiters
    text: ""
    parse/all/case test-text [
        some [
            copy camelcase-word [upper-case some rest-chars upper-case any rest-chars] (
                if not empty? text [?? text clear text]
                print ["CamelCase word found:" camelcase-word]
            )
            | copy flowtext [any [rest-chars | upper-case] any delimiters] (
                append text flowtext
            )
        ]
    ]
    halt |
Any idea why parse doesn't return? | |
sqlab 5-Jun-2005 [193] | [any [rest-chars | upper-case] any delimiters] is always true, even if there is no char left at the end. But it does not move the cursor. |
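A small sketch of the point above, assuming REBOL 2 parse behaviour. The names letters, digits, start, stop and moved? are illustrative assumptions. It records the input position before and after an any rule that finds nothing to match: the rule still succeeds, but the two positions are identical, which is why an enclosing some has no reason to stop.

```rebol
letters: charset [#"a" - #"z"]
digits: charset "0123456789"

matched: parse/all "abc" [
    ; "any digits" finds no digits here: it succeeds but consumes nothing.
    start: any digits stop:
    (moved?: not equal? index? start index? stop)
    ; Consume the rest so that the whole parse can complete.
    some letters
]

print ["rule succeeded:" matched "cursor moved:" moved?]
```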
Tomc 5-Jun-2005 [194] |
    rebol []
    ; CamelCase Test
    test-text: "FirstWord test. This is a CamelCase test Text. CamelCase2 is the base idea for a WiKi. CamelcasE"
    upper-case: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    delimiter: charset " .,;|^-^/"
    rest-char: complement union upper-case delimiter
    text: copy ""
    camelcase-rule: [some [upper-case some rest-char upper-case any rest-char] delimiter]
    parse/all/case test-text [
        some [
            copy camelcase-word camelcase-rule (
                if not empty? text [?? text clear text]
                print ["CamelCase word found: " camelcase-word]
            )
            | copy flowtext upper-case (append text flowtext)
            | copy flowtext [any [rest-char | delimiter]] (append text flowtext)
        ]
    ]
    halt |
Graham 5-Jun-2005 [195x2] | what about camelCAse? |
Personally I prefer the way mediawiki does it ... using [[ .. ]] ... instead of having strange cases in words | |
Tomc 5-Jun-2005 [197] | yes, I was also concerned about A1CamelCase but figured Robert just needed to get thru his first question first |
Robert 6-Jun-2005 [198x2] | Thanks for the fix. Sometimes it helps to get some distance by asking others :-)) |
I like CamelCase words. Simple to remember and use. IIRC camelCAse is not a valid CamelCase word. But anyway, it depends how I teach my users :-)) | |
Graham 6-Jun-2005 [200] | http://en.wikipedia.org/wiki/CamelCase... CamelCase is referred to as UpperCamelCase, and camelCase is referred to as lowerCamelCase |
Robert 6-Jun-2005 [201] | Tom, your example doesn't terminate, like mine. The thing IMO is that the last word is a CamelCase word and the 'end condition is somehow missed. It never reaches the halt. |
sqlab 6-Jun-2005 [202] | If you do not want to change the parse rules, you can just add if not flowtext [halt] before append text flowtext |
Robert 6-Jun-2005 [203] | I can change the parse rules. This is just a test script, the rule needs to be included in a broader parsing engine. So, it must return TRUE. |
Tomc 6-Jun-2005 [204] | Robert, you also have to worry about YaBaDaBaDoCamelCases (even and odd). To get it to return true, figure out what is left when the outermost some finishes:
    parse ...[ some [ ... ] copy remenant to end (print remenant) ]
then make your rule consume the remnant. OK, if you don't care, just put a to end there. |
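A minimal sketch of this debugging step, assuming REBOL 2. The sample string and the names letters, remnant and ok are illustrative assumptions. The trailing copy ... to end captures whatever the main loop did not consume and also lets parse return true, so the leftover text shows what the rule still has to handle.

```rebol
letters: charset [#"a" - #"z" #"A" - #"Z"]
remnant: none

ok: parse/all "CamelCase 123" [
    some letters
    ; Capture whatever the loop above could not consume.
    ; In REBOL 2 the copied value is none when nothing is left.
    copy remnant to end
    (print ["left over:" mold remnant])
]

print ["parse returned:" ok]
```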
sqlab 7-Jun-2005 [205x2] | You can either put your parse in a catch [] and throw a true if not flowtext, or something like this:
    parse/all/case test-text [
        some [
            copy camelcase-word [upper-case some rest-chars upper-case any rest-chars] (
                if not empty? text [?? text clear text]
                print ["CamelCase word found:" camelcase-word]
            )
            | copy flowtext [some [rest-chars | upper-case] any delimiters] (
                append text flowtext
            )
            | copy flowtext [some delimiters] (
                append text flowtext
            )
        ]
        to end
    ] |
addendum/corrected ] to end (if not empty? text [?? text]) | |
MichaelAppelmans 7-Jun-2005 [207x2] | getting the following error when running Didec's delete email script against a mailbox with a large number of emails (250+): internal limit reached: parse Near: [parse data maillist addr-list] Where: parse-mail-list |
is this a rebol internal limit or should I start debugging? | |
Graham 7-Jun-2005 [209x2] | probably a parse limitation |
I think I've opened up mailboxes with over 400 emails before using Cerebrus' mailbox manager with no problems | |
MichaelAppelmans 7-Jun-2005 [211] | oh well. thanks :) |
Graham 7-Jun-2005 [212] | what you could do, is extract Didier's implementation of the TOP command, and then get the first line of each header in your mailbox. If it has the return-path set to <>, then note it in a list. When finished, go thru and issue deletes on all of those. |
MichaelAppelmans 7-Jun-2005 [213] | thanks! I'll have a look at that. |
Gabriele 7-Jun-2005 [214] | is the To: line very, very long? there's a recursion limit in the parser for the address list. since you are probably not interested in parsing the To: header, maybe you can disable it in import-email. |
Robert 8-Jun-2005 [215x2] | Hmm... my parse still does not terminate the 'some part. I never reach the end. The problem is that the rest of the string is "" and this seems not to be handled. |
Ok, got it. Now it works. | |
MichaelAppelmans 9-Jun-2005 [217x2] | newbie here: can anyone direct me to a sample of code which matches a pattern over multiple text lines? I need to process a 5MB text file and remove all patterns of multiple consecutive email addresses, e.g. [foo-:-my-:-com]; [foou-:-you-:-net], except the multiple email address string spans multiple lines. Thanks for any pointers |
and the multiple email address string occurs multiple times ;) | |
Brock 9-Jun-2005 [219x2] | Michael, this is going to be a very general response to your request. Review setting up parse rules and use something like... parse text [any [rule1 | rule2 | rule3]] |
Here's some Parse documentation from the Rebol/Core guide to get you started. Share your progress and questions, maybe a line of sample data or two and maybe I can be of more help. | |
Brock 10-Jun-2005 [221] | link would help!!! http://www.rebol.com/docs/core23/rebolcore-15.html |
MichaelAppelmans 11-Jun-2005 [222] | thanks Brock.. I wound up doing it in Perl, as I'm more familiar with its regex support. The problem always seems to be that I'm in a hurry with crisis response, which is not a good learning environment ;) |
Volker 11-Jun-2005 [223] | Sometimes it helps to parse in two steps: a loop for each line-group, and parsing that group separately, because then 'to/'thru work better. |
Ammon 16-Jun-2005 [224] | Can anyone give me some insight on how to use Brett's visual parse tools? |
MichaelB 16-Jun-2005 [225] | Can somebody explain to me why 'parse fails in the first case and returns true in the second case?
    r: [any into ['a (print 'a)]]
    t: [[a][a][a]]
    print parse t r
    -> a a a false
    r: [any [into ['a (print 'a)]]]
    print parse t r
    -> a a a true
In the second 'r(ule) the additional [ ] make it kind of explicit, but shouldn't the first version return true as well? Am I forgetting something about what 'parse "thinks" when looking at the first 'r(ule)? Thanks for hints. :-) |
Ladislav 16-Jun-2005 [226] | this really looks like violating the principle of least surprise, I suggest you submit it to Rambo |
Romano 16-Jun-2005 [227] | MichaelB, It is a parse bug for me. |
MichaelB 17-Jun-2005 [228] | As Ladislav suggested, I put it to Rambo. Thanks. |
Pekr 22-Jun-2005 [229] | I have CSV file and I have trouble using parse one-liner. The case is, that I export tel. list from Lotus Notes, then I save it in Excel into .csv for rebol to run thru. I wanted to use: foreach line ln-tel-list [append result parse/all line ";"] ... and I expected all lines having 7 elements. However - once last column is missing, that row is incorrect, as rebol parse will not add empty "" at the end. That is imo a bug ... |
Gabriele 22-Jun-2005 [230x2] | btw, why /all? shouldn't excel surround elements with spaces with quotes? |
anyway, is it really a problem that the last column is missing? | |
Pekr 22-Jun-2005 [232] | I used csv = semicolon separated values and no quotes in there ... |
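One possible workaround for the missing trailing column, sketched with an explicit rule instead of the one-line string form of parse. The helper name split-fields is a hypothetical illustration, and the sketch assumes unquoted, semicolon-separated values as described above. Every separator produces a field, and whatever follows the last separator (possibly nothing) becomes the final field, so a line with six semicolons always yields seven elements.

```rebol
split-fields: func [
    "Split LINE on semicolons, keeping empty fields (hypothetical helper)."
    line [string!]
    /local fields field
][
    fields: copy []
    parse/all line [
        ; One field per separator; REBOL 2 sets FIELD to none for an empty match.
        any [copy field to ";" skip (append fields any [field copy ""])]
        ; Whatever follows the last separator, possibly nothing at all.
        copy field to end (append fields any [field copy ""])
    ]
    fields
]

probe split-fields "a;b;;"
```

Because each call returns a block, append/only may be preferable when collecting rows, so that every line stays a separate record.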