World: r3wp

[Parse] Discussion of PARSE dialect

Graham 9-Apr-2005 [183]	hmm.. altme converts my double quote to a single quote
Gabriele 9-Apr-2005 [184]	maybe use charset "1lLIi" to avoid that much typing ;)
Anton 9-Apr-2005 [185]	Graham, the link width is slightly incorrect, so it obscures half of the double quote, so it looks like a single.
Tomc 28-Apr-2005 [186x4]	flatten: func [b [block!] /local flat][ flat: copy[] rule: [ some[ [x: block! (parse first :x rule)] | [copy token any-type! (append flat token)] ] ] parse b rule flat ]
	without the recursive call to parse
	flatten: func [b [block!] /local flat rule x][ flat: copy[] rule: [some[[x: block! :x into rule] | [copy token any-type! (append flat token)]]] parse b rule flat ]
	a flatten that changed its block in place would be useful at times
Gregg 30-Apr-2005 [190]	Something like this? (it's not parse based though) flatten: func [block] [ head forall block [ if block? block/1 [change/part block block/1 1] ] ]
Robert 5-Jun-2005 [191x2]	I have a problem with parse not terminating the parsing. Here is my code for parsing CamelCase words:
	rebol []
	; CamelCase Test
	test-text: "FirstWord test. This is a CamelCase test Text. CamelCase2 is the base idea for a WiKi. CamelcasE"
	upper-case: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	delimiters: charset " .,;|^-^/"
	rest-chars: complement union upper-case delimiters
	text: ""
	parse/all/case test-text [
	    some [
	        copy camelcase-word [upper-case some rest-chars upper-case any rest-chars] (
	            if not empty? text [?? text clear text]
	            print ["CamelCase word found:" camelcase-word]
	        )
	        | copy flowtext [any [rest-chars | upper-case] any delimiters] (
	            append text flowtext
	        )
	    ]
	]
	halt
Robert 5-Jun-2005 [191x2]	Any idea why parse doesn't return?
sqlab 5-Jun-2005 [193]	[any [rest-chars | upper-case] any delimiters] is always true, even if there is no char left at the end. But it does not move the cursor.
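
A minimal sketch of sqlab's point, assuming REBOL 2 parse semantics. The charsets, the sample string and the names `letters`, `spaces`, `words` and `w` are made up for illustration and are not from Robert's script. Every alternative inside the outer `some` uses `some` rather than `any`, so each pass has to consume at least one character and the loop cannot keep succeeding at the tail of the input:

```rebol
letters: charset [#"a" - #"z" #"A" - #"Z"]
spaces:  charset " .,;"

words: copy []

; Both alternatives use SOME, so neither can match the empty string.
; When the input is exhausted both fail, the outer SOME stops,
; and PARSE can report whether the end of the input was reached.
ok?: parse/all "one two. three" [
    some [
        copy w some letters (append words w)
        | some spaces
    ]
]

probe ok?
probe words
```

Swapping either inner `some` for `any` brings back the situation sqlab describes: a rule that succeeds without moving the cursor.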
Tomc 5-Jun-2005 [194]	rebol []
	; CamelCase Test
	test-text: "FirstWord test. This is a CamelCase test Text. CamelCase2 is the base idea for a WiKi. CamelcasE"
	upper-case: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	delimiter: charset " .,;|^-^/"
	rest-char: complement union upper-case delimiter
	text: copy ""
	camelcase-rule: [some [upper-case some rest-char upper-case any rest-char] delimiter]
	parse/all/case test-text [
	    some [
	        copy camelcase-word camelcase-rule (
	            if not empty? text [?? text clear text]
	            print ["CamelCase word found: " camelcase-word]
	        )
	        | copy flowtext upper-case (append text flowtext)
	        | copy flowtext [any [rest-char | delimiter]] (append text flowtext)
	    ]
	]
	halt
Graham 5-Jun-2005 [195x2]	what about camelCAse?
Graham 5-Jun-2005 [195x2]	Personally I prefer the way mediawiki does it ... using [[ .. ]] ... instead of having strange cases in words
Tomc 5-Jun-2005 [197]	yes, I was also concerned about A1CamelCase but figured Robert just needed to get thru his first question first
Robert 6-Jun-2005 [198x2]	Thanks for the fix. Sometimes it helps to get some distance by asking others :-))
Robert 6-Jun-2005 [198x2]	I like CamelCase words. Simple to remember and use. IIRC camelCAse is not a valid CamelCase word. But anyway, it depends how I teach my users :-))
Graham 6-Jun-2005 [200]	http://en.wikipedia.org/wiki/CamelCase... CamelCase is also referred to as UpperCamelCase, and camelCase is referred to as lowerCamelCase
Robert 6-Jun-2005 [201]	Tom, your example doesn't terminate, like mine. The thing IMO is that the last word is a CamelCase word and the 'end condition is somehow missed. It never reaches the halt.
sqlab 6-Jun-2005 [202]	If you do not want to change the parse rules, you can just add if not flowtext [halt] before append text flowtext
Robert 6-Jun-2005 [203]	I can change the parse rules. This is just a test script, the rule needs to be included in a broader parsing engine. So, it must return TRUE.
Tomc 6-Jun-2005 [204]	Robert, you also have to worry about YaBaDaBaDoCamelCases (even and odd). To get it to return true, figure out what is left when the outermost some finishes: parse ...[ some [ ... ] copy remnant to end (print remnant) ] then make your rule consume the remnant. Or, if you don't care, just put a to end there.
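
A small sketch of the debugging trick Tomc describes, assuming REBOL 2. The input string and the names `digits`, `leftover` and `ok?` are hypothetical. The main loop deliberately stops early, and `copy ... to end` shows what it failed to consume while still letting parse reach the end of the input:

```rebol
digits: charset "0123456789"
leftover: none

ok?: parse/all "123abc" [
    some digits
    ; capture whatever the loop above did not consume
    copy leftover to end (print ["left over:" mold leftover])
]
```

Once the leftover text is visible, the rule can be extended to handle it, or a plain `to end` can stay in place if the tail does not matter.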
sqlab 7-Jun-2005 [205x2]	You can either put your parse in a catch [] and throw a true if not flowtext or something like this
	parse/all/case test-text [
	    some [
	        copy camelcase-word [upper-case some rest-chars upper-case any rest-chars] (
	            if not empty? text [?? text clear text]
	            print ["CamelCase word found:" camelcase-word]
	        )
	        | copy flowtext [some [rest-chars | upper-case] any delimiters] (append text flowtext)
	        | copy flowtext [some delimiters] (append text flowtext)
	    ]
	    to end
	]
sqlab 7-Jun-2005 [205x2]	addendum/corrected ] to end (if not empty? text [?? text])
MichaelAppelmans 7-Jun-2005 [207x2]	getting the following error when running Didec's delete email script against a mailbox with a large number of emails (250+): internal limit reached: parse Near: [parse data maillist addr-list] Where: parse-mail-list
MichaelAppelmans 7-Jun-2005 [207x2]	is this a rebol internal limit or should I start debugging?
Graham 7-Jun-2005 [209x2]	probably a parse limitation
Graham 7-Jun-2005 [209x2]	I think I've opened up mailboxes with over 400 emails before using Cerebrus' mailbox manager with no problems
MichaelAppelmans 7-Jun-2005 [211]	oh well. thanks :)
Graham 7-Jun-2005 [212]	what you could do, is extract Didier's implementation of the TOP command, and then get the first line of each header in your mailbox. If it has the return-path set to <>, then note it in a list. When finished, go thru and issue deletes on all of those.
MichaelAppelmans 7-Jun-2005 [213]	thanks! I'll have a look at that.
Gabriele 7-Jun-2005 [214]	is the To: line very, very long? there's a recursion limit in the parser for the address list. since you are probably not interested in parsing the To: header, maybe you can disable it in import-email.
Robert 8-Jun-2005 [215x2]	Hmm... my parse still does not terminate the 'some part. I never reach the end. The problem is that the rest of the string is "" and this seems not to be handled.
Robert 8-Jun-2005 [215x2]	Ok, got it. Now it works.
MichaelAppelmans 9-Jun-2005 [217x2]	newbie here: can anyone direct me to a sample of code which matches a pattern over multiple text lines? I need to process a 5MB text file and remove all patterns of multiple consecutive email addresses, e.g. [foo-:-my-:-com]; [foou-:-you-:-net] except the multiple email address string spans multiple lines. Thanks for any pointers
MichaelAppelmans 9-Jun-2005 [217x2]	and the multiple email address string occurs multiple times ;)
Brock 9-Jun-2005 [219x2]	Michael, this is going to be a very general response to your request. Review setting up parse rules and use something like... parse text [any [rule1 | rule2 | rule3]]
Brock 9-Jun-2005 [219x2]	Here's some Parse documentation from the Rebol/Core guide to get you started. Share your progress and questions, maybe a line of sample data or two and maybe I can be of more help.
Brock 10-Jun-2005 [221]	link would help!!! http://www.rebol.com/docs/core23/rebolcore-15.html
MichaelAppelmans 11-Jun-2005 [222]	thanks Brock.. I wound up doing it in Perl, as I'm more familiar with its regex support. The problem always seems to be that I'm in a hurry with crisis response, which is not a good learning environment ;)
Volker 11-Jun-2005 [223]	Sometimes it helps to parse in two steps: a loop for each line-group, and parsing that group separately, because then 'to/'thru work better.
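
A rough sketch of the two-step approach Volker mentions, assuming REBOL 2 and assuming record groups are separated by a blank line. The sample text, the "mail: " field and the names `groups`, `group`, `addrs` and `addr` are all invented for illustration. Step one cuts the input into groups. Step two parses each group on its own, so `to`/`thru` can never run past a group boundary into the next record:

```rebol
text: {name: foo^/mail: a@b^/^/name: bar^/mail: c@d}

; step 1: split the text into groups at each blank line
groups: copy []
parse/all text [
    any [copy group to "^/^/" 2 skip (if group [append groups group])]
    copy group to end (if group [append groups group])
]

; step 2: parse every group separately
addrs: copy []
foreach g groups [
    parse/all g [thru "mail: " copy addr to end (append addrs addr)]
]

probe addrs
```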
Ammon 16-Jun-2005 [224]	Can anyone give me some insight on how to use Brett's visual parse tools?
MichaelB 16-Jun-2005 [225]	Can somebody explain to me why 'parse fails in the first case and returns true in the second case?
	r: [any into ['a (print 'a)]]
	t: [[a][a][a]]
	print parse t r -> a a a false
	r: [any [into ['a (print 'a)]]]
	print parse t r -> a a a true
	In the second 'r(ule) the additional [ ] make it kind of explicit, but shouldn't the first version return true as well? Am I forgetting something about what 'parse "thinks" when looking at the first 'r(ule)? Thanks for hints. :-)
Ladislav 16-Jun-2005 [226]	this really looks like a violation of the principle of least surprise, I suggest you submit it to Rambo
Romano 16-Jun-2005 [227]	MichaelB, It is a parse bug for me.
MichaelB 17-Jun-2005 [228]	As Ladislav suggested, I put it to Rambo. Thanks.
Pekr 22-Jun-2005 [229]	I have CSV file and I have trouble using parse one-liner. The case is, that I export tel. list from Lotus Notes, then I save it in Excel into .csv for rebol to run thru. I wanted to use: foreach line ln-tel-list [append result parse/all line ";"] ... and I expected all lines having 7 elements. However - once last column is missing, that row is incorrect, as rebol parse will not add empty "" at the end. That is imo a bug ...
Gabriele 22-Jun-2005 [230x2]	btw, why /all? shouldn't excel surround elements with spaces with quotes?
Gabriele 22-Jun-2005 [230x2]	anyway, is it really a problem that the last column is missing?
Pekr 22-Jun-2005 [232]	I used csv = semicolon separated values and no quotes in-there ...
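
One possible workaround for the missing last column, sketched under the assumptions Pekr states (fields separated by ";", no quoting) and assuming REBOL 2, where `copy` sets its word to none for an empty match. `split-line` is a hypothetical helper, not an existing function. It uses an explicit rule instead of the one-liner, so an empty field always yields "", including a trailing one:

```rebol
split-line: func [line [string!] /local fields field] [
    fields: copy []
    parse/all line [
        ; every field that is followed by a separator
        any [copy field to ";" skip (append fields any [field copy ""])]
        ; the last field, which may be empty
        copy field to end (append fields any [field copy ""])
    ]
    fields
]

probe split-line "a;b;"
probe split-line "a;;c"
```

With this, the loop from the question would become foreach line ln-tel-list [append result split-line line].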