World: r3wp

[Parse] Discussion of PARSE dialect

Tomc 5-Jun-2005 [194]	rebol []
    ; CamelCase Test
    test-text: "FirstWord test. This is a CamelCase test Text. CamelCase2 is the base idea for a WiKi. CamelcasE"
    upper-case: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    delimiter: charset " .,;|^-^/"
    rest-char: complement union upper-case delimiter
    text: copy ""
    camelcase-rule: [some [upper-case some rest-char upper-case any rest-char] delimiter]
    parse/all/case test-text [
        some [
            copy camelcase-word camelcase-rule (
                if not empty? text [?? text clear text]
                print ["CamelCase word found: " camelcase-word]
            )
            | copy flowtext upper-case (append text flowtext)
            | copy flowtext [any [rest-char | delimiter]] (append text flowtext)
        ]
    ]
    halt
Graham 5-Jun-2005 [195x2]	what about camelCAse?
Graham 5-Jun-2005 [195x2]	Personally I prefer the way mediawiki does it ... using [[ .. ]] ... instead of having strange cases in words
Tomc 5-Jun-2005 [197]	yes, I was also concerned about A1CamelCase but figured Robert just needed to get thru his first question first
Robert 6-Jun-2005 [198x2]	Thanks for the fix. Sometimes it helps to get some distance by asking others :-))
Robert 6-Jun-2005 [198x2]	I like CamelCase words. Simple to remember and use. IIRC camelCAse is not a valid CamelCase word. But anyway, it depends how I teach my users :-))
Graham 6-Jun-2005 [200]	http://en.wikipedia.org/wiki/CamelCase ... CamelCase is referred to as UpperCamelCase, and camelCase is referred to as lowerCamelCase
Robert 6-Jun-2005 [201]	Tom, your example doesn't terminate, like mine. The thing IMO is that the last word is a CamelCase word and the 'end condition is somehow missed. It never reaches the halt.
sqlab 6-Jun-2005 [202]	If you do not want to change the parse rules, you can just add if not flowtext [halt] before append text flowtext
Robert 6-Jun-2005 [203]	I can change the parse rules. This is just a test script, the rule needs to be included in a broader parsing engine. So, it must return TRUE.
Tomc 6-Jun-2005 [204]	Robert, you also have to worry about YaBaDaBaDoCamelCases (even and odd). To get it to return true, figure out what is left when the outermost some finishes: parse ... [some [...] copy remenant to end (print remenant)], then make your rule consume the remnant. Or, if you don't care, just put a to end there.
sqlab 7-Jun-2005 [205x2]	You can either put your parse in a catch [] and throw a true if not flowtext, or something like this:
    parse/all/case test-text [
        some [
            copy camelcase-word [upper-case some rest-chars upper-case any rest-chars] (
                if not empty? text [?? text clear text]
                print ["CamelCase word found:" camelcase-word]
            )
            | copy flowtext [some [rest-chars | upper-case] any delimiters] (append text flowtext)
            | copy flowtext [some delimiters] (append text flowtext)
        ]
        to end
    ]
sqlab 7-Jun-2005 [205x2]	addendum/corrected ] to end (if not empty? text [?? text])
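A minimal sketch of a terminating variant, assuming REBOL 2 parse semantics, where a loop whose body can succeed without consuming input may spin forever at the end of the string. The names found and word are illustrative and do not come from the thread. Every alternative inside the loop consumes at least one character, and the trailing to end lets parse return true.

```rebol
REBOL []

upper-case: charset [#"A" - #"Z"]
delimiter:  charset " .,;|^-^/"
rest-char:  complement union upper-case delimiter

found: copy []    ; hypothetical collector for the matched words

ok: parse/all/case "FirstWord test. This is a CamelCase test" [
    any [
        ; each alternative consumes at least one character,
        ; so the loop cannot repeat on an empty match
        copy word [upper-case some rest-char upper-case any rest-char]
        (append found word)
        | skip
    ]
    to end    ; consume whatever is left so PARSE can return true
]

print ok
probe found
```

Because non-matching input is skipped one character at a time, this sketch can also start a match in the middle of a token (the A1CamelCase concern raised earlier), so a real rule would still need delimiter handling around each word.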
MichaelAppelmans 7-Jun-2005 [207x2]	getting the following error when running Didec's delete email script against a mailbox with large number of emails (250+): internal limit reached: parse Near: [parse data maillist addr-list] Where: parse-mail-list
MichaelAppelmans 7-Jun-2005 [207x2]	is this a rebol internal limit or should I start debugging?
Graham 7-Jun-2005 [209x2]	probably a parse limitation
Graham 7-Jun-2005 [209x2]	I think I've opened up mailboxes with over 400 emails before using Cerebrus' mailbox manager with no problems
MichaelAppelmans 7-Jun-2005 [211]	oh well. thanks :)
Graham 7-Jun-2005 [212]	what you could do, is extract Didier's implementation of the TOP command, and then get the first line of each header in your mailbox. If it has the return-path set to <>, then note it in a list. When finished, go thru and issue deletes on all of those.
MichaelAppelmans 7-Jun-2005 [213]	thanks! I'll have a look at that.
Gabriele 7-Jun-2005 [214]	is the To: line very, very long? there's a recursion limit in the parser for the address list. since you are probably not interested in parsing the To: header, maybe you can disable it in import-email.
Robert 8-Jun-2005 [215x2]	Hmm... my parse still does not terminate the 'some part. I never reach the end. The problem is that the rest of the string is "" and this seems not to be handled.
Robert 8-Jun-2005 [215x2]	Ok, got it. Now it works.
MichaelAppelmans 9-Jun-2005 [217x2]	newbie here: can anyone direct me to a sample of code which matches a pattern over multiple text lines? I need to process a 5MB text file and remove all patterns of multiple consecutive email addresses, e.g. [foo-:-my-:-com]; [foou-:-you-:-net], except the multiple email address string spans multiple lines. Thanks for any pointers
MichaelAppelmans 9-Jun-2005 [217x2]	and the multiple email address string occurs multiple times ;)
Brock 9-Jun-2005 [219x2]	Michael, this is going to be a very general response to your request. Review setting up parse rules and use something like... parse text [any [rule1 | rule2 | rule3]]
Brock 9-Jun-2005 [219x2]	Here's some Parse documentation from the Rebol/Core guide to get you started. Share your progress and questions, maybe a line of sample data or two and maybe I can be of more help.
Brock 10-Jun-2005 [221]	link would help!!! http://www.rebol.com/docs/core23/rebolcore-15.html
MichaelAppelmans 11-Jun-2005 [222]	thanks Brock.. I wound up doing it in Perl, as I'm more familiar with its regex support. The problem always seems to be that I'm in a hurry with crisis response, which is not a good learning environment ;)
Volker 11-Jun-2005 [223]	Sometimes it helps to parse in two steps: a loop for each line-group, and parsing that group separately, because then 'to/'thru work better.
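A rough sketch of the two-step idea, assuming REBOL 2 and a made-up input in which records are separated by a blank line. The sample text, the "mail: " label and the names groups, group, mails and mail are all hypothetical; the actual file layout is not given in the thread.

```rebol
REBOL []

; Hypothetical sample: two records separated by a blank line
text: "name: foo^/mail: a@b^/^/name: bar^/mail: c@d^/"

; Step 1: split the text into groups at each blank line
groups: copy []
parse/all text [
    any [copy group to "^/^/" 2 skip (if group [append groups group])]
    copy group to end (if group [append groups group])
]

; Step 2: parse each group on its own, so TO and THRU
; cannot run past the end of the current record
mails: copy []
foreach group groups [
    parse/all group [
        thru "mail: "
        copy mail [to "^/" | to end]
        (if mail [append mails mail])
        to end
    ]
]

probe mails
```

The point of the split is that a failed thru in one group only affects that group, instead of scanning ahead into the next record of a large file.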
Ammon 16-Jun-2005 [224]	Can anyone give me some insight on how to use Brett's visual parse tools?
MichaelB 16-Jun-2005 [225]	Can somebody explain to me why 'parse fails in the first case and returns true in the second case?
    r: [any into ['a (print 'a)]]
    t: [[a][a][a]]
    print parse t r
    -> a a a false
    r: [any [into ['a (print 'a)]]]
    print parse t r
    -> a a a true
    In the second 'r(ule) the additional [ ] make it kind of explicit, but shouldn't the first version return true as well? Am I forgetting something about what 'parse "thinks" when looking at the first 'r(ules)? Thanks for hints. :-)
Ladislav 16-Jun-2005 [226]	this really looks like violating the principle of the least surprise, I suggest you to submit it to Rambo
Romano 16-Jun-2005 [227]	MichaelB, It is a parse bug for me.
MichaelB 17-Jun-2005 [228]	As Ladislav suggested, I put it to Rambo. Thanks.
Pekr 22-Jun-2005 [229]	I have a CSV file and I have trouble using a parse one-liner. The case is that I export a tel. list from Lotus Notes, then I save it in Excel into .csv for rebol to run thru. I wanted to use: foreach line ln-tel-list [append result parse/all line ";"] ... and I expected all lines to have 7 elements. However, once the last column is missing, that row is incorrect, as rebol parse will not add an empty "" at the end. That is imo a bug ...
Gabriele 22-Jun-2005 [230x2]	btw, why /all? shouldn't excel surround elements with spaces with quotes?
Gabriele 22-Jun-2005 [230x2]	anyway, is it really a problem that the last column is missing?
Pekr 22-Jun-2005 [232x3]	I used csv = semicolon separated values and no quotes in-there ...
Pekr 22-Jun-2005 [232x3]	yes, it is, as I expect all lines having 7 elements ... once there is not such an element, I can't loop thru result ... well, one condition will probably solve it, but imo it is a gug .... rebol identifies ;; and puts "" in there, but csv, at the end, will use "value;", and rebol does not count that ...
Pekr 22-Jun-2005 [232x3]	gug = bug :-)
Gabriele 22-Jun-2005 [235]	hmm, either use append/only, or as you say add it manually. submit this example to rambo if you think it's a bug.
Pekr 22-Jun-2005 [236x2]	append/only will not help, result of parse will vary, and it should not ...
Pekr 22-Jun-2005 [236x2]	I will put that in RAMBO then ...
Gabriele 22-Jun-2005 [238]	append/only will, because pick returns none if a column is not present, and set works with that too
Pekr 22-Jun-2005 [239]	but I like to use flat structure and foreach [real name of vars here], so I need consistent record length :-)
Gabriele 22-Jun-2005 [240]	i do too usually, just suggesting an alternative :-)
Pekr 22-Jun-2005 [241]	hmm, if (length? tmp) <> 7 [append tmp ""] will hopefully help :-)
Allen 22-Jun-2005 [242]	Pekr for now, just add an extra "," as you parse each row. That will give you a consistent length with the current behaviour
Pekr 22-Jun-2005 [243]	oh no, I am at my wits' end ... so bye bye beautiful one-liners ... I just found an item which contains a set of quotes :-) rebol will not translate that and my block is confused once again :-)
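A minimal sketch of the padding workaround discussed above, assuming REBOL 2, where parse/all line ";" splits on semicolons and, as reported in the thread, does not emit a trailing empty field. The helper name split-row, its width argument and the sample lines are hypothetical.

```rebol
REBOL []

; Hypothetical helper: split one semicolon-separated line
; and pad the result with empty strings up to WIDTH fields
split-row: func [line [string!] width [integer!] /local row] [
    row: parse/all line ";"
    ; pad short rows so every record has the same length
    while [width > length? row] [append row copy ""]
    row
]

result: copy []
foreach line ["a;b;c" "a;b;"] [
    append result split-row line 3
]

probe result
```

This keeps the flat block usable with foreach [a b c] result [...]. It does nothing about fields that contain quotes, which is the case that breaks the simple split in the last message, so such data would need a dedicated rule instead of the string-splitting form of parse.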