World: r3wp

[Parse] Discussion of PARSE dialect

Tomc
5-Jun-2005
[194]
rebol []

; CamelCase Test


test-text: "FirstWord test. This is a CamelCase test Text. CamelCase2 is the base idea for a WiKi. CamelcasE"

upper-case: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
delimiter:	charset " .,;|^-^/"
rest-char: complement union upper-case delimiter

text: copy ""


camelcase-rule: [some [upper-case some rest-char upper-case any rest-char] 
delimiter]

parse/all/case test-text[
	some [ 
			copy camelcase-word  camelcase-rule
				(if not empty? text [?? text clear text]
		 		print ["CamelCase word found: " camelcase-word]
				)
			| 
			copy flowtext upper-case 
				(append text flowtext)
			|
			copy flowtext[any [rest-char | delimiter]] 
				(append text flowtext)
	]
]
halt
Graham
5-Jun-2005
[195x2]
what about camelCAse?
Personally I prefer the way mediawiki does it ... using [[ .. ]] 
... instead of having strange cases in words
Tomc
5-Jun-2005
[197]
yes, I was also concerned about  A1CamelCase  but figured Robert 
 just needed to get thru his first question first
Robert
6-Jun-2005
[198x2]
Thanks, for the fix. Sometimes it helps to get some distance by asking 
others :-))
I like CamelCase words. Simple to remember and use. IIRC camelCAse 
is not a valid CamelCase word. But anyway, it depends how I teach 
my users :-))
Graham
6-Jun-2005
[200]
http://en.wikipedia.org/wiki/CamelCase ... CamelCase is referred 
to as UpperCamelCase, and camelCase is referred to as lowerCamelCase
Robert
6-Jun-2005
[201]
Tom, your example doesn't terminate, like mine. The thing IMO is 
that the last word is a CamelCase word and the 'end condition is 
somehow missed. It never reaches the halt.
sqlab
6-Jun-2005
[202]
If you do not want to change the parse rules, you can just add
	if not flowtext [halt]
before
	append text flowtext
Robert
6-Jun-2005
[203]
I can change the parse rules. This is just a test script, the rule 
needs to be included in a broader parsing engine. So, it must return 
TRUE.
Tomc
6-Jun-2005
[204]
Robert, you also have to worry about YaBaDaBaDoCamelCases (even and 
odd).

To get it to return true, figure out what is left when the outermost 
some finishes.
parse ...[
	some [
		...
	]
	copy remnant to end (print remnant)
]


Then make your rule consume the remnant. If you don't care, just put a
	to end
there.
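A minimal sketch of the leftover-input point above, assuming REBOL 2 semantics. The digits charset and the sample string are made up for illustration and are not from Robert's script:

```rebol
rebol []

; PARSE returns true only when the rule consumes the input up to its end.
digits: charset "0123456789"

; Fails: "abc" is still unconsumed when the rule finishes.
print parse/all "123abc" [some digits]

; Succeeds: TO END skips whatever is left.
print parse/all "123abc" [some digits to end]

; Capture the leftover to see what the rule did not consume.
parse/all "123abc" [some digits copy remnant to end (print remnant)]
```

Printing the remnant first is a quick way to find out which part of the input the main rule still needs to handle before falling back on a bare to end.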
sqlab
7-Jun-2005
[205x2]
You can either put your parse in a catch [] and throw a true if not 
flowtext 
or something like this
parse/all/case test-text [
	some [

   copy camelcase-word [upper-case some rest-chars upper-case any rest-chars] 
   (
		 	if not empty? text [?? text clear text]
		 	print ["CamelCase word found:" camelcase-word]
		)

  | copy flowtext [some [rest-chars | upper-case] any delimiters] (
			append text flowtext
		)
		| copy flowtext [some delimiters] (
			append text flowtext
		)
	] to end
]
addendum/corrected
	] to end (if not empty? text [?? text])
MichaelAppelmans
7-Jun-2005
[207x2]
getting the following error  when running Didec's delete email script 
against a mailbox with large number of emails (250+):internal limit 
reached: parse
Near: [parse data maillist
   addr-list]
Where: parse-mail-list
is this a rebol internal limit or should i start debugging?
Graham
7-Jun-2005
[209x2]
probably a parse limitation
I think I've opened up mailboxes with over 400 emails before using 
Cerebrus' mailbox manager with no problems
MichaelAppelmans
7-Jun-2005
[211]
oh well. thanks :)
Graham
7-Jun-2005
[212]
what you could do, is extract Didier's implementation of the TOP 
command, and then get the first line of each header in your mailbox. 
 If it has the return-path set to <>, then note it in a list.  When 
finished, go thru and issue deletes on all of those.
MichaelAppelmans
7-Jun-2005
[213]
thanks! I'll have a look at that.
Gabriele
7-Jun-2005
[214]
is the To: line very, very long? there's a recursion limit in the 
parser for the address list. since you are probably not interested 
in parsing the To: header, maybe you can disable it in import-email.
Robert
8-Jun-2005
[215x2]
Hmm... my parse still does not terminate the 'some part. I never reach 
the end. The problem is that the rest of the string is "" and this 
seems not to be handled.
Ok, got it. Now it works.
MichaelAppelmans
9-Jun-2005
[217x2]
Newbie here: can anyone direct me to a sample of code which matches 
a pattern over multiple text lines? I need to process a 5MB text 
file and remove all patterns of multiple consecutive email addresses, 
e.g. [foo-:-my-:-com]; [foou-:-you-:-net], except the multiple email address 
string spans multiple lines. Thanks for any pointers
and the multiple email address string occurs multiple times ;)
Brock
9-Jun-2005
[219x2]
Michael, this is going to be a very general response to your request. 
 Review setting up parse rules and use something like...
   parse text [any [rule1 | rule2 | rule3]]
Here's some Parse documentation from the Rebol/Core guide to get 
you started.  Share your progress and questions, maybe a line of 
sample data or two and maybe I can be of more help.
Brock
10-Jun-2005
[221]
link would help!!!   http://www.rebol.com/docs/core23/rebolcore-15.html
MichaelAppelmans
11-Jun-2005
[222]
thanks Brock.. I wound up doing it in Perl, as I'm more familiar 
with its regex support. The problem always seems to be that I'm in a hurry 
with crisis response, which is not a good learning environment ;)
Volker
11-Jun-2005
[223]
Sometimes it helps to parse in two steps: a loop for each line-group, 
and parsing that group separately, because then 'to/'thru work better.
Ammon
16-Jun-2005
[224]
Can anyone give me some insight on how to use Brett's visual parse 
tools?
MichaelB
16-Jun-2005
[225]
Can somebody explain to me why 'parse fails in the first case and returns 
true in the second case?

r: [any into ['a (print 'a)]]
t: [[a][a][a]]
print parse t r 
->
a
a
a
false

r: [any [into ['a (print 'a)]]]
print parse t r 
->
a
a
a
true 


In the second 'r(ule) the additional [ ] make it kind of explicit, 
but shouldn't the first version return true as well? Am I forgetting 
something about what 'parse "thinks" when looking at the first 'r(ule)? 
Thanks for hints. :-)
Ladislav
16-Jun-2005
[226]
this really looks like it violates the principle of least surprise, 
I suggest you submit it to RAMBO
Romano
16-Jun-2005
[227]
MichaelB, It is a parse bug for me.
MichaelB
17-Jun-2005
[228]
As Ladislav suggested, I put it to Rambo. Thanks.
Pekr
22-Jun-2005
[229]
I have CSV file and I have trouble using parse one-liner. The case 
is, that I export tel. list from Lotus Notes, then I save it in Excel 
into .csv for rebol to run thru. I wanted to use:

foreach line ln-tel-list [append result parse/all line ";"]


... and I expected all lines having 7 elements. However - once last 
column is missing, that row is incorrect, as rebol parse will not 
add empty "" at the end. That is imo a bug ...
Gabriele
22-Jun-2005
[230x2]
btw, why /all? shouldn't excel surround elements with spaces with 
quotes?
anyway, is it really a problem that the last column is missing?
Pekr
22-Jun-2005
[232x3]
I used csv = semicolon separated values and no quotes in-there ...
yes, it is, as I expect all lines having 7 elements ... once there 
is not such an element, I can't loop thru result ... well, one condition 
will probably solve it, but imo it is a gug .... rebol identifies 
;; and puts "" inthere, but csv, at the end, will use "value;", and 
rebol does not count that ...
gug = bug :-)
Gabriele
22-Jun-2005
[235]
hmm, either use append/only, or as you say add it manually. submit 
this example to rambo if you think it's a bug.
Pekr
22-Jun-2005
[236x2]
append/only will not help, result of parse will vary, and it should 
not ...
I will put that in RAMBO then ...
Gabriele
22-Jun-2005
[238]
append/only will, because pick returns none if a column is not present, 
and set works with that too
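A rough sketch of the append/only alternative, assuming REBOL 2 behaviour. The sample rows and the word names first-col, second-col and third-col are hypothetical:

```rebol
rebol []

; Hypothetical sample rows; the second one lacks its last column.
lines: ["a;b;c" "d;e;"]

rows: copy []
foreach line lines [
    ; append/only keeps each parsed row as its own sub-block
    append/only rows parse/all line ";"
]

foreach row rows [
    ; SET with a block of words leaves trailing words at none
    ; when the row is shorter than the word list
    set [first-col second-col third-col] row
    print [first-col second-col mold third-col]
]
```

The trade-off is a nested block of rows instead of one flat block, so a flat foreach over fixed-width records no longer applies.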
Pekr
22-Jun-2005
[239]
but I like to use flat structure and foreach [real name of vars here], 
so I need consistent record length :-)
Gabriele
22-Jun-2005
[240]
i do too usually, just suggesting an alternative :-)
Pekr
22-Jun-2005
[241]
hmm, if (length? tmp) <> 7 [append tmp ""] will hopefully help :-)
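The padding idea can be sketched as below, assuming REBOL 2 and the trailing-column behaviour described earlier in this thread. The width of 3 and the sample lines are stand-ins for the real 7-column export:

```rebol
rebol []

; Hypothetical data: 3 columns instead of the 7 in the real file.
width: 3
lines: ["a;b;c" "d;e;"]

result: copy []
foreach line lines [
    fields: parse/all line ";"
    ; pad short rows so every record ends up with WIDTH fields
    while [(length? fields) < width] [append fields copy ""]
    append result fields
]

; RESULT stays flat, so fixed-width iteration works
foreach [c1 c2 c3] result [print [mold c1 mold c2 mold c3]]
```

Using while rather than a single if also covers rows that are missing more than one trailing column.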
Allen
22-Jun-2005
[242]
Pekr for now, just add an extra "," as you parse each row. That will 
give you a consistent length with the current behaviour
Pekr
22-Jun-2005
[243]
oh no, I am at the ends ... so bye bye beautiful oneliners ... I 
just found an item which contains a set of quotes :-) rebol will not translate 
that and my block is confused once again :-)