World: r3wp


[Parse] Discussion of PARSE dialect

amacleod 22-Feb-2009 [3601]	never mind I see my prob...
MaxV 20-Mar-2009 [3602]	Hello everybody! I have a problem. I need to extract email addresses from a big text like bla bla [me-:-demo-:-com] bla bla ... <[you-:-example-:-org]> etc. [he-:-italy-:-it] Is it possible to obtain a text with all the addresses without the "<" and ">"?
Pekr 20-Mar-2009 [3603]	I am not sure I understand what you are up to ....
Maxim 20-Mar-2009 [3604]	do you want both emails within the <> and those without?
Geomol 20-Mar-2009 [3605]	>> str: "bla bla [me-:-demo-:-com] bla bla ... <[you-:-example-:-org]> etc. [he-:-italy-:-it]"
	>> foreach w parse str none [if find e: to-email load w "@" [print e]]
	[me-:-demo-:-com]
	[you-:-example-:-org]
	[he-:-italy-:-it]
	or something.
Pekr 20-Mar-2009 [3606x3]	eh, nice :-)
	Here's an absolutely terrible parser - it does NOT follow the RFC; it allows any combination of alphanumeric chars, dots and hyphens, then one @ char, then the same again up to the next space char ...
	space: #" "
	mailchar: charset [#"0" - #"9" #"A" - #"Z" #"a" - #"z" ".-"]
	at-char: #"@"
	email: [space start: some mailchar at-char some mailchar end: space (print copy/part start end)]
	str: "afadfa adfa asdfasdfa fd [asdfas-:-adfadf-:-adfa-adfadfsda-:-com] adfafaf a af"
	parse/all str [any [email | skip]]
	That eliminates email addresses inside of < >, but maybe that was not the intention?
btiffin 20-Mar-2009 [3609]	It would be nice if REBOL could LOAD foreign! data. :) Hint hint wink wink. And being here in a public REBOL forum I might get in trouble for suggesting this one. $ grep -o -E '\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b' files...
Pekr 20-Mar-2009 [3610]	Brian ... your post is broken ... it contains some strange binary fragments :-)
Geomol 20-Mar-2009 [3611]	Brian, you can probably do that grep with a few CHARSET and PARSE in REBOL.
btiffin 20-Mar-2009 [3612]	And actually I think it's wrong anyway ... as it should be. Posting regex in a REBOL forum ... shame on me. ;)
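Geomol's suggestion of doing the grep with a few CHARSET and PARSE rules could look roughly like the sketch below. It assumes REBOL 2 (parse/all, as used elsewhere in this thread), it only loosely mirrors the character classes of the regex and does not check the top-level domain, and the sample addresses and the names local-char, domain-char and found are made up for illustration because the archive obfuscates the real ones.

```rebol
REBOL [Title: "Email extraction sketch (not RFC compliant)"]

; Character classes loosely mirroring the grep pattern quoted above.
alnum: charset [#"0" - #"9" #"A" - #"Z" #"a" - #"z"]
local-char: union alnum charset "._%+-"
domain-char: union alnum charset ".-"

found: copy []

; One address: local part, "@", domain part. COPY captures the matched text.
email: [
    copy addr [some local-char "@" some domain-char]
    (append found addr)
]

; Hypothetical sample text.
str: "bla bla me@demo.com bla <you@example.org> etc. he@italy.it"

; Try EMAIL at every position; otherwise skip one character.
parse/all str [any [email | skip]]
probe found   ; ["me@demo.com" "you@example.org" "he@italy.it"]
```

Because "<" and ">" are not in either charset, the brackets drop out without special handling. A full stop directly after an address would be captured as part of the domain, so real input probably needs a stricter domain rule.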
MaxV 23-Mar-2009 [3613]	Thank you, I'll try Pekr's solution. I don't need the "<" and ">" characters. However, where can I find some good parse documentation?
Brock 23-Mar-2009 [3614]	Rebol Parse documentation: http://www.rebol.com/docs/core23/rebolcore-15.html
Chris 23-Mar-2009 [3615]	http://www.codeconscious.com/rebol/parse-tutorial.html
swall 27-Mar-2009 [3616]	I'm having trouble parsing the "none" datatype from within blocks. The following example illustrates my problem (hopefully):
	junk: [none [1 2 [3 4]]]
	parse/all junk [none (print ["nothing"]) text: (print ["text:" mold text]) set b block! (print ["block:" mold b])]
	This produces the following output:
	nothing
	text: [none [1 2 [3 4]]]
	== false
	Notice that the block doesn't get parsed. It seems that parse ignores "none" tokens rather than extracting them from the input stream. If I put a number in place of none and parse for "number!", then the block does indeed get parsed. Is this a bug or an oversight? Or am I just confused?
Izkata 27-Mar-2009 [3617]	'none isn't a datatype - none! is:
	>> parse/all junk [none! (print ["nothing"]) text: (print ["text:" mold text]) set b block! (print ["block:" mold b])]
	nothing
	text: [[1 2 [3 4]]]
	block: [1 2 [3 4]]
	== true
swall 27-Mar-2009 [3618x2]	I tried that but it doesn't seem to work. I'm getting nothing but 'false being returned.
swall 27-Mar-2009 [3618x2]	Correction, I tried it in my actual program, rather than the test stub, and it seems to work fine. Thanks.
Steeve 27-Mar-2009 [3620]	The difference with your program is that [none] does not contain the none value but the word none. If you reduce your example, it may work: junk: reduce [none [1 2 [3 4]]]
Izkata 27-Mar-2009 [3621]	Ah, forgot to copy that part - I'd done "junk/1: none" to make sure it was a none value
swall 27-Mar-2009 [3622]	Steeve: that seems to have done it. thanks for clarifying.
Gabriele 28-Mar-2009 [3623]	or use #[none] instead
Pavel 29-Mar-2009 [3624]	Gabriele, what does #[none] really do/mean? I've seen it a few times without having a clue about its functionality.
Henrik 29-Mar-2009 [3625x2]	Pavel, try: mold/all none
Henrik 29-Mar-2009 [3625x2]	it's just a serialized version of none!, so you can load it as a real none value instead of a word.
[unknown: 5] 29-Mar-2009 [3627]	Pavel, this also works with datatypes. For example: >> mold/all string! == "#[datatype! string!]" This is useful if you're loading values from a file. This way you're sure to set a value to the string! datatype when desired.
Gabriele 31-Mar-2009 [3628]	#[none] is the value of the word 'none. It is the literal representation of the value of type none!.
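The distinction discussed above (the word none versus the value of type none!) can be checked with a short sketch. It assumes REBOL 2; junk2 and junk3 are illustrative names that do not appear in the thread.

```rebol
REBOL [Title: "none word vs. none value"]

; A literal block holds the WORD none; nothing has been evaluated.
junk: [none [1 2 [3 4]]]
probe type? first junk     ; word!

; REDUCE evaluates the block, so the first item becomes the none VALUE.
junk2: reduce [none [1 2 [3 4]]]
probe type? first junk2    ; none!

; The serialized form #[none] loads directly as the value.
junk3: [#[none] [1 2 [3 4]]]
probe type? first junk3    ; none!

; In block parsing, the datatype none! matches the value, not the word.
probe parse junk  [none! block!]   ; false
probe parse junk2 [none! block!]   ; true
probe parse junk3 [none! block!]   ; true

; The word itself can be matched with a lit-word.
probe parse junk ['none block!]    ; true
```

This is also why swall's first rule printed "nothing" without consuming any input: in the parse dialect the bare word none is a keyword that always succeeds and does not advance the input.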
Pavel 31-Mar-2009 [3629]	THX for description to all
Janko 15-Apr-2009 [3630]	Hi, I have one question .. can you somehow break out of some loop by rebol code .. for example parse [ aa zzz cc ] [ some [ set W word! ( ?? W if equal? W 'zzz [ break ] ) ] ] ... that break doesn't work that way, but is there some way to do this? I need to compare W with a runtime value
Graham 15-Apr-2009 [3631]	throw an error?
Janko 15-Apr-2009 [3632]	I solved it in a way that lets me just return out of the whole function (with return) at that point, so it's OK. At first I had it thought out in a way where I would need to exit the some [ ] loop but continue parsing; an error probably wouldn't work that way either? This is my code now:
	match: func [ data rules ] [
	    parse rules [
	        SOME [
	            set L lit-word! ( either equal? L reduce first data [ data: next data ] [ return false ] )
	            | set W word! ( set :W first data data: next data )
	        ]
	    ]
	]
Ammon 16-Apr-2009 [3633]	; Here's one way to do it...
	>> digit: charset "1234567890"
	== make bitset! #{
	000000000000FF03000000000000000000000000000000000000000000000000
	}
	>> rule: [s: some digit e: (print copy/part s e) | h: #"a" (h: tail h) :h | skip]
	== [s: some digit e: (print copy/part s e) | h: #"a" (h: tail h) :h | skip]
	>> parse "12b34c56a78" [any rule]
	12
	34
	56
	== true
Dockimbel 16-Apr-2009 [3634]	Another possible way is by setting at runtime a [break] rule : branch-rule: [ ] parse [ aa zzz cc ] [ some [ set W word! ( ?? W if equal? W 'zzz [ branch-rule: [ break ] ] ) branch-rule ] ]
Janko 16-Apr-2009 [3635]	Ah, thanks Ammon and Dockimbel! I hadn't thought of these two ways (well, I don't yet fully understand Ammon's)
shadwolf 16-Apr-2009 [3636x5]	charset creates a "mask" in bitset form to be compared to the current item read from the string
	some digit, since digit is a bitset containing the binary image of what you are looking for (number chars from 1 to
	that means each character of the string will be compared to the mask, and if it matches then you proceed to the calculation
	the lame equivalent would be something like foreach a string [ either find "1234567890" a [ append e a ] [ probe e clear e ] ]
	so Ammon's solution using charset / bitset and parse is the totally rebolish way
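A runnable version of the loop-based equivalent described above might look like this. It is only a sketch, assuming REBOL 2, where FIND on a bitset returns true or none; the names runs and e are illustrative. Unlike Ammon's rule, it does not stop at the "a".

```rebol
REBOL [Title: "Digit runs without PARSE"]

digit: charset "1234567890"

probe find digit #"7"    ; true
probe find digit #"a"    ; none

runs: copy []            ; collected runs of digits
e: copy ""               ; run currently being built

foreach ch "12b34c56a78" [
    either find digit ch [
        append e ch
    ] [
        ; a non-digit ends the current run, if there is one
        if not empty? e [append runs copy e]
        clear e
    ]
]
; flush the last run when the input ends on a digit
if not empty? e [append runs copy e]

probe runs               ; ["12" "34" "56" "78"]
```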
[unknown: 5] 16-Apr-2009 [3641]	parse [aa zzz cc][some [set w word! (?? w cont: if w = 'zzz [[end skip]]) cont]]
Ammon 17-Apr-2009 [3642x2]	Essentially what I'm doing with the above code is simply skipping to the end of the parse input when a given rule is matched. This works because a get-word in the parse rules sets the current parse input. The get-word can be any value of the same type as the original parse input. You can't set the parse input to a string! if a block! was provided to parse to start with.
Ammon 17-Apr-2009 [3642x2]	Using your code to do the same thing...
	match: func [ data rules ] [
	    parse rules [
	        SOME [
	            set L lit-word! blk: ( either equal? L reduce first data [ data: next data ] [ blk: tail blk ] ) :blk
	            | set W word! ( set :W first data data: next data )
	        ]
	    ]
	]
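Applied to Janko's original question, the get-word technique Ammon describes can be sketched as follows. This assumes REBOL 2; stop-word, seen and here are illustrative names, and stop-word stands in for the runtime value Janko wants to compare against.

```rebol
REBOL [Title: "Stopping a PARSE loop from code"]

stop-word: 'zzz      ; runtime value to compare against
seen: copy []        ; words visited before the loop stopped

result: parse [aa zzz cc] [
    some [
        set w word! here:
        (
            append seen w
            ; move the saved position to the tail, so the next
            ; iteration of SOME finds nothing left to match
            if equal? w stop-word [here: tail here]
        )
        :here        ; continue parsing from the (possibly moved) position
    ]
]

probe seen           ; [aa zzz]
probe result         ; true
```

Note that parse returns true here even though cc was never matched by a rule, because the input position ends up at the tail. The [end skip] variant posted by [unknown: 5] stops the loop as well, but makes the whole parse fail instead.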
Graham 23-Apr-2009 [3644]	I'd like to take an English sentence and tidy it up. I want to automatically apply English grammar to it ... so capitalize the first letter after a period, and remove extraneous spaces, e.g. a space before a comma. Has anyone done anything like this with 'parse?
Ammon 24-Apr-2009 [3645]	Not yet but I've been thinking about it for quite a while now... I think I have a pretty good idea what the parse rules should look like but I haven't written any code for it yet.
Steeve 24-Apr-2009 [3646]	Good start...
	letter: charset [#"a" - #"z" #"A" - #"Z"]
	dirt: complement letter
	word: [some letter]
	clean: [here: dirt :here (remove here)]
	space: [here: (insert here #" ") skip]
	capital: [here: letter (uppercase/part here 1)]
	sentence: [
	    some [ capital opt word break | clean ]
	    any [
	        [#";" | #","] any clean space word
	        | #"." any clean space capital opt word
	        | #" " word
	        | clean
	    ]
	]
	parse/all text: {test test . test;; test ..test } sentence
	probe text
	>>"Test test. Test; test. Test"
Janko 24-Apr-2009 [3647x2]	I once made auto-capitalising of first words for some bot. It wasn't anything special; I can find the code and send it to you
Janko 24-Apr-2009 [3647x2]	ah, Steeve's already works
Steeve 24-Apr-2009 [3649]	It has to be enhanced, indeed
Graham 24-Apr-2009 [3650]	Hey, nice start ...