World: r3wp
[Parse] Discussion of PARSE dialect
swall 27-Mar-2009 [3622] | Steeve: that seems to have done it. thanks for clarifying. |
Gabriele 28-Mar-2009 [3623] | or use #[none] instead |
Pavel 29-Mar-2009 [3624] | Gabriele, what does #[none] really do/mean? I've seen it a few times without having a clue about its functionality. |
Henrik 29-Mar-2009 [3625x2] | Pavel, try: mold/all none |
it's just a serialized version of none!, so you can load it as a real none value instead of a word. | |
[unknown: 5] 29-Mar-2009 [3627] | Pavel, this also works with datatypes. For example: >> mold/all string! == "#[datatype! string!]" This is useful if you're loading values from a file. This way you're sure to set a value to a string datatype! when desired. |
Gabriele 31-Mar-2009 [3628] | #[none] is the value of the word 'none. It is the literal representation of the value of type none!. |
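A small console sketch of the difference Henrik and Gabriele describe, assuming a REBOL 2 interpreter; the variable name `data` is only for illustration. The plain text `none` loads as a word, while `#[none]` loads as the none! value itself.

```rebol
probe mold none          ; "none"     (same text as the word)
probe mold/all none      ; "#[none]"  (serialized form of the value)

; Load both spellings from a string and compare their types.
data: load "[none #[none]]"
probe type? first data   ; word!
probe type? second data  ; none!
```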
Pavel 31-Mar-2009 [3629] | THX for description to all |
Janko 15-Apr-2009 [3630] | Hi, I have one question .. can you somehow break out of some loop by rebol code .. for example parse [ aa zzz cc ] [ some [ set W word! ( ?? W if equal? W 'zzz [ break ] ) ] ] ... that break doesn't work that way, but is there some way to do this? I need to compare W with a runtime value |
Graham 15-Apr-2009 [3631] | throw an error? |
Janko 15-Apr-2009 [3632] | I solved it so that I can just return out of the whole function (with return) at that point, so it's ok .. at first I had it thought out in a way where I would need to exit the some [ ] loop but continue parsing .. an error probably wouldn't work that way either? This is my code now.. match: func [ data rules ] [ parse rules [ SOME [ set L lit-word! ( either equal? L reduce first data [ data: next data ] [ return false ] ) | set W word! ( set :W first data data: next data ) ] ] ] |
Ammon 16-Apr-2009 [3633] | ; Here's one way to do it... >> digit: charset "1234567890" == make bitset! #{ 000000000000FF03000000000000000000000000000000000000000000000000 } >> rule: [s: some digit e: (print copy/part s e) | h: #"a" (h: tail h) :h | skip ] == [s: some digit e: (print copy/part s e) | h: #"a" (h: tail h) :h | skip] >> parse "12b34c56a78" [any rule] 12 34 56 == true |
Dockimbel 16-Apr-2009 [3634] | Another possible way is by setting at runtime a [break] rule : branch-rule: [ ] parse [ aa zzz cc ] [ some [ set W word! ( ?? W if equal? W 'zzz [ branch-rule: [ break ] ] ) branch-rule ] ] |
Janko 16-Apr-2009 [3635] | Ah, thanks Ammon and Dockimbel! I hadn't thought of these two ways (well, I don't yet fully understand Ammon's) |
shadwolf 16-Apr-2009 [3636x5] | charset creates a "mask" in bitset form to be compared to the current item read from the string |
some digit since digit is a bitset containing the binary image of what you're looking for (numbers char from 1 to | |
that means each character of the string will be compared to the mask and if it matches then you proceed to the calculation | |
the lame equivalent would be something like foreach a string [ either find? "1234567890" a [ append e a ][probe e clear e ] ] | |
so the ammon solution using charset / bitset and parse is the totally rebolish way | |
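A minimal sketch of the "mask" idea, assuming REBOL 2 behaviour where `find` on a bitset returns true or none. The name `digit` follows Ammon's example; the sample strings are made up for illustration.

```rebol
digit: charset "0123456789"

; Membership test against the bitset.
probe find digit #"7"            ; true
probe find digit #"b"            ; none

; Used as a parse rule, the bitset matches exactly one character from the set.
probe parse "2009" [some digit]  ; true
probe parse "20b9" [some digit]  ; false
```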
[unknown: 5] 16-Apr-2009 [3641] | parse [aa zzz cc][some [set w word! (?? w cont: if w = 'zzz [[end skip]]) cont]] |
Ammon 17-Apr-2009 [3642x2] | Essentially what I'm doing with the above code is simply skipping to the end of the parse input when a given rule is matched. This works because a get-word in the parse rules sets the current parse input. The get-word can be any value of the same type as the original parse input. You can't set the parse input to a string! if a block! was provided to parse to start with. |
Using your code to do the same thing... match: func [ data rules ] [ parse rules [ SOME [ set L lit-word! blk: ( either equal? L reduce first data [ data: next data ] [ blk: tail blk ] ) :blk | set W word! ( set :W first data data: next data ) ] ] ] | |
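The get-word trick can be tried on Janko's original block input. This is only a sketch under the assumption of REBOL 2 parse semantics; `stop-word`, `seen` and `pos` are hypothetical names introduced for illustration.

```rebol
stop-word: 'zzz          ; runtime value to compare against
seen: copy []            ; collects the words visited before stopping

result: parse [aa zzz cc] [
    some [
        set w word! pos: (
            append seen w
            ; On a match, move the saved position to the tail of the input.
            if equal? w stop-word [pos: tail pos]
        ) :pos           ; the get-word resets the parse position to pos
    ]
]

probe seen     ; [aa zzz]
probe result   ; true
```

Because the position is moved to the tail, parse reports success; returning from the enclosing function, as in Janko's version, is the way to signal failure instead.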
Graham 23-Apr-2009 [3644] | I'd like to take an English sentence and tidy it up. I want to automatically apply English grammar to it ... so capitalize the first letter after a period, and remove extraneous spaces, e.g. a space before a comma. Has anyone done anything like this with 'parse? |
Ammon 24-Apr-2009 [3645] | Not yet but I've been thinking about it for quite a while now... I think I have a pretty good idea what the parse rules should look like but I haven't written any code for it yet. |
Steeve 24-Apr-2009 [3646] | Good start... letter: charset [#"a" - #"z" #"A" - #"Z"] dirt: complement letter word: [some letter] clean: [here: dirt :here (remove here)] space: [here: (insert here #" ") skip] capital: [here: letter (uppercase/part here 1)] sentence: [ some [ capital opt word break | clean ] any [ [#";" | #","] any clean space word | #"." any clean space capital opt word | #" " word | clean ] ] parse/all text: {test test . test;; test ..test } sentence probe text >>"Test test. Test; test. Test" |
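A reduced sketch of just the capitalising part of Steeve's approach: mark the position with a set-word, match a letter, then uppercase in place. It does not remove anything, so the input length never changes during the parse. The sample text is made up; `letter` and `here` follow Steeve's names.

```rebol
letter: charset [#"a" - #"z" #"A" - #"Z"]
text: copy "one. two. three"

parse/all text [
    ; Capitalise the very first letter.
    here: letter (uppercase/part here 1)
    any [
        ; After a period and a space, capitalise the next letter.
        ". " here: letter (uppercase/part here 1)
        | skip
    ]
]

probe text   ; "One. Two. Three"
```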
Janko 24-Apr-2009 [3647x2] | I have made auto capitalising first words for some bot once .. it wasn't anything special , I can find the code and send it to you |
ah, Steeve's already works | |
Steeve 24-Apr-2009 [3649] | It has to be enhanced, indeed |
Graham 24-Apr-2009 [3650] | Hey, nice start ... |
Steeve 24-Apr-2009 [3651] | indeed, i'm nice |
Graham 24-Apr-2009 [3652x2] | :) |
have to add #"'" ie. ' to the letter charset | |
Steeve 24-Apr-2009 [3654x2] | #"-" too, and what about the numbers? |
for #"'" you should add a rule to remove spaces | |
Janko 24-Apr-2009 [3656] | Mine was meant so I could make pretty texts out of all-upper-case ones in some search engine.. maybe it doesn't work that great in all cases.. smart-uc-after: func [ str sep ] [ parse str [ ANY [ thru sep mark: ( uppercase/part trim mark 1 insert mark " " ) :mark ] ] str ] smart-case: func [ str ] [ calc-with X [ [ lowercase str ] [ uppercase/part X 1 ] [ smart-uc-after X "." ] [ smart-uc-after X "?" ] [ smart-uc-after X "!" ] ]] >> smart-case "HI HOW ARE YOU! we will go. bye!" == "Hi how are you! We will go. Bye! " |
Graham 24-Apr-2009 [3657] | numbers aren't usually part of words. Unless it's trademark like 3M |
Janko 24-Apr-2009 [3658x2] | but mine is also worse because it does 3 parses instead of one like Steeve |
calc-with: func [ 'wrd bs ] [ foreach b bs [ set wrd do b ] ] ; it uses this func also | |
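A tiny sketch of what calc-with does, using Janko's definition as posted: each block is evaluated in turn and its result is stored in the given word, so later blocks can refer to the previous result. The arithmetic blocks are made up for illustration, and note that `x` ends up set in the global context.

```rebol
calc-with: func [ 'wrd bs ] [ foreach b bs [ set wrd do b ] ]

; x is 3 after the first block, 30 after the second, 25 after the third.
probe calc-with x [ [ 1 + 2 ] [ x * 10 ] [ x - 5 ] ]   ; 25
probe x                                                ; 25
```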
Graham 24-Apr-2009 [3660] | Steeve's looks faster :) |
Janko 24-Apr-2009 [3661] | yes, I agree :) |
Steeve 24-Apr-2009 [3662x4] | this is the rule for #"-" | #"'" any clean word |
with that you suppress unwanted spaces. it' s a good day --> "it's a good day" | |
so don't add #"'" as a valid letter | |
Graham 24-Apr-2009 [3666] | ahh ... |
Steeve 24-Apr-2009 [3667] | do as you want... :-) |
Graham 24-Apr-2009 [3668x2] | trailing "." or "," gets lost |
Also, I think I have to add ' to the letter charset because words ending in s can have a trailing ' for possession ... | |
Steeve 24-Apr-2009 [3670] | but what if they have inserted a space after or before ' |
Graham 24-Apr-2009 [3671] | so, Miles' wallet and not Miles's wallet |