World: r3wp
[Parse] Discussion of PARSE dialect
Ladislav 17-Nov-2009 [4601] | ...except for the fact, that lit-words are used in the Do dialect (= when Rebol is concerned, as you say), when you want to write an expression, which evaluates to a specific word, so, e.g. the expression: 'a evaluates to the same value as the expression: first [a] , which happens to be the word A |
Pekr 17-Nov-2009 [4602] | http://www.rebol.com/docs/core23/rebolcore-15.html#section-6 |
Henrik 17-Nov-2009 [4603] | Depending on the situation, it can be hard to tell whether you are dealing with a word or a specific value. That's the price for freely interchangeable code/data. :-)
a: [none]
b: copy a
b: reduce b ; me doing this behind your back
a == [none] ; word!
b == [none] ; none! |
Pekr 17-Nov-2009 [4604] | it is a bit difficult to understand recursive rules, but :-) |
JoshF 17-Nov-2009 [4605x2] | The difference between what I'm doing and what you linked to is that it's working against a string, while I'm doing a dialect, no? |
I understood that character stuff wouldn't work in a dialect -- but my understanding is imperfect. | |
Ladislav 17-Nov-2009 [4607] | right, what you are doing is a dialect |
JoshF 17-Nov-2009 [4608] | OK. Thanks again for the timely help! I have to run off to work (which is firewalled up the yang), so you'll be able to avoid more silly questions from me for at least the next ten hours! ; - ) |
Pekr 17-Nov-2009 [4609] | Dialect is a dialect. The only difference in string vs block parsing, imo is, that with block parsing, you are using REBOL datatypes to identify/match your types, whereas with string you are more "free-form" :-) |
Janko 2-Dec-2009 [4610x2] | I know I was stopped by parse on some occasions. I think every time the problem would have been solvable if I had, for example, >> to [ "A" | "B" ] where the parser would check where A is and where B is and go to the closest one. |
from Advocacy --> Graham [ to "A" | to "B" ] won't work as I want .. I will try to find a concrete example | |
Graham 2-Dec-2009 [4612] | this is a current parse limitation. |
Janko 2-Dec-2009 [4613] | parse "start 111 end start 222 finish" [ some [ thru "start" copy NUMS [ to "finish" | to "end" ] ] ] this won't work |
Graham 2-Dec-2009 [4614x2] | change it |
[ to "end" | to "finish" ] | |
Janko 2-Dec-2009 [4616] | ok .. but I meant that you have "start 111 end start 222 finish start 333 end " then it won't work :) |
Graham 2-Dec-2009 [4617] | change the rule again |
Janko 2-Dec-2009 [4618] | I was trying to show an example where you have two possible endings and you want to process both (and you can handle them differently with parens) but you don't know in what order they will come or anything |
Graham 2-Dec-2009 [4619x3] | In this case I would use block parsing ... then I'm no expert in parsing |
parse string [ some [ "start" digits "end" | "start" digits "finish" ]] | |
your problem is because you are using 'thru which breaks the other rule | |
Janko 2-Dec-2009 [4622x2] | yes, then you have to do charset parsing (but I don't know that yet :) ) .. I was just trying to say that if there were a way to say something like: to any [ "A" | "B" ] and it would go to the closest one, A LOT of problems with parse would be easily solvable |
you can use to but it still won't work | |
Graham 2-Dec-2009 [4624x3] | [ some [ "start" digits [ "end" | "finish" ] ] ] should work |
to go to the closest one .. means it has to try all the rules?? | |
and see which has the best fit ? | |
Janko 2-Dec-2009 [4627x2] | no wgih is the closest .. look at this example (I hope this will be better) |
whigh = which | |
Graham 2-Dec-2009 [4629x2] | I know what you mean .. so you have to order your rules knowing what the data looks like |
If you don't know what pattern the data is .. you can't parse it with anything. | |
Janko 2-Dec-2009 [4631x4] | parse "This is Apple . This is Windows ! This is Linux . This is Amiga ." [ some [ "This is" copy IT (print IT) to [ "." | "!" ] ] |
The pattern is known ... the sentence starts with "This is" and can end with . or ! but they can come in any order .. if you try to parse with "." first you will get ---- oops, some errors up there .. just a sec | |
>> parse "This is Apple . This is Windows ! This is Linux . This is Amiga ." [ some [ thru "This is" copy IT [to "." | to "!" ] (print IT) ]]
Apple
Windows ! This is Linux
Amiga | |
this is common to all the problems that I am describing .. if I had > to [ "." | "!" ] and parse would find both and go to the one that is closer, it would be solved. | |
Graham 2-Dec-2009 [4635] | charset [ #"!" #"." ] |
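A minimal sketch of how the charset suggestion above could be applied to the "This is" example, assuming REBOL 2 and parse/all (so whitespace is not skipped automatically). The names enders, body and items are made up for illustration; this only works because both endings are single characters.

```rebol
; Assumption: every sentence ends with a single "." or "!" character
enders: charset [#"!" #"."]
body: complement enders        ; any character that is not an ending

items: copy []                 ; hypothetical collector for the results
parse/all "This is Apple . This is Windows ! This is Linux . This is Amiga ." [
    some [
        thru "This is"
        copy it some body      ; stops at whichever ending comes first
        enders                 ; consume the ending itself
        (append items trim it)
    ]
]
probe items                    ; ["Apple" "Windows" "Linux" "Amiga"]
```

Because the copy stops at the first character that belongs to enders, the order in which "." and "!" appear in the input no longer matters.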
Janko 2-Dec-2009 [4636x2] | ok , you again found a solution to my specific problem :)) |
BUT .. what if I want to have control there .. or if for the sake of example it's a more complex multicharacter difference like "<DOT>" "<EXCLAMATION>" | |
Graham 2-Dec-2009 [4638] | Janko, best thing to do is show us a string you can't parse ... and someone will show you how to do it. |
Janko 2-Dec-2009 [4639x4] | >> parse "I like Apple . I like Windows ! I like Linux . I like Amiga ." [ some [ thru "I like" copy IT [to "." ( prin "so so: ") | to "!" (prin "very much: ") ] (print IT) ]]
so so: Apple
so so: Windows ! I like Linux
so so: Amiga |
I don't have a real example right now :) I had them a few times before and I also asked here about them, and I solved them with your help somehow | |
I just started talking about this as a general limitation of parse that I meet a lot of times, and I suppose Paul could have met it when trying to parse CSV | |
janko ,"some\"thing92!","graham" I am not sure but I think here you have the same problem | |
Gregg 2-Dec-2009 [4643x3] | It's not necessarily a PARSE limitation, but there are things we'd like PARSE to do that aren't always reasonable. :-) TO and THRU can work very well, but that doesn't mean they'll work for every situation. You may have to use rules where you check for your target value or just SKIP, marking locations in the input as you go. |
CSV parsing is an issue, because REBOL handles some inputs well, but fails for what may be a common way things are formatted. "CSV" isn't always as simple as it sounds. | |
That said, if you know the format (e.g. WRT quotes and escapes), it can be done with PARSE. It just may not be a one-liner. | |
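One way to read the "check for your target value or just SKIP, marking locations as you go" advice above, applied to the multi-character "<DOT>" / "<EXCLAMATION>" case raised earlier. This is a sketch assuming REBOL 2 and parse/all; ending, thru-ending, start, stop, kind and items are hypothetical names. The rule is recursive, so it is only meant for short inputs.

```rebol
text: "I like Apple <DOT> I like Windows <EXCLAMATION> I like Linux <DOT> I like Amiga <DOT>"

items: copy []
kind: none

; Each alternative records which ending was found
ending: [
    "<DOT>" (kind: "so so")
    | "<EXCLAMATION>" (kind: "very much")
]

; Mark the position, then either match an ending right here,
; or skip one character and try again (recursive rule)
thru-ending: [stop: ending | skip thru-ending]

parse/all text [
    some [
        thru "I like" start:
        thru-ending
        (append items reduce [kind trim copy/part start stop])
    ]
]
probe items
; ["so so" "Apple" "very much" "Windows" "so so" "Linux" "so so" "Amiga"]
```

Since every position is tested against both endings before moving on, the rule stops at whichever ending is closest, and the paren attached to that ending is the one that runs.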
Janko 2-Dec-2009 [4646x2] | I know parsing csv can be messy ... at least at this high level I don't know how to do it with escapes and commas in etc |
and I know everything has limitations ... this functionality (an OR that takes whichever alternative appears first) would in practice solve many cases for me | |
Graham 2-Dec-2009 [4648] | you have to turn off parse's default delimiters and use bitsets |
Janko 2-Dec-2009 [4649] | (aha bitsets.. I was calling them charsets up there) |
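A rough sketch of the bitset approach for the CSV line quoted above, assuming REBOL 2, parse/all (to turn off the default delimiters), double-quoted fields, and backslash as the escape character. The names plain, inside, quoted, field and fields are invented here, and the escapes are kept in the copied text rather than decoded.

```rebol
line: {janko ,"some\"thing92!","graham"}

plain:  complement charset {,"}    ; characters allowed in an unquoted field
inside: complement charset {"\}    ; ordinary characters inside quotes

fields: copy []
val: none

; A quoted field: ordinary characters, or a backslash followed by any character
quoted: [{"} copy val any [some inside | "\" skip] {"}]

; COPY sets val to none on an empty match in REBOL 2, hence the fallback
field: [
    quoted (append fields any [val copy ""])
    | copy val any plain (append fields any [val copy ""])
]

ok: parse/all line [field any ["," field]]
probe fields    ; three fields: janko (with trailing space), some\"thing92!, graham
```

The escaped quote inside the second field is consumed by the backslash alternative, so it does not end the field; the "!" and the commas need no special handling because only the characters in the two bitsets steer the rules.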
Graham 2-Dec-2009 [4650] | BTW, Bolek wrote a regex engine in Rebol ... |