World: r3wp
[Parse] Discussion of PARSE dialect
Chris 9-Feb-2009 [3485x2] | I'd say MacDonald, but I'm not one, so don't know. |
One side of my family have the convenient Ross, the other dropped the Mc to leave Gill (back in time somewhere) | |
Graham 9-Feb-2009 [3487x3] | They call it a big Mac not a big Mc ... odd |
when it's McDonalds | |
I guess they're being inclusive | |
Chris 9-Feb-2009 [3490] | As far as I'm aware, Mc and Mac are interchangeable. |
Graham 9-Feb-2009 [3491x2] | In legal documents? Interesting. |
I'm grabbing my phone book .... | |
BrianH 9-Feb-2009 [3493] | My family switched away from the Scottish spelling too, back in the 19th century when that branch came to the US. |
Chris 9-Feb-2009 [3494] | Didn't say that, just usage. |
BrianH 9-Feb-2009 [3495] | Each family picks one spelling and sticks with it nowadays, mostly because of those legal documents. |
Graham 9-Feb-2009 [3496x2] | Yep, my phone book has the Macleans between the Mcleans |
so the alphabetical ordering system they're using treats mc and mac the same | |
Chris 9-Feb-2009 [3498] | B: from what name? |
BrianH 9-Feb-2009 [3499x2] | Phone book sorting - that's really complex :( |
Halle | |
Chris 9-Feb-2009 [3501] | Sounds nordic... |
BrianH 9-Feb-2009 [3502x2] | To Hawley, the English spelling. To reduce prejudice in the US. |
It's old Celtic. | |
Graham 9-Feb-2009 [3504x2] | Apple MacIntosh ?? |
I think I'll skip Macs | |
Chris 9-Feb-2009 [3506] | As opposed to MacKintosh. |
Steeve 9-Feb-2009 [3507] | you can't guess, you need the list of all clans :) |
Janko 14-Feb-2009 [3508x4] | hi, it's me again with parse problems... I need this concretely to parse out web-page meta tags.. but I distilled the problem out of it to a minimal example.. |
doc1: "start A 1 end start B 2 end" how can you get value of 2 out | |
It works with A because it's first, but because it enters the parse with it and then doesn't match, it doesn't test the B again:
>> parse doc1 [ "start" "A" copy R to "end" (print R) to end ]
1
== true
>> parse doc1 [ "start" "B" copy R to "end" (print R) to end ]
== false | |
I thought it would recheck if I put it into something like SOME [ ], but it doesn't:
parse doc1 [ SOME [ "start" "B" copy R to "end" (print R) to end ] ] | |
kib2 14-Feb-2009 [3512] | Maybe ? parse/all doc1 [ thru "B" copy number to "end" (print number) ] But I'm beginning with parse, so I'm not an expert |
Janko 14-Feb-2009 [3513x2] | This would work in this case, but I need to get "2" only if the sequence before it is exactly the previous two: "start" "B" XX "end". There can be a "B" in other places of the string and it mustn't take that. (I am used to using thru and to as well, but I mustn't use them in this case, because they might just skip to some other "B".)
>> doc1: "start A 1 end xyz B 2 end" ;; in this case it must not take 2
== "start A 1 end xyz B 2 end"
>> parse doc1 [ "start" thru "B" copy R to "end" (print R) to end ] ;; but it will, that's why I can't use to/thru
2
== true | |
Anton 14-Feb-2009 [3515] | some ["start" ["A" | "B"] copy R to "end" "end"] |
Janko 14-Feb-2009 [3516] | ups ... my example above is wrong .. just a sec |
Anton 14-Feb-2009 [3517] | no, hang on... |
Janko 14-Feb-2009 [3518x2] | Anton this would return me 1 probably ? |
(This is the right example. I forgot to use thru above, so the second one wouldn't pass anyway, but the result is the same.)
>> doc1: "start A 1 end start B 2 end"
== "start A 1 end start B 2 end"
>> parse doc1 [ thru "start" "A" copy R to "end" (print R) to end ]
1
== true
>> parse doc1 [ thru "start" "B" copy R to "end" (print R) to end ]
== false
>> parse doc1 [ SOME [ thru "start" "B" copy R to "end" (print R) to end ] ]
== false | |
Anton 14-Feb-2009 [3520] | Is there anything expected between "start" and "A", for instance ? |
Janko 14-Feb-2009 [3521] | I know how to solve this by making it less robust (in this case, relying on there being only one space in between), but this doesn't solve my problem well:
>> parse doc1 [ thru "start B" copy R to "end" (print R) to end ]
2
== true |
Anton 14-Feb-2009 [3522] | No need for that. |
Janko 14-Feb-2009 [3523] | 1 or more spaces (to your question) |
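A minimal sketch of one way to meet both requirements stated so far (take the value only after a literal "start", one or more spaces, then "B"; never jump to a stray "B"), assuming REBOL 2 PARSE semantics. The rule fails at the first "start" because "A" follows it, and PARSE does not go looking for a later "start" on its own, so the loop below gets a fallback alternative that advances one character with SKIP and tries again. The sample string and the `ws` charset are illustrative assumptions, not taken from the thread, and parse/all is used so that whitespace is matched explicitly.

```rebol
REBOL []

; Assumed sample input: a stray "B" outside a start...end pair,
; then a genuine "start B ... end" pair with several spaces.
doc: "start A 1 end xyz B 2 end start   B 3 end"
ws: charset " ^-^/"        ; space, tab, newline (hypothetical helper)

parse/all doc [
    some [
        ; Strict match: "start", whitespace, "B", whitespace, value, "end".
        "start" some ws "B" some ws copy r to "end" "end" (print trim r)
        ; Otherwise advance one character and try the strict match again.
        | skip
    ]
]
```

With this input the sketch should print only 3: the "2" is ignored because no "start" immediately precedes its "B".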
Anton 14-Feb-2009 [3524] | parse doc1 [some [thru "start" ["A" | "B"] copy R to "end" (?? R) "end"]] |
Janko 14-Feb-2009 [3525] | hm.. just a sec so I try few things |
Anton 14-Feb-2009 [3526] | PARSE without the /ALL refinement handles any amount of whitespace. (You will probably end up using parse/all, though. I usually do when parsing HTML.) |
Janko 14-Feb-2009 [3527] | I thought your solution wouldn't work if I reversed the order of A and B in the string, but it seems it does. I would need to know which one is A and which is B, but I think this can be solved by setting some word ( ) inside [ A | B ]. So basically it seems to work. I think I can apply this approach to my concrete problem too, which is this |
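The idea of setting a word inside [ A | B ] can be sketched as follows, again assuming REBOL 2 and parse/all. The names `label`, `value`, `values`, and `ws` are hypothetical; each alternative records which marker matched in a paren, so the collected values stay paired with their markers regardless of order in the string.

```rebol
REBOL []

doc1: "start A 1 end start B 2 end"
ws: charset " ^-^/"        ; space, tab, newline (hypothetical helper)
values: copy []

parse/all doc1 [
    some [
        thru "start" some ws
        ; Remember which alternative matched.
        ["A" (label: "A") | "B" (label: "B")]
        some ws
        ; Copy up to "end", then consume "end" itself.
        copy value to "end" "end"
        (append values reduce [label trim value])
    ]
    to end
]

probe values    ; ["A" "1" "B" "2"]
```

Swapping the A and B pairs in `doc1` should only change the order of the entries in `values`.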
kib2 14-Feb-2009 [3528] | I don't understand why not simply : parse/all doc1 [ thru "start B" copy number to "end" (print number) ] |
Anton 14-Feb-2009 [3529x2] | You leave the pointer at beginning of "end" in the doc1 string. Look at my example, I move TO "end", then I also consume "end". |
... to "end" (?? R) "end"] | |
Janko 14-Feb-2009 [3531] | kib2: because I don't know how many spaces there are between start and B, and in my concrete case I need to have multiple rules. I will show a concrete example |
Anton 14-Feb-2009 [3532] | The second one actually consumes the "end", moving the pointer (the current parse index) through it. |
kib2 14-Feb-2009 [3533] | Anton: Janko just said he wanted to extract the "2", so I don't care where the pointer is, no? |
Anton 14-Feb-2009 [3534] | Mmm.. probably true, but better to be neat and tidy with rules, then they can be reused in slightly different ways and still work as expected. |
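The practical difference between stopping at "end" and consuming it shows up in PARSE's return value. A small sketch, assuming REBOL 2 and using the same `doc1` string from the thread:

```rebol
REBOL []

doc1: "start A 1 end start B 2 end"

; Stops in front of the final "end": the value is copied, but PARSE
; returns false because the input was not fully consumed.
probe parse/all doc1 [thru "start B" copy number to "end"]          ; false

; Also matches "end" itself, so the rule finishes at the tail of the
; input and PARSE returns true.
probe parse/all doc1 [thru "start B" copy number to "end" "end"]    ; true
```

In both calls `number` holds the text between "start B" and "end". Only the second rule reports success, which matters once the rule is reused inside a larger one or its result is tested.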