World: r3wp
[Parse] Discussion of PARSE dialect
Graham 17-May-2009 [3790x4] | I've come across too many situations where parse has broken on me .... |
because the rules I wrote weren't comprehensive enough | |
block parsing can never be used on real world data ... | |
( exaggeration on my part ) | |
Steeve 17-May-2009 [3794] | ;-) |
Maxim 17-May-2009 [3795x2] | remark v1: uses series handling, funcs, and a lot of code to get it to work. probably about 200 lines. remark v2: 20-line parse rule + 5-line stack context object. v2 is 50 times faster, and does twice as much, while being much more flexible in many API aspects. parse is powerful, but it took me 4 years to understand parse well enough to rewrite remark. |
block parsing really is only to create friends in the rebol community ;-) | |
Steeve 17-May-2009 [3797x2] | ahah |
or enemies | |
Maxim 17-May-2009 [3799] | enemies? |
Graham 17-May-2009 [3800] | I think he means energetic discussions |
Steeve 17-May-2009 [3801x2] | yes we fight by throwing snippets |
take that ! >>forever [wait 1000000] | |
Maxim 17-May-2009 [3803] | ok, well we're still friends then, since this was string parsing ;-D |
Henrik 31-May-2009 [3804] | I haven't kept up with the latest parse bugs, but I was wondering about this: >> parse/all {"abc","def"^/"ghi","jkl"} "^/" == ["abc" {,"def"} "ghi" {,"jkl"}] According to my logical sense, it should only split at the newline. |
Maxim 31-May-2009 [3805] | strange bug |
Henrik 31-May-2009 [3806] | it's the quotes that do it: >> parse/all {""""""} "^/" == ["" "" ""] |
Graham 31-May-2009 [3807] | parsing thru quotes is always problematic |
Maxim 31-May-2009 [3808x2] | it's as if it's temporarily switching to block mode within a string mode parsing. :-( |
IIRC carl once said that the simple rule parse was meant to be used to parse CSV... so that might explain it. | |
Henrik 31-May-2009 [3810x2] | strangely enough, it makes parsing CSV with quotes much more difficult, so I had to work around it. |
for proper CSV parsing, we'll need some good functions for R3/Plus instead of trying to do some crappy stuff with PARSE directly. | |
Chris 31-May-2009 [3812] | The only place it seems to be useful is for parsing search or tag strings >> parse {painting "mona lisa" art} none == ["painting" "mona lisa" "art"] But having simple mode act as 'split (in the absence of a 'split function) would be of more value. It's particularly irksome that you can't easily 'split using newlines... |
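A minimal sketch of splitting only at newlines with an explicit PARSE rule, which sidesteps the quote handling of the simple (string-delimiter) mode discussed above. `split-lines` is a hypothetical helper name, not a REBOL built-in, and the sketch assumes REBOL 2 behaviour, where `copy` sets the word to `none` on a zero-length match.

```rebol
; Hypothetical helper (not part of REBOL): split a string at newlines only,
; leaving quotes and commas untouched.
split-lines: func [text [string!] /local out line] [
    out: copy []
    parse/all text [
        any [
            ; take everything up to the next newline, then step over it
            copy line to newline skip
            ; an empty line copies as none in R2, so substitute ""
            (append out any [line copy ""])
        ]
        ; whatever follows the last newline
        copy line to end
        (if all [line not empty? line] [append out line])
    ]
    out
]

probe split-lines {"abc","def"^/"ghi","jkl"}
```

The intent is one string per input line, with the quoted fields still intact, so a separate CSV field rule can be applied to each line afterwards.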
Tomc 2-Jun-2009 [3813] | I am in favor of having a simple split function if it helps rationalize parse |
Ladislav 2-Jun-2009 [3814] | Simple split: check http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse#Simply_split |
Henrik 2-Jun-2009 [3815] | That is not simple. :-) |
Ladislav 2-Jun-2009 [3816x2] | what? just use it |
;-) | |
Henrik 2-Jun-2009 [3818x2] | anyhoo, SPLIT could be backported from R3, if BrianH has not already done that. |
although with upcoming parse changes it might need to be rewritten. SPLIT is rather big. | |
BrianH 2-Jun-2009 [3820] | I haven't gone over the code in SPLIT yet. Something about the API seems wrong, though not as bad as FORMAT. Once it is more settled I'll backport SPLIT to R2 and put it in R2-Forward. |
Pekr 5-Jun-2009 [3821] | I am trying to create a primitive script which investigates user/group/system rights on our filesystem (no Identity Management system here). The trouble is that MS programmers probably have some weak days too :-) They forgot to add one stupid newline to the output of ICACLS, so I get the following kind of output: L:\Sprava\Personalni usek WALMARK\RUR:(OI)(CI)(F) L:\Sprava\Personalni usek (OI)(CI)(F) L:\Sprava\Personalni usek NT AUTHORITY\RUR:(OI)(CI)(F) L:\Sprava\Personalni usek BUILTIN\RUR:(OI)(CI)(F) I need to come up with rules which will allow me to separate the path from the first user/group/rights info. The problem is that space is a regular character in the path. So how to easily create a rule for the above cases? The path is - "L:\Sprava\Personalni usek" |
BrianH 5-Jun-2009 [3822] | If you know the path ahead of time you can skip past its length plus one, then start parsing. |
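A minimal sketch of the "skip past its length plus one" idea, assuming the path really is known ahead of time. The words `known-path`, `line` and `rest` are illustrative names, and the sample line is taken from the output quoted above.

```rebol
known-path: {L:\Sprava\Personalni usek}
line: {L:\Sprava\Personalni usek WALMARK\RUR:(OI)(CI)(F)}

; Variant 1: plain series navigation.
; Skip the path length plus one for the separating space.
rest: skip line (length? known-path) + 1
probe rest

; Variant 2: the same idea as a parse rule.
; A word holding a string matches that string literally.
parse/all line [known-path " " copy rest to end]
probe rest
```

Variant 1 returns a position in the original string rather than a copy; use `copy` on it if the line buffer is going to be reused.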
Pekr 5-Jun-2009 [3823x2] | no, I have few megabytes, done from one call to ICACLS command line .... but never mind - ICACLS is not good tool. I just wanted to use REBOL here. I will have to start using VBScript for such stuff ... |
The programmer which did the output has to be pretty much idiot though ... | |
BrianH 5-Jun-2009 [3825] | Agreed. I mean, each line starts with a path - is it the same path every time, or a different one? |
Pekr 5-Jun-2009 [3826x3] | different one ... |
I don't want to put output here, as this group is web public ... | |
ICACLS L:\my-path\*. /T > result.txt ...... /T means recursion ... so it was easy job at first sight ... | |
Ladislav 5-Jun-2009 [3829] | but, how do you *know* where the path ends, then? |
Pekr 5-Jun-2009 [3830] | exactly :-) That is why I can see it as a bug on programmer's side. OK, here's one example: L:\Some-path\Some subidr name here WALMARK\RUR:(OI)(CI)(F) BUILTIN\Administrators:(OI)(CI)(F) WALMARK\User1:(CI)(RX) WALMARK\Some group:(OI)(CI)(M) |
BrianH 5-Jun-2009 [3831] | What info do you need, the path or what comes after it? If the data after it is only a limited set of possible answers, you can try to skip to those in turn. |
Pekr 5-Jun-2009 [3832] | So I start from right, making longer rule as [rights-section | domain-section user-section rights-section] |
Ladislav 5-Jun-2009 [3833] | ...makes no sense to define a rule, if you don't actually know where the path ends, as I see it |
Pekr 5-Jun-2009 [3834] | There is one exception - "NT AUTHORITY" ... I would break both hands of the designer, which allowed this one exception - space in domain name is not normally allowed :-) |
BrianH 5-Jun-2009 [3835] | parse/all/case line [[to "WALMARK" | to "BUILTIN"] a: (do something)] |
Ladislav 5-Jun-2009 [3836] | aha, so, you actually know, where the path ends?, you didn't tell |
BrianH 5-Jun-2009 [3837] | Or to "NT AUTHORITY" |
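A minimal sketch that fills in the `to` alternatives suggested above so that the path and the trailing entry are both captured. It assumes the only domain prefixes are the three named in this thread, and that each is preceded by a space and followed by a backslash; `folder` and `entry` are illustrative names.

```rebol
line: {L:\Sprava\Personalni usek WALMARK\RUR:(OI)(CI)(F)}
folder: entry: none

parse/all/case line [
    ; everything up to the first known " DOMAIN\" marker is the path
    copy folder [to " WALMARK\" | to " BUILTIN\" | to " NT AUTHORITY\"]
    skip                ; step over the separating space
    copy entry to end   ; e.g. the DOMAIN\USER:(RIGHTS) part
]

probe folder
probe entry
```

Note that the alternatives are tried in the order written, not by which marker occurs first in the line, and a folder name that itself contains one of the markers would be cut short. That is acceptable only if each line carries a single entry after the path.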
Pekr 5-Jun-2009 [3838x2] | But you can define following rule: domain-chars: charset [#"A" - #"Z" "-"] domain-rule: [ "NT AUTHORITY\" (domain: "NT AUTHORITY") | copy domain some domain-chars "\" ] domain-user-rights: [rights-rule | domain-rule user-rule rights-rule] |
So except for NT AUTHORITY, there can't be any space in the domain. So I handle two cases: the one where there are only rights on the first line, (OI)(CI) etc., and the second case - DOMAIN\USER-GROUP:(RIGHTS) | |
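A sketch of how the rules above might be completed to locate the end of the path. `domain-chars` and `domain-rule` are taken from the message above; `user-chars`, `rights-chars`, `user-rule`, `rights-rule` and `split-acl-line` are assumptions added for illustration, since their definitions are not shown in the thread. The idea is to try the entry rule at every space and accept the first position from which it matches through to the end of the line.

```rebol
domain-chars: charset [#"A" - #"Z" "-"]
user-chars:   complement charset ":\"      ; assumption: no ":" or "\" in names
rights-chars: charset [#"A" - #"Z" ","]    ; assumption: flags like OI, CI, F

domain-rule: [
    "NT AUTHORITY\" (domain: "NT AUTHORITY")
    | copy domain some domain-chars "\"
]
user-rule:   [copy user some user-chars ":"]
rights-rule: [copy rights some ["(" some rights-chars ")"]]

; Hypothetical helper: returns [folder domain user rights] or none.
; The rules set global words, in the same style as domain-rule above.
split-acl-line: func [line [string!] /local mark] [
    folder: domain: user: rights: none
    parse/all/case line [
        some [
            ; try an entry at this space; it must reach the end of the line
            mark: " " (domain: user: none)
            [rights-rule | domain-rule user-rule rights-rule] end
            (folder: copy/part line mark)
            | skip   ; otherwise move one character on and retry
        ]
    ]
    all [folder reduce [folder domain user rights]]
]

probe split-acl-line {L:\Sprava\Personalni usek NT AUTHORITY\RUR:(OI)(CI)(F)}
probe split-acl-line {L:\Sprava\Personalni usek (OI)(CI)(F)}
```

Excluding the backslash from user names is what stops an upper-case folder name followed by `\` from being mistaken for a domain. A folder whose name ends in something that looks exactly like an entry would still be ambiguous, as noted earlier in the discussion.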