r3wp [groups: 83 posts: 189283]

World: r3wp

[Parse] Discussion of PARSE dialect

Rebolek
24-May-2007
[1759x2]
great, thanks!
Is there some way to make this work: parse "aaa" [some "a" "a"] or 
does PARSE just not work this way?
Geomol
24-May-2007
[1761]
What do you mean?
>> parse "aaa" [some "a"]
== true

Why the second "a"?
Rebolek
24-May-2007
[1762]
It may seem strange, I know, but this is an automatically created rule
Geomol
24-May-2007
[1763]
Parsing for [some "a" "a"] will return false, because you've already 
parsed past the "a"s.
Rebolek
24-May-2007
[1764]
OK I need to find some other way :) Is it possible to go back in 
parse? -1 skip doesn't seem to work.
Geomol
24-May-2007
[1765]
I was thinking the same. I seem to remember, that at some time (some 
version of REBOL), -1 skip did work!? Hmm...
Rebolek
24-May-2007
[1766]
Wasn't it just proposed for R3?
Geomol
24-May-2007
[1767]
A clumsy way of doing it:
>> parse "aaa" [some "a" p: (p: skip p -1) :p "a"]
== true
Rebolek
24-May-2007
[1768]
OK thanks, that may help
Anton
24-May-2007
[1769x2]
That's not so clumsy. You want to backtrack and that's what you're 
doing.
obviously you could use (p: back p)
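
A minimal sketch of the two behaviours discussed so far, assuming REBOL 2 semantics (an added illustration, not a post from the thread). SOME is greedy and does not give characters back, so the trailing "a" has nothing left to match unless the position is stepped back by hand:

```rebol
REBOL []

; some "a" consumes all three characters, so the final "a" fails.
print parse "aaa" [some "a" "a"]                      ; false

; Mark the position after the loop, step it back by one, then resume
; from there so the final "a" can match the last character.
print parse "aaa" [some "a" p: (p: back p) :p "a"]    ; true
```

Stepping back by a fixed amount only works when the length of the last match is known, here one character.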
Rebolek
24-May-2007
[1771]
Even better. Thanks Anton. Seems that "-1 skip" should not be that 
hard to implement
BrianH
24-May-2007
[1772]
parse "aaa" [some [p: "a"] :p "a"]
Rebolek
24-May-2007
[1773]
I think this needs (p: back p) before :p.
BrianH
24-May-2007
[1774x2]
Not in my version. The p is set before the position advances past 
the "a", so it is already back.
The p is reset before "a" is consumed - that is why I put [p: "a"] 
in [].
Rebolek
24-May-2007
[1776]
So why does it return 'false here? p is empty on :p.
BrianH
24-May-2007
[1777x3]
Interesting. It seems to be setting the last p before it fails on 
the last iteration of "a".
Clearly I need a temporary.
parse "aaa" [some [p1: "a" (p2: :p1)] :p2 "a"]
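
A short sketch of why the temporary helps, again assuming REBOL 2 (an added illustration; the variable names follow the rules quoted above). In the single-variable version, p: runs once more on the final, failing pass, so p is left at the tail. With p2, the assignment sits after "a" and only runs when the match succeeded:

```rebol
REBOL []

; One variable: the set-word runs before "a" is tried, so the last
; (failing) pass through the loop leaves p at the tail of the input.
print parse "aaa" [some [p: "a"] :p "a"]                ; false
print tail? p                                           ; true

; With a temporary: (p2: :p1) is only reached after "a" matched, so
; p2 keeps the start of the last successful match.
print parse "aaa" [some [p1: "a" (p2: :p1)] :p2 "a"]    ; true
print index? p2                                         ; 3
```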
Anton
24-May-2007
[1780]
I have done this sort of thing before.
Rebolek
24-May-2007
[1781]
temporary, or step back
BrianH
24-May-2007
[1782]
A temporary will work better with parts of unknown size, and be faster 
too.
Rebolek
24-May-2007
[1783]
OK
BrianH
24-May-2007
[1784x2]
Still, you might want to apply rewrite rules to your generated parse 
rules - that code seems a little sloppy.
Peephole fixing?
Rebolek
24-May-2007
[1786]
rewrite rules?
Oldes
24-May-2007
[1787]
that you will not have [some "a" "a"] but just [some "a"]
Rebolek
24-May-2007
[1788]
Well, I'm not exactly sure if that's possible, I have to do some 
tests
BrianH
24-May-2007
[1789x2]
By rewrite rules, I mean something like what Gabriele came up with 
for the rebcode assembler a while ago. Since I helped refine his 
work, I may still have a copy somewhere. I'll take a look.
I'm not sure I helped with this one, now that I think of it. It's 
one of his literate programming projects, here:
http://www.colellachiara.com/soft/Misc/
Look for rewrite.*
Gabriele
24-May-2007
[1791]
that one is different from the one in rebcode, but the principle 
is about the same.
Rebolek
24-May-2007
[1792x2]
OK I'll check it, thanks
is it possible to convert bitset! back to something readable?
Geomol
24-May-2007
[1794]
Define readable! ;-) Maybe you could use a combination of to-string, 
to-binary, debase and things like that.
Rebolek
24-May-2007
[1795]
If I do (a: charset "abc") I also want to do (decharset a) to get 
"abc" :) that's readable ;)
Volker
24-May-2007
[1796]
loop thru all possible chars, print matches ;)
Rebolek
24-May-2007
[1797]
yes, but that's really not the fastest way :)
Geomol
24-May-2007
[1798]
Rebolek, use my hokus-pokus function:

hokus-pokus: func [
	value
	/local a out
][
	either bitset? value [
		a: enbase/base to-binary value 2
		out: copy ""
		forall a [
			if a/1 = #"1" [append out to-char (index? a) - 4]
		]
		out
	][
		42
	]
]

>> a: charset "abc"
>> hokus-pokus a
== "abc"
BrianH
24-May-2007
[1799x3]
bitset-to-string: func [b [bitset!] /local s] [
    s: copy ""
    repeat x 256 [
        x: to-char x - 1
        if find b x [append s x]
    ]
    s
]
; Sorry, error...
bitset-to-string: func [b [bitset!] /local s x c] [
    s: copy ""
    repeat x 256 [
        c: to-char x - 1
        if find b c [append s c]
    ]
    s
]
That should be pretty fast, and it doesn't involve huge binary temporaries.
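
A usage sketch for the corrected function, assuming REBOL 2 with 256-entry bitsets (an added illustration; the function is repeated only so the snippet runs on its own):

```rebol
REBOL []

; BrianH's corrected function, repeated so the sketch is self-contained.
bitset-to-string: func [b [bitset!] /local s x c] [
    s: copy ""
    repeat x 256 [
        c: to-char x - 1
        if find b c [append s c]
    ]
    s
]

probe bitset-to-string charset "abc"            ; "abc"
probe bitset-to-string charset "cab"            ; "abc"
probe bitset-to-string charset [#"0" - #"9"]    ; "0123456789"
```

The result comes back in character-code order, so it recovers the members of the set but not the order or duplicates of the string originally passed to charset.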
Gregg
24-May-2007
[1802]
http://www.codeconscious.com/rebol/scripts/bitsets.r
BrianH
24-May-2007
[1803x3]
To compare those, it looks like

    repeat x 256 [
        c: to-char x - 1
        if find b c [append s c]
    ]

would be faster than

    for i 0 255 1 [

        if parse/all to-string test-char: to-char i reduce [ bitset ] [
            append result test-char
        ]
    ]


because of the reduce, the to-string, the parse, and the use of the 
mezzanine FOR instead of the native REPEAT.
Yours is more flexible, though.
Lots of other interesting stuff on that site.
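
One rough way to check the speed claim, sketched under the assumption of REBOL 2 (an added illustration; time-it is a hypothetical helper, not something from the thread or from bitsets.r, and the loop count of 1000 is arbitrary):

```rebol
REBOL []

; Hypothetical helper: run BODY N times and return the elapsed time.
time-it: func [n [integer!] body [block!] /local t] [
    t: now/precise
    loop n body
    difference now/precise t
]

b: charset "abc"

; Native REPEAT with FIND on the bitset.
print time-it 1000 [
    s: copy ""
    repeat x 256 [
        c: to-char x - 1
        if find b c [append s c]
    ]
]

; Mezzanine FOR with PARSE/ALL on a one-character string.
print time-it 1000 [
    s: copy ""
    for i 0 255 1 [
        c: to-char i
        if parse/all to-string c reduce [b] [append s c]
    ]
]
```

The timings depend on the machine and interpreter build, so only the relative difference between the two printed values is meaningful.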
Gregg
24-May-2007
[1806]
Yes, Brett has built a lot of very cool stuff. Haven't seen him around 
for a while though.
Rebolek
26-May-2007
[1807]
So, this is my first attempt to do regular expressions in REBOL. 
Type on your console:
do http://bolek.techno.cz/reb/regex.r

Some things are missing and it can sometimes get stuck in an endless loop 
when it shouldn't, so please be benevolent :)

But at least the email regex can be translated and parsed successfully.
Henrik
26-May-2007
[1808]
it's quite small