World: r3wp

[Parse] Discussion of PARSE dialect

Chris
22-Oct-2009
[4547]
Both w1 and w+ appear to be very large values.  Would it be smart 
to perhaps do:

	[[aw1 | w1] any [aw+ | w+]]

Where 'aw1 and 'aw+ are limited to ascii values?
Steeve
22-Oct-2009
[4548x4]
Use R3 (and its optimized complemented bitsets).
Anyway, a bitset with a length of 2 ** 16 is not so huge in memory 
(only 16kb)
64 Kb, sorry
So W1 + W+ = 128 Kb

Is this a problem ?
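A quick worked version of that arithmetic, as a sketch only. It assumes a bitset stores exactly one bit per value with no overhead, and it reads "Kb" above as kilobits; neither assumption is confirmed in the thread.

```rebol
; Assumption: one bit per value, no allocation overhead.
bits:  65536          ; 2 ** 16 possible values
bytes: bits / 8       ; 8192 bytes per bitset (64 kilobits)
total: 2 * bytes      ; 16384 bytes for W1 and W+ together (128 kilobits)
print [bytes total]
```

Under those assumptions the two bitsets together need about 16 kilobytes.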
Chris
22-Oct-2009
[4552]
That's what I'm asking.  Complemented bitsets wouldn't make a difference 
here though as the excluded range is of similar scope, right?
Steeve
22-Oct-2009
[4553x2]
It seems.
If the size is a problem, you can build a function to test each range.
But it will be slow.
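A minimal sketch of the "function to test each range" idea. The name in-ranges? and the ranges themselves are hypothetical and only illustrative; they are not the real W1 or W+ definitions, which are not shown in this thread.

```rebol
; Illustrative ranges only, given as low/high pairs of code points.
ranges: [65 90  97 122  192 214]

; Hypothetical helper: true if CH falls inside any low/high pair.
in-ranges?: func [ch [char!] ranges [block!] /local code] [
    code: to integer! ch
    foreach [lo hi] ranges [
        if all [code >= lo  code <= hi] [return true]
    ]
    false
]

print in-ranges? #"q" ranges   ; 113 lies in 97..122
print in-ranges? #"1" ranges   ; 49 lies in no range
```

Every character costs a linear scan over the pairs, which is why this is slower than a single bitset lookup.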
Chris
22-Oct-2009
[4555x3]
Not size, efficiency.
Allowing 'into to look inside strings can break current usage of 
'into, requiring [and any-block! into ...]
An example: a nested d: [k v] structure where 'k is a word and 'v 
is 'd or any other type:

	data: [k [k "s"]]

In R2, you can validate with d: [word! [into d | skip]]

Now you have to specify d: [word! [and any-block! into d | skip]]; 
otherwise you get an error if 'v is a string!
Sunanda
25-Oct-2009
[4558]
I guess parse can do this too?

   http://stackoverflow.com/questions/1621906/is-there-a-way-to-split-a-string-by-every-nth-seperator-in-python
Will
25-Oct-2009
[4559]
is R2/Forward available for download? thx
Geomol
25-Oct-2009
[4560x2]
Sunanda, one way:

>> out: clear []

>> parse "this-is-a-string" [mark1: any [thru "-" [to "-" | to end] mark2: (append out copy/part mark1 mark2) skip mark1:]]
>> out
== ["this-is" "a-string"]
Another:

>> out: parse "this-is-a-string" "-"
>> forall out [change/part out rejoin [out/1 "-" out/2] 2]
>> out
== ["this-is" "a-string"]
Steeve
25-Oct-2009
[4562]
R3 one liner ;-)

>> map-each [a b] parse "this-is-a-string" "-" [ajoin [a #"-" b]]
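The same idea generalized to every Nth separator, as a sketch for R2. The name split-every is hypothetical. It assumes SEP is a single-character string, since R2's string-splitting PARSE treats the rule as a set of delimiter characters, and it assumes the input contains no double quotes, which that mode handles specially.

```rebol
; Hypothetical helper: split STR on SEP, then rejoin the pieces in groups of N.
split-every: func [str [string!] sep [string!] n [integer!] /local parts out group piece] [
    parts: parse/all str sep            ; R2 string splitting
    out: copy []
    while [not tail? parts] [
        group: copy/part parts n        ; up to N pieces
        parts: skip parts n
        piece: copy first group
        foreach item next group [append append piece sep item]
        append out piece
    ]
    out
]

probe split-every "this-is-a-string" "-" 2
; == ["this-is" "a-string"]
```

A trailing group with fewer than N pieces is kept as it is rather than padded.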
Graham
26-Oct-2009
[4563]
Rebol doesn't have lines :)
BrianH
26-Oct-2009
[4564x2]
Chris, there can be an advantage in R3 to breaking up a bitset into 
more than one bitset on occasion, mostly memory savings. However, 
it might not work as well as you might like since offset and/or sparse 
bitsets aren't supported. Bitsets that involve high codepoints will 
take a lot of RAM no matter what you do.
Will, R2/Forward is already available for download in DevBase (R3 
chat). It is a little outdated though, since I had to take a break 
to rewrite R3's module system. I'll catch up when I get the chance. 
The percentage of R3 that I can emulate has gone down drastically 
since the last update, since R3 has made a lot of changes to basic 
datatype behavior since then. We'll see what we can do.
Steeve
26-Oct-2009
[4566x2]
Something funny.

I spent an hour debugging a parsing rule, 
only to finally understand this:  
never name a rule LIMIT. 
The LIMIT keyword is apparently reserved for future use in parse.
(in R3)
Pekr
26-Oct-2009
[4568x2]
:-)
I thought it was not implemented yet, hence no reservation?
Steeve
26-Oct-2009
[4570]
If you just try to use it, your parsing may crash. So it does 
nothing, but it's there.
Pekr
26-Oct-2009
[4571x2]
Hmm, you are right... But we might need a better error message, no?

>> test: ["123"] parse "123" [test]
== true

>> limit: ["123"] parse "123" [limit]
** Script error: PARSE - invalid rule or usage of rule: end!
** Where: parse
** Near: parse "123" [limit]
posted to Chat/R3/Parse group ...
BrianH
26-Oct-2009
[4573x2]
Keywords that are *planned* to be added should definitely be reserved.
Otherwise adding them would be difficult.
Steeve
26-Oct-2009
[4575]
But it should return a proper error message, as Pekr noticed.
BrianH
26-Oct-2009
[4576]
Agreed :)
Robert
8-Nov-2009
[4577x2]
I used the www.antlr.org tools several years ago with a C/C++ target. 
It's a very cool parser generator toolkit. I just took another look: 
it has emitters for different languages. Maybe one of the parse gurus 
here can check whether we could write a REBOL emitter.
IMO that would be really nice.
JoshF
17-Nov-2009
[4579x4]
Hi! I'm trying to use REBOL's parse to make a simple calculator dialect. 
However, I'm having trouble with escaping entities (I think)...  
Here's my first try (that worked):
>> parse [3 + 2] [some [integer! (print "number") | ['+ | '- ] (print "op")]]
number
op
number
== true
>> parse [3 - 2] [some [integer! (print "number") | ['+ | '- | '* | '/ ] (print "op")]]
** Syntax Error: Invalid word-lit -- '

** Near: (line 1) parse [3 - 2] [some [integer! (print "number") | ['+ | '- | '* | '/ ] (print "op")]]
The second one failed when I tried to extend the dialect with multiply 
(*) and divide (/). After further experimentation, it seems that 
you can't escape the "/". Google has not been helpful here... Does 
anybody have any ideas? I could parse for just a word! instead of 
the +, -, etc., but I wanted parse to do the work of deciding what 
was a valid operation or not. Sorry for the multiple messages, I'm 
still trying to figure this client out... Thanks for any advice!
Ladislav
17-Nov-2009
[4583]
JoshF: REBOL's LOAD cannot load '/, but you can do:

as-lit-word: func ['word [any-word!]] [to lit-word! word]
lit-div: as-lit-word /

parse [3 - 2] [some [integer! (print "number") | ['+ | '- | '* | lit-div] (print "op")]]
JoshF
17-Nov-2009
[4584x2]
Ha! Black magic! That works like a champ, Ladislav, thanks very much!  
I had tried 
>> tdiv: to-word "/"
== /

>> parse [3 / 2] [some [integer! (print "number") | ['+ | '- | '* | tdiv ] (print "op")]]
But had gotten the same error. What makes yours work?
Both tdiv and lit-div type? to a word!...
Ladislav
17-Nov-2009
[4586x2]
My example works because the LIT-DIV variable refers to a lit-word, 
while your tdiv refers to a word.
Check as follows:

type? :lit-div
type? :tdiv
Henrik
17-Nov-2009
[4588x2]
If LOAD won't eat a block, PARSE won't either, so you can test your 
block with LOAD. Some words can't be typed in directly, hence Ladislav's 
solution.
And also hence the expression "a block is or isn't loadable"
JoshF
17-Nov-2009
[4590]
OK... Mechanically, I see what you're saying, but what's the difference 
between a lit-word and a word? The spirit eludes me...
Ladislav
17-Nov-2009
[4591]
just a different datatype
JoshF
17-Nov-2009
[4592x2]
I thought there were only word!s, and that everything else was a more 
concrete type. I guess what I am asking is: what is the purpose of 
lit-words?
Or are they just used for the special case of dealing with a / in 
load? ;-)
Ladislav
17-Nov-2009
[4594x2]
In Parse, lit-words are used for matching, while words are looked 
up for values, which are then used for matching, so the behaviour is 
totally different.
Compare:
>> parse [a] [a]
** Script Error: a has no value
** Near: parse [a] [a]
>> parse [a] ['a]
== true
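The distinction can be sketched for the calculator dialect above. as-lit-word, lit-div and tdiv come from the earlier messages; the rule names op and calc are hypothetical, introduced here only for illustration.

```rebol
; From Ladislav's message: build a lit-word for / without typing '/
as-lit-word: func ['word [any-word!]] [to lit-word! word]
lit-div: as-lit-word /
tdiv: to-word "/"

print type? :lit-div    ; lit-word, so parse matches it literally
print type? :tdiv       ; word, not a literal match rule

; Hypothetical rule names: OP is looked up and found to be a sub-rule block.
op:   ['+ | '- | '* | lit-div]
calc: [integer! any [op integer!]]

probe parse [3 + 2] calc
probe parse [3 / 2] calc    ; the case that '/ could not express
```

Keeping the operators in one named rule means the set of valid operations lives in a single place, which is what the original dialect was after.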
Henrik
17-Nov-2009
[4596]
I think you can say that a word can be an evaluated lit-word. When 
you type a word directly into the console, it is evaluated to the 
value it's bound to. When you enter a lit-word, it's 
evaluated into a word.