r3wp [groups: 83 posts: 189283]
World: r3wp

[Parse] Discussion of PARSE dialect

Chris
22-Oct-2009
[4555x3]
Not size, efficiency.
Allowing 'into to look inside strings can break current usage of 'into, requiring [and any-block! into ...]
An example: a nested d: [k v] structure where 'k is a word and 'v is 'd or any other type:

	data: [k [k "s"]]

In R2, you can validate with d: [word! [into d | skip]]


Now you have to specify d: [word! [and any-block! into d | skip]];
otherwise you get an error if 'v is a string!
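
A minimal sketch of the R2 validation described above, assuming R2 semantics where 'into simply fails to match on a non-block value. The helper name valid? is made up for illustration and is not part of REBOL:

```rebol
REBOL [Title: "Nested k/v validation sketch (R2 semantics assumed)"]

; d: a word key followed by either a nested [k v] block or any single value.
d: [word! [into d | skip]]

; valid? is a hypothetical helper name, not part of REBOL.
valid?: func [data [block!]] [parse data d]

print valid? [k [k "s"]]   ; nested block as the value
print valid? [k "s"]       ; plain string as the value
print valid? [k]           ; key with no value at all
```

Under the R2 assumption the first two checks should succeed and the last should fail; with an 'into that looks inside strings, the guarded rule from the post would be needed instead.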
Sunanda
25-Oct-2009
[4558]
I guess parse can do this too?

   http://stackoverflow.com/questions/1621906/is-there-a-way-to-split-a-string-by-every-nth-seperator-in-python
Will
25-Oct-2009
[4559]
Is R2/Forward available for download? Thanks.
Geomol
25-Oct-2009
[4560x2]
Sunanda, one way:

>> out: clear []

>> parse "this-is-a-string" [mark1: any [thru "-" [to "-" | to end] 
mark2: (append out copy/part mark1 mark2) skip mark1:]]
>> out
== ["this-is" "a-string"]
Another:

>> out: parse "this-is-a-string" "-"
>> forall out [change/part out rejoin [out/1 "-" out/2] 2]
>> out
== ["this-is" "a-string"]
Steeve
25-Oct-2009
[4562]
R3 one liner ;-)

>> map-each [a b] parse "this-is-a-string" "-" [ajoin [a #"-" b]]
Graham
26-Oct-2009
[4563]
Rebol doesn't have lines :)
BrianH
26-Oct-2009
[4564x2]
Chris, there can be an advantage in R3 to breaking up a bitset into 
more than one bitset on occasion, mostly memory savings. However, 
it might not work as well as you might like since offset and/or sparse 
bitsets aren't supported. Bitsets that involve high codepoints will 
take a lot of RAM no matter what you do.
Will, R2/Forward is already available for download in DevBase (R3 
chat). It is a little outdated though, since I had to take a break 
to rewrite R3's module system. I'll catch up when I get the chance. 
The percentage of R3 that I can emulate has gone down drastically 
since the last update, since R3 has made a lot of changes to basic 
datatype behavior since then. We'll see what we can do.
Steeve
26-Oct-2009
[4566x2]
Something funny.

I spent an hour debugging a parsing rule before I finally understood this: 
never name a rule LIMIT. 
Apparently the LIMIT keyword is reserved for future use in parse 
(in R3).
Pekr
26-Oct-2009
[4568x2]
:-)
I thought it was not implemented yet, hence no reservation?
Steeve
26-Oct-2009
[4570]
If you just try to use it, your parsing may crash. So it does 
nothing, but it's there.
Pekr
26-Oct-2009
[4571x2]
Hmm, you are right .... But we might need a better error message, no?

>> test: ["123"] parse "123" [test]
== true

>> limit: ["123"] parse "123" [limit]
** Script error: PARSE - invalid rule or usage of rule: end!
** Where: parse
** Near: parse "123" [limit]
posted to Chat/R3/Parse group ...
BrianH
26-Oct-2009
[4573x2]
Keywords that are *planned* to be added should definitely be reserved.
Otherwise adding them would be difficult.
Steeve
26-Oct-2009
[4575]
But it should return a proper error message, as Pekr noticed.
BrianH
26-Oct-2009
[4576]
Agreed :)
Robert
8-Nov-2009
[4577x2]
I used the www.antlr.org tools several years ago with a C/C++ target. 
It's a very cool parser generator toolkit. I just took a look again. 
It has emitters for different languages. Maybe one of the parse gurus 
here could check whether we can write a REBOL emitter.
IMO that would be really nice.
JoshF
17-Nov-2009
[4579x4]
Hi! I'm trying to use REBOL's parse to make a simple calculator dialect. 
However, I'm having trouble with escaping entities (I think)...  
Here's my first try (that worked):
>> parse [3 + 2] [some [integer! (print "number") | ['+ | '- ] (print 
"op")]]
number
op
number
== true
>> parse [3 - 2] [some [integer! (print "number") | ['+ | '- | '* 
| '/ ] (print "op")]]
** Syntax Error: Invalid word-lit -- '

** Near: (line 1) parse [3 - 2] [some [integer! (print "number") 
| ['+ | '- | '* | '/
 ] (print "op")]]
The second one failed when I tried to extend the dialect with multiply 
(*) and divide (/). After further experimentation, it seems that 
you can't escape the "/". Google has not been helpful here... Does 
anybody have any ideas? I could parse for just a word! instead of 
the +, -, etc., but I wanted parse to do the work of deciding what 
was a valid operation or not. Sorry for the multiple messages, I'm 
still trying to figure this client out... Thanks for any advice!
Ladislav
17-Nov-2009
[4583]
JoshF: REBOL's LOAD cannot load '/, but you can do:

as-lit-word: func ['word [any-word!]] [to lit-word! word]
lit-div: as-lit-word /

parse [3 - 2] [some [integer! (print "number") | ['+ | '- | '* | 
lit-div] (print "op")]]
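
A rough sketch of how the same trick could sit inside a named operator rule, here collecting the operators that were matched. The names op-rule, ops, pos and ok? are assumptions made for this illustration only:

```rebol
REBOL [Title: "Calculator token check (sketch)"]

; Ladislav's helper: turn a literally-quoted word into a lit-word! value.
as-lit-word: func ['word [any-word!]] [to lit-word! word]
lit-div: as-lit-word /

; op-rule, ops, pos and ok? are hypothetical names for this sketch.
op-rule: ['+ | '- | '* | lit-div]
ops: copy []

ok?: parse [3 / 2 * 4] [
    some [
        integer!
        | pos: op-rule (append ops first pos)  ; pos marks the operator just matched
    ]
]

print ok?
probe ops
```

The rule block itself stays loadable because the slash only ever appears through the lit-div variable, never as a literal '/ in the source.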
JoshF
17-Nov-2009
[4584x2]
Ha! Black magic! That works like a champ, Ladislav, thanks very much!  
I had tried 
>> tdiv: to-word "/"
== /

>> parse [3 / 2] [some [integer! (print "number") | ['+ | '- | '* 
| tdiv ] (print "op")]]
But I got the same error. What makes yours work?
type? reports word! for both tdiv and lit-div...
Ladislav
17-Nov-2009
[4586x2]
My example works because the LIT-DIV variable refers to a lit-word, 
while your tdiv refers to a word.
Check as follows:

type? :lit-div
type? :tdiv
Henrik
17-Nov-2009
[4588x2]
If LOAD won't eat a block, PARSE won't either, so you can test your 
block with LOAD. Some words can't be typed in directly, hence Ladislav's 
solution.
And hence also the expression "a block is or isn't loadable".
JoshF
17-Nov-2009
[4590]
OK... Mechanically, I see what you're saying, but what's the difference 
between a lit-word and a word? The spirit eludes me...
Ladislav
17-Nov-2009
[4591]
just a different datatype
JoshF
17-Nov-2009
[4592x2]
I thought there were only word!s, and that everything else was a more 
concrete type. I guess what I am asking is: what is the purpose of 
lit-words?
Or are they just used for the special case of dealing with a / in 
load? ;  - )
Ladislav
17-Nov-2009
[4594x2]
In Parse, lit-words are matched literally against the input, while 
words are looked up and the values they refer to are then used for 
matching, so the behaviour is totally different.
Compare:
>> parse [a] [a]
** Script Error: a has no value
** Near: parse [a] [a]
>> parse [a] ['a]
== true
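
A small sketch of the same distinction with both kinds of word in one rule. The rule name num is an assumption for illustration:

```rebol
REBOL [Title: "Words vs lit-words in PARSE rules (sketch)"]

; num is a hypothetical rule name. As a plain word inside a rule it is
; looked up, and the block it refers to is used as a sub-rule.
num: [integer!]

print parse [a 1] ['a num]   ; 'a matches the literal word a, num is looked up
print parse [b 1] ['a num]   ; the input starts with b, so 'a cannot match
```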
Henrik
17-Nov-2009
[4596]
I think you can say that a word can be an evaluated lit-word. When 
you type a word directly into the console, you evaluate the 
word into the value that it's bound to. When you enter a lit-word, it's 
evaluated into a word.
JoshF
17-Nov-2009
[4597]
OK... So, let me paraphrase... As far as REBOL is concerned, lit-words 
are used only by the parse dialect to represent a thing to match 
to, whereas words are evaluated to find the thing to match to. However, 
because of parsing constraints in REBOL as a whole (the significance 
of "/" when dealing with indexable variables), there's no way to 
"escape" the slash into an unevaluated (literal) word without the 
dodge you showed me.
Ladislav
17-Nov-2009
[4598]
right
JoshF
17-Nov-2009
[4599]
OK... Thanks very much. That helps a lot. I was right down the road 
to writing an expression parser, then that whole slash thing stopped 
me dead in my tracks. Now I should be able to get into some _real_ 
trouble!
Henrik
17-Nov-2009
[4600]
a trap that you might fall into:

type? first [none]
== word!

type? first reduce [none]
== none!

type? first reduce ['none]
== word!
Ladislav
17-Nov-2009
[4601]
...except that lit-words are also used in the Do dialect 
(= "as far as REBOL is concerned", as you say) when you want to write an 
expression that evaluates to a specific word. E.g. the expression:

'a

evaluates to the same value as the expression:

first [a]

, which happens to be the word A
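
A quick way to check this claim, sketched with standard type predicates only:

```rebol
REBOL [Title: "Lit-words in the Do dialect (sketch)"]

; Evaluating a lit-word yields the corresponding word.
print equal? 'a first [a]
print word? 'a

; Inside an unevaluated block the lit-word stays a lit-word.
print lit-word? first ['a]
```

All three checks are expected to print true.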
Pekr
17-Nov-2009
[4602]
http://www.rebol.com/docs/core23/rebolcore-15.html#section-6
Henrik
17-Nov-2009
[4603]
Depending on the situation, it can be hard to tell whether you are 
dealing with a word or a specific value. That's the price for freely 
interchangeable code/data. :-)

a: [none]

b: copy a

b: reduce b ; me doing this behind your back

a
== [none] ; word!

b
== [none] ; none!
Pekr
17-Nov-2009
[4604]
it is a bit difficult to understand recursive rules, but :-)