World: r3wp

[Parse] Discussion of PARSE dialect

Henrik
20-Aug-2010
[5110]
because the R3 behavior is correct. you are not parsing spaces in 
the above example.
Gregg
20-Aug-2010
[5111]
Should the docs say that a block! rule implies /all?
JoshF
1-Sep-2010
[5112]
Hi! Quick question about parsing REBOL code itself... I'm putting 
together an entry for a contest which is line-constrained (no more 
than 250 SLOC), so I want to crush my code down as much as possible 
while still having something that actually looks like code (I know 
about using the compression, but I want something that looks like 
a program).


I'm starting with Carl's REBOL parser from the cookbook, but it seems 
to skip the colons for initializers ("x: x + 1" -> [x x + 1]). Here's 
my current hack of his parser:

tokens: copy []
parse read %contest-entry.r blk-rule: [
    some [
        str:
        newline |

        #";" [thru newline | to end] new: (probe copy/part str new) |
        [#"[" | #"("] (append tokens str/1) blk-rule |
        [#"]" | #")"] (append tokens str/1) break |

        skip (set [value new] load/next str  append tokens :value) :new
    ]
]


Any ideas why it might be skipping the vital ":" character? Thanks 
very much!
Anton
1-Sep-2010
[5113]
What version of Rebol are you using?
Seems to work ok for me in R2.
What's the input which fails?
Gregg
1-Sep-2010
[5114]
Works for me too.
JoshF
1-Sep-2010
[5115]
Hi! Thanks for taking a look at the code. 


I went over it again, and it seems that part of the problem was that 
the parsed values weren't converted into strings as I had expected. 
I.e. if you look at the output of the code snippet above, it seems OK, 
but examining the types of the data in the tokens block turns up 
things that don't convert to strings too well without help. I've 
puzzled over Carl's pretty printer, and I _think_ I understand why 
now... 


Either way, I was able to modify it to give me the kind of output 
I wanted. To repay you for your kind attention, I will post my code 
here, but in crushed form, so it doesn't take up too much space... 
; - )


REBOL [ Title: "REBOL Compressor" ] emit-space: func [ pos ] [ append out pick
[ #" " "" ] found? not any [ find "[(" last out find ")]" first pos ] ] emit:
func [ from to ] [ emit-space from word: copy/part from to long: ( length? out )
+ length? word if 80 < long [ append lines out out: copy "" ] append out
copy/part from to ] lines: copy [ ] clean-script: func [
Returns new script text with standard spacing.
 script "Original Script text"
/local str new ] [ out: append clear copy script newline parse script blk-rule:
[ some [ str: some [ newline ] ( ) | #";" [ thru newline | to end ] new: ( ) | [
#"[" | #"(" ] ( emit str 1 ) blk-rule | [ #"]" | #")" ] ( emit str 1 ) break |
skip ( set [ value new ] load/next str emit str new ) :new ] ] append lines out
remove lines/1 print [ length? lines "lines." ] lines ] write/lines %crushed.r
clean-script read %c.r print read %crushed.r

Thanks!
Gregg
2-Sep-2010
[5116]
If you note that load/next is used when values are parsed, you can 
see why values aren't strings. MOLD can be your friend as FORM (and 
PRINT) will hide datatype details from you. e.g.

>> print first [x]
x
>> print first [x:]
x
>> print first ['x]
x
>> print first [:x]
x
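
For comparison, a small console sketch of what MOLD returns for the same four values (not part of the original session, just the same inputs run through MOLD instead of PRINT):

```rebol
>> mold first [x]
== "x"
>> mold first [x:]
== "x:"
>> mold first ['x]
== "'x"
>> mold first [:x]
== ":x"
```

So the colon is still there in the loaded value; it only disappears when the value is formed for printing.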
Maxim
2-Sep-2010
[5117]
which is why it's a good habit to use probe instead of print in 
most cases where you trace data
Anton
2-Sep-2010
[5118]
JoshF, if this script stands alone then I would make these changes:

- Add as locals to EMIT: WORD and LONG.

- Add to CLEAN-SCRIPT's locals: LINES, OUT, EMIT-SPACE, EMIT, BLK-RULE, 
VALUE
- Move into CLEAN-SCRIPT's body:
	1	lines: copy []
	2	The EMIT-SPACE function
	3	The EMIT function
- Change this line:
	out: append clear copy script newline 
to:
	out: copy ""

(There's no point copying the string SCRIPT when the next thing you 
do is CLEAR it.)
- Remove this line:
	remove lines/1

(There seems no point in initializing OUT with a single newline char 
if it is only to be removed ultimately.)


After that, you should have only one word, CLEAN-SCRIPT, defined 
globally, referring to a function with no side-effects.
Fork
5-Sep-2010
[5119]
I couldn't figure out how to make DO work in Parse.  My answer here 
shows an example that I tried: http://stackoverflow.com/questions/3478589/rebol-parse-problem
BrianH
5-Sep-2010
[5120x3]
R3's PARSE DO operation only works with block parsing, not string 
parsing.
>> parse [2] [do (1 + 1)]
== true
Given your example, you wouldn't want to use DO anyhow since you 
would be creating a charset in a loop, creating a new charset for 
each iteration. It is better to create it once ahead of time.
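
A minimal sketch of "create it once ahead of time", assuming a made-up charset name DIGITS and a made-up input string (neither is from the Stack Overflow example):

```rebol
digits: charset "0123456789"    ; built once, outside the rule

; the rule only refers to the word, so no charset is created while parsing
probe parse "2010" [some digits]    ; true
probe parse "20x0" [some digits]    ; false
```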
Fork
5-Sep-2010
[5123x2]
Hrrrm.  I haven't messed with block parsing much, but it breaks any 
obvious intuition that >> parse [[2]] [do [(1 + 1)]] is true and 
>> parse [2] [do [(1 + 1)]] is also true.  :-{
Is there a foundational reason for DO not being available for string 
parsing, or is it just not implemented?
Micha
5-Sep-2010
[5125]
how to set local vars in parse ?

rule: [ "text" (var: "local") ]

var: "global"

f: func [ /local var ] [parse "test" rule  return var ]


f  ; result = none  not "local"

what to do get  result = "local"
Nicolas
5-Sep-2010
[5126]
;Does this help?
rule: [ "text" (var: "local") ]
var: "global"

f: func [ /local var ] [var: "local" parse "test" rule  return var ]
f
Graham
5-Sep-2010
[5127x2]
f: has [ var rule ][
	var: none
	rule: [ "text" end (var: copy "local" ) ]
	parse/all [ "text" ] rule 
	var 	
]
you're using a block parse rule to parse a string .. so we switched 
to using a data block to parse
Nicolas
5-Sep-2010
[5129]
;This also works
var: "global"
f: has [var] [
	rule: [ "test" (var: "local") ]
	parse "test" rule
	var
]
f
Steeve
5-Sep-2010
[5130x2]
Actually, DO is really easy to simulate, both in R2 and R3.
Just construct the rule on the fly.
>> parse [2][(rule: do [1 + 1]) 1 1 rule]
== true
Remember, all new commands in R3 can be emulated in R2.
Micha
5-Sep-2010
[5132]
and what if rule ic create dynamic form file in global words  and 
can not by  create in functions ?
Steeve
5-Sep-2010
[5133]
sorry ?
Anton
5-Sep-2010
[5134x2]
I think he meant to ask "and what if RULE is created dynamically 
(ie. loaded from a file) and thus its words are global, and are not 
 (or cannot be) created by functions?"
Nicolas, your example does not work:
Steeve
5-Sep-2010
[5136]
ah ok thanks for the translation.
BIND and BIND? are the keys
Anton
5-Sep-2010
[5137]
rule: [ "text" (var: "local") ]
var: "global"

f: func [ /local var ] [var: "funclocal" parse "text" rule  return var ]
f ;== "funclocal"
var ;== "local"
Ladislav
5-Sep-2010
[5138]
It is quite hard to decipher what actually Micha meant, I suppose, 
he wanted this?

rule: ["text" (var: "local")]

var: "global"

f: func [/local var] [parse "text" bind rule 'var return var]
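
One thing to keep in mind: plain BIND rebinds the words in the RULE block itself, so the shared rule stays bound to F's context afterwards. A hedged variant of the same idea, assuming you would rather leave the global rule untouched, uses BIND/COPY (a sketch, not a drop-in replacement):

```rebol
rule: ["text" (var: "local")]
var: "global"

f: func [/local var] [
    ; bind a deep copy of RULE to this function's context;
    ; the original RULE block keeps its global binding
    parse "text" bind/copy rule 'var
    var
]

probe f      ; "local"
probe var    ; "global"
```

The cost is one copy of the rule per call, which may or may not matter for large rules.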
Micha
5-Sep-2010
[5139x2]
ok thanx
this code  good works
Anton
5-Sep-2010
[5141x2]
Nicolas, I refer to your first example.

The error is that FUNC binds the words in its body block to its context, 
but this binding does not extend to reaching inside the block referred 
to by the RULE word.

This error might have arisen because of a small typo, parsing "text", 
not "test", which RULE matches.
("arisen" -> "survived")
BrianH
5-Sep-2010
[5143x4]
Fork, I didn't know about the paren-in-a-block form of the DO parameter. 
That is weird. I can't figure out why that form would be supported 
- it wasn't in the proposal.
"Is there a foundational reason for DO not being available for string 
parsing, or is it just not implemented?"

There are a lot of things that you can do in block parsing that you 
can't in string parsing. In this case, the result of DO is compared 
directly as a REBOL value. Strings don't directly contain REBOL values 
the way that blocks do. Even if you tried to limit the result types 
of the expression and trigger an error if they don't match, what 
you are left with isn't useful enough to justify adding it, imo. 
For instance, in your example it was a bad idea to use DO. We'll 
see though.
Micha, there was a direct solution proposed for this in the parse 
proposals, specifically to deal with local variables in recursive 
parse rules. However, it turns out that PARSE isn't really recursive: 
It fakes it. So there was no way to support this feature in a parse 
directive. The best way to do the local variables is to put the PARSE 
call and the rules in a function, and if you have to use recursive 
rules, recursively call that function in an IF (...) operation. It 
really works well, in a roundabout sort of way.
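
A rough sketch of that function-recursion approach, assuming R3 (IF is not a PARSE keyword in R2). The function name NESTED-INTS? and the local ITEM are made up for illustration:

```rebol
; R3 only: uses the IF (...) parse operation
nested-ints?: func [
    "True if DATA holds only integers and blocks of integers, nested to any depth."
    data [block!]
    /local item     ; local to the function, so nested calls do not clobber it
][
    parse data [
        any [
            integer!
            ; recurse by calling the function again, not by recursing the rule
            | set item block! if (nested-ints? item)
        ]
    ]
]

probe nested-ints? [1 [2 [3 4]] 5]    ; should be true
probe nested-ints? [1 [2 "x"]]        ; should be false
```

Both the PARSE call and the rule live inside the function body, so the rule's words are bound to the function's own locals.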
Nicolas, in R3 especially it is better to directly put the rules 
in the function, rather than refer to external rules. The DECODE-URL 
method is really crappy in R3 because it isn't recursion-safe or 
task-safe. (Reminder: We must fix that for R3 when we go over the 
mezzanines for task-safety.)
Maxim
5-Sep-2010
[5147x2]
micha, you can also just push and pop values in a block you use like 
a stack.  you push before setting to a variable, you pop after the 
rule.

you just have to make sure to only push/pop once a complete rule 
is matched.  meaning you handle that in a paren at the END of the 
rule.
there are a few examples floating around, you should find one if you 
google it.
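
A minimal sketch of the push/pop idea in code, assuming R2-style block parsing. The names STACK, RESULT, ITEM and NAME are made up here and not taken from any of the examples mentioned:

```rebol
stack: copy []
result: copy []

item: [
    set name word!                  ; NAME is global: a nested ITEM overwrites it
    (insert/only stack name)        ; push the value before recursing
    [into [any item] | none]        ; optional nested block of items
    (
        name: first stack           ; pop: restore the value for this level
        remove stack
        append result name
    )
]

parse [a [b [c]] d] [any item]
probe result    ; [c b a d]
```

Every push here is followed by exactly one pop because the alternative after the push cannot fail; with rules that can backtrack past the push you need to keep the two in step yourself.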
Anton
6-Sep-2010
[5149]
Remember that Micha's English isn't good. I don't think he can understand 
what you guys are saying without a lot of effort in translation. 
It might be better to try to make your points in code.
Fork
6-Sep-2010
[5150]
@BrianH: If you're string parsing, couldn't it run to-string on the 
result of DO?
Ladislav
8-Sep-2010
[5151]
"The best way to do the local variables is to put the PARSE call and 
the rules in a function, and if you have to use recursive rules, 
recursively call that function in an IF (...) operation. It really 
works well, in a roundabout sort of way."
 - this is too much of a roundabout for most cases, I have to add
BrianH
10-Sep-2010
[5152x2]
Fork, that won't work on some return types, and that would lead to 
runtime errors. But in theory, yes.
Ladislav, true. But since PARSE doesn't really recurse, the only 
direct way to have local variables would be to BIND/copy the parse 
rules for each level of recursion. Doing the function recursion method 
is actually more efficient and easier than that.
Ladislav
11-Sep-2010
[5154]
I guess, that it is the time to propose a reasonable and efficient 
method
Ladislav
13-Sep-2010
[5155]
I defined a USE-RULE function yielding a rule with local variables. 
Now I wonder where to publish it.
Gregg
13-Sep-2010
[5156]
REBOL.org, or perhaps your own site for now? Or as a wiki page linked 
from PARSE docs?
Ladislav
13-Sep-2010
[5157]
http://en.wikibooks.org/w/index.php?title=REBOL_Programming/Language_Features/Parse&stable=0#USE_rule
Gregg
13-Sep-2010
[5158]
Thanks Ladislav. 

What do 'fni and 'fnii stand for?


I would certainly add a comment or doc string that USE-RULE is recursive/thread 
safe, which is why it's not much simpler.
Ladislav
13-Sep-2010
[5159]
FNI is just the "fixed guts" of CONTEXT-FN, put in the CONTEXT-FN 
function body as a function so as not to interfere with the context. 
FNII is the "dynamic guts" of CONTEXT-FN. By assigning different 
functions to the FNII variable we adjust what CONTEXT-FN is actually 
doing.