World: r3wp

[Parse] Discussion of PARSE dialect

Brock
24-Nov-2008
[3263x2]
I've been hoping for these publicly for years... but you won't get 
them.  Competitive advantage, and you have to respect that.  Although, 
if the companies were left nameless and so was the author, then maybe 
he could at least outline the concept that was involved or the problems 
that were overcome.
The impact wouldn't be as big, but it would definitely help show 
the industrial strength of Rebol.
amacleod
24-Nov-2008
[3265]
100% rebol platform? What are we talking about here?
eFishAnt
24-Nov-2008
[3266x2]
I have another month or so before deployment.  Still lots of busy 
work, mostly more parse rules.  I probably shouldn't say much yet, 
but I was excited after a weekend of breakthroughs, and anyway, I 
wanted people to know I am even more heavily invested in REBOL than 
ever.
The {^}^{^}^{^}} problem went away by thinking about it differently, 
so now Javascript doesn't bite me any more...
Steeve
24-Nov-2008
[3268x2]
some hints ?
or u just arousing us
eFishAnt
24-Nov-2008
[3270]
a month or few and I should be ready to say more.  I just got too 
excited...;-)
Steeve
24-Nov-2008
[3271]
shame on u ;-)
Davide
3-Dec-2008
[3272]
How can I parse into a paren! ?
I've tried: 
parse [(a b)] [paren! into [word! word!]]

but it doesn't work.
Maarten
3-Dec-2008
[3273x3]
>> data: [( a b )]
== [(a b)]
>> parse data [ into [set w word! set 
v word!] (print [ v w])]
b a
== true
Silly formatting on macbook paste, but you get the idea. Leave the 
paren! match out of your rule,
i.e. rule: [ into [ set v word! set w word!] (print [ v w])]
Davide
3-Dec-2008
[3276x2]
Ok, the paren! match consumes the input, but how can I parse only a 
paren! ?
Your rule works for both (a b) and [a b]:

>> parse [[a b]] rule
a b
== true
>> parse [(a b)] rule
a b
== true
...or better, do you know where I can find a simple algebra parse 
script, which can compute simple expressions like [1 + (2 / 3)] ?
I'm learning how to parse, and such an example would be illuminating.
Dockimbel
3-Dec-2008
[3278x2]
>> rule: [mark: paren! :mark into [ set v word! set w word!] (print [ v w])]
>> parse [[a b]] rule
== false
>> parse [(a b)] rule
a b
== true
There's a nice example here : http://www.rebol.com/docs/core23/rebolcore-15.html#section-6, 
but it requires a string! value as input. You could adapt it to use 
a block! value as input.
Oldes
3-Dec-2008
[3280x2]
Try to play with this: http://oldes.multimedia.cz/rebol/expr-test.r
or:
http://oldes.multimedia.cz/rebol/expr-test2.r
http://oldes.multimedia.cz/rebol/expr-test3.r
Davide
3-Dec-2008
[3282x3]
thanks ! I'm going to study it
ok this is my first attempt for a basic expression parser

stack: copy []
push: func [x] [insert stack x]
pop: does [take stack]
term: [
	pos: paren! :pos into [expr] (push rejoin ["(" pop ")"]) |
	set t string! (push mold t) |
	set sign opt ['+ | '-] set t number! (if none? sign [sign: '+] push mold (t * to-integer join sign "1")) |
	set gp path! (push rejoin [{$('#} first gp {').attr('} second gp {')}])
]
oper: compose ['+ | '- | '* | (to-lit-word "/")]
operrule: [set o oper (push o)]

expr: [term any [operrule term (t2: pop o: pop t1: pop push rejoin [t1 o t2])]]


I'm sure it's neither optimized nor correct, so any correction is welcome.
It can parse expressions like [- 1 * (a/b + 2)], and it emits a 
JavaScript-compatible string.
(It is part of a Comet server written in REBOL.)
Oldes
4-Dec-2008
[3285x3]
Instead of:

set sign opt ['+ | '-] set t number! (if none? sign [sign: '+] push mold (t * to-integer join sign "1")) |
I would use something like:
	set t number! (push mold t) |
	'- set t number! (push mold negate t) |

(This is not perfect either, because the above will not work for 
expressions like [- (1 + 1)].)
Also, I'm not sure whether the mold is needed.
Brock
4-Dec-2008
[3288x5]
I am having some parse difficulties.  I have the below code...
records: read/lines http://www.geocities.com/[kalef-:-rogers-:-com]/samples.txt
lang-rule: [
	thru "language%3a" copy language to "+%2b" |
	thru "language%3a" copy language to "&resultview" |
	thru "language%3a" copy language to "&"
	;thru "language="   copy language to "&" |
]
rec: 1
foreach record records[
	language: copy ""
	parse record [lang-rule]
	print [rec tab language]
	rec: rec + 1
]
The problem is in the parse rule lang-rule.  It seems that only two of 
the three alternatives work at any one time with the data I have 
loaded (note that the last line of the rule is commented out).
Even if I change the order of the alternatives, placing the bar '|' 
as appropriate, I can't get all three of them to work together.
I expect the output to be either the text 'en' or 'fr' (for English 
or French) for each record; however, for records 52 and 66 the copy 
does not end where it should with the current setup.  If you change 
the order of the alternatives, then other records do not work as 
expected.  Any ideas?
Oldes
4-Dec-2008
[3293]
lang-rule: [
	thru "language%3a" copy language ["en" | "fr"] to end
]
Chris
4-Dec-2008
[3294]
You could dehex first?  That would make things more consistent...
Brock
4-Dec-2008
[3295x2]
thanks for the suggestions.  I never thought of using the actual 
text I was expecting as an option like you suggested Oldes.
Okay Oldes, your solution works, but why does my code fail, any ideas? 
 I have other scenarios that follow this same sort of structure but 
do not have a simple two word expected result.  I've been able to 
handle these so far, simply by changing the order and moving the 
'stop' word of the Ampersand to the bottom of the rule options.  
[I'm trying Chris's option now]
Davide
4-Dec-2008
[3297]
I'm here again ! 

Is it possible to write a rule that matches any word! except those 
that are in a block ?
example:

list: ['for 'while 'show]
parse block [any [word! *NOT IN* list]]

If there's no direct approach, is there a workaround ?
Oldes
4-Dec-2008
[3298x6]
Brock: it's because in rows 52 and 66 you find "+%2b". It's not next 
to the lang value you want, but it's there, so the first rule succeeds.
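The effect described above can be reproduced with a made-up record. What follows is only a sketch for REBOL 2: the record string and the names value-chars and near-rule are hypothetical, not taken from Brock's data. The first part shows TO scanning ahead to a "+%2b" that appears later in the line; the second shows one possible way to stop at whichever delimiter comes first, assuming the value never contains "&" or "+".

```rebol
REBOL []

; Hypothetical record: the language value is followed by "&resultview",
; but "+%2b" also occurs further along in the query string.
record: "q=1&language%3aen&resultview=list&terms=a+%2bb"

; The original alternatives. The first one succeeds, because TO scans
; ahead to the first "+%2b" anywhere in the remaining input, so the
; other two alternatives are never tried.
lang-rule: [
    thru "language%3a" copy language to "+%2b"
    | thru "language%3a" copy language to "&resultview"
    | thru "language%3a" copy language to "&"
]
parse/all record lang-rule
print mold language   ; the copied text runs past "en", up to "+%2b"

; One possible alternative (an assumption, not the thread's solution):
; copy only characters that are not delimiters, so the copy stops at
; whichever delimiter comes first.
value-chars: complement charset "&+"
near-rule: [thru "language%3a" copy language some value-chars to end]
parse/all record near-rule
print mold language
```

Reordering the alternatives cannot fix the original rule in general, since each TO target may appear somewhere later in the line regardless of which delimiter actually ends the value.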
You can use this for any lang-id with 2 chars:
lang-chars: charset [#"a" - #"z"]
lang-rule: [
	thru "language%3a" copy language 2 lang-chars to end
]
I would not use dehex, as with dehex you parse the data twice, so it 
must be slower.
You can use this if the data may be either URL-encoded or not 
encoded:
lang-chars: charset [#"a" - #"z"]
lang-rule: [
	thru "language" ["%3a" | "="] copy language 2 lang-chars to end
]
btw. you can also  use:
parse "language=xx" lang-rule
Davide.. in R2:
list: ['for | 'while | 'show]
parse [for each while end] [any [list | set w word! (probe w)]]
In R3 it should be better I think.
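A different R2 workaround for "any word except the listed ones" is to check the word in a paren and then use a rule chosen at run time. This is a minimal sketch, not Oldes's method: the names keywords, non-keyword and check are hypothetical. It relies on [end skip] being a rule that can never succeed and on an empty block being a rule that always succeeds.

```rebol
REBOL []

; Hypothetical list of words to reject (plain words, not lit-words).
keywords: [for while show]

; Match one word!, then fail if it is in the list.
; [end skip] can never match; [] always matches.
non-keyword: [
    set w word!
    (check: either find keywords w [[end skip]] [[]])
    check
]

print parse [foo bar] [some non-keyword]     ; neither word is in the list
print parse [foo while] [some non-keyword]   ; 'while is rejected
```

Because the test is done in the paren, the rejected words stay in an ordinary block that can be changed without touching the rule itself.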
Chris
4-Dec-2008
[3304]
dehex, agreed -- until the speed trade off becomes worth it.  Which 
may be if you are wanting to get more than just the language from 
the string.
Brock
4-Dec-2008
[3305x2]
Oldes, thanks for ripping into that with so many options.  To your 
first response - of course, I didn't notice that.  It's not obvious 
to me what your last response does; I'll need to look at that more.
Dehex isn't an issue for me really.  I am only taking  a very small 
percentage of records.  So in the big picture, it's not a significant 
slow-down.  The process this is attached to runs daily on a group 
of text files totalling less than 10 MB in size.
Jerry
6-Dec-2008
[3307]
I was parsing something. I got this:
 ** Script Error: Internal limit reached
I still don't know what's wrong. Anybody?
sqlab
6-Dec-2008
[3308]
Too many recursions.
Maybe the rule is too complex or you get an infinite loop.
Just show your rule and the problem.
For sure someone will help.
Davide
6-Dec-2008
[3309]
Thanks Oldes, I've tried your hint, but it made my parse rule too 
complex, so I've used an additional word as discriminant.
Hope that R3 will improve this.
Jerry
7-Dec-2008
[3310]
sqlab, you are right. There is an infinite loop. I am fixing it. 
It's a C++ parser, so the rule is very complicated.
Maxim
24-Dec-2008
[3311x2]
paul asked: "Question for you regarding parse.  How do you force 
parse to return false immediately without processing the rest of 
the string or block once it evaluates a paren?"
example: 


parse/all s [some ["12345" here: (print "*" here: tail here) :here skip]]


Basically, in the paren, you assign the tail of the data being parsed 
to a word and force parse to move to it, then try going beyond it.  The 
skip makes it return false; otherwise it returns true.