World: r3wp

[Parse] Discussion of PARSE dialect

Pekr
19-Jul-2006
[1376x2]
I can now simply create a func which accepts a mark name and runs 
a code block accordingly - SQL query, simple replacement of a value, 
whatever (well, it will not work for cases like img tags, so it is 
not as flexible as the full HTML parser in temple, for example, but hey, 
it is meant to be simple)
... but it should not be simpler than that, so I wonder - so far, as you 
can see, mark-x is not finished, so it is ignored. How do I catch this case 
properly and eventually generate an error, send an email, write to a log, 
whatever?
Anton
19-Jul-2006
[1378x2]
I did that recently for COMLib. Look for build-comlib-website.r 
in the files section of the COMLib website.
(my end tags are simpler though)
Pekr
19-Jul-2006
[1380x2]
OK, I will look into it. Is it much more complicated than mine? :-)
this one works better for me:

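parse/all template [
    some [
        thru "<!--["              ; skip past the next opening marker
        copy mark to "]-->"       ; mark name
        "]-->"
        start:                    ; remember where the marked text begins
        copy text to "<!--/["     ; text up to the next closing marker
        end:
        "<!--/["

        ; closing marker names the same mark: use the text;
        ; otherwise report it and resume scanning from start
        [mark "]-->" (print text) | (print ["not found end of: " mark]) :start]
        |
        skip
    ]
]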
Maarten
19-Jul-2006
[1382]
Petr.... you just reinvented erebol, rsp, .... build-markup?
Volker
19-Jul-2006
[1383]
But it does the job, as far as i read.
Pekr
19-Jul-2006
[1384x5]
Maarten - not reinvented - RSP, IIRC, mixes REBOL code with HTML, 
which I don't want to allow.
The template has to be displayable as the gfx man makes it, with temporary 
text, images, whatever .... I am just supposed to get the correct data 
in there ....
Maarten - now looking into build-markup - sorry, it is just a strange 
way of doing things .... no one will place REBOL code into a template, 
that will not work ... btw - is the code 'done (evaluated with DO)? What 
happens if someone uploads a template with their own code? I want 
presentation and code separation.
I looked into RSP some time ago, and I liked it, especially as it 
was complete, with session support etc., but later on I found shlik.org 
was unavailable ...
Maarten - is RSP still available anywhere for download? Neither erebol.com 
nor shlik.org works ....
Chris
19-Jul-2006
[1389]
Petr, I have a copy with some notes here:
http://www.ross-gill.com/techniques/rsp/
Ladislav
31-Aug-2006
[1390]
I think, that Tim Peters knew what he wrote here: http://mail.python.org/pipermail/python-dev/1999-December/001770.html
JaimeVargas
31-Aug-2006
[1391]
Very nice comments. But comparing a parser with a regex is a bit 
unfair ;-)
BrianH
31-Aug-2006
[1392x2]
Yeah, for either :)


Still, Peters seems to think that REBOL parse rule blocks are closures, 
for various reasons that are clear from context.
They aren't, though something like parse closures has been suggested 
during the latest round of enhancement proposals.
Volker
31-Aug-2006
[1394]
They are quite like Smalltalk closures. Without locals, but locals 
are not his point. (He may miss them for recursion.)
BrianH
31-Aug-2006
[1395]
Hey, locals and arguments (practically the same thing in REBOL) are 
the most important difference between closures and plain blocks. 
The difference is significant but Peters' background with Smalltalk 
made him miss it - Smalltalk "blocks" look like REBOL blocks but 
act like functions.
Volker
31-Aug-2006
[1396x4]
No, the main point is easy definition of code and referencing the 
original context. REBOL blocks do that.
You can have closures without any arguments.
The highlights he mentions are: lexically scoped, code and data, 
freely mix computations in.
That scoping is the difference between a closure and doing a "string" 
here.
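A minimal sketch of the scoping point being argued here, assuming REBOL 2. The names ctx and code are made up for illustration. The word x inside the block keeps the binding it got when the object was made, while the same text passed to DO as a string is loaded and bound to the global context:

```rebol
x: 100                  ; global x

ctx: context [          ; hypothetical object, only for illustration
    x: 10               ; x that belongs to the object
    code: [x + 1]       ; this x is bound to the object when it is made
]

print do ctx/code       ; prints 11  (the block's x is the object's x)
print do "x + 1"        ; prints 101 (the string is loaded and bound globally)
```

This fits both readings in the thread: the block itself stores no context, but each word in it carries one, so the block can be evaluated later and still find the original x.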
BrianH
31-Aug-2006
[1400]
REBOL blocks don't reference a context, but they may contain words 
that reference a context. Still, this distinction makes no difference 
to the argument that Peters was making - REBOL text processing is 
more powerful than regex and easier to use. It would be easier to 
replicate REBOL-style parsing in Python using closures and generators 
anyway (Peters' real subject), since that is the closest Python gets 
to Icon-style backtracking.
Volker
31-Aug-2006
[1401x3]
It's not important what references the context, but that a variable 
can find one.
result := a > b
    ifTrue:[ 'greater' ]
    ifFalse:[ 'less' ]
There are two closures here. Rebol could do it the same way.
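For comparison, a rough REBOL 2 equivalent of the Smalltalk snippet, given only as a sketch. The values of a and b are made up. EITHER receives two blocks as ordinary values and evaluates one of them, and both blocks refer to a word defined outside them:

```rebol
a: 3
b: 2

; two blocks passed as plain values; only the chosen one is evaluated
result: either a > b [rejoin [a " is greater"]] [rejoin [a " is less"]]

print result            ; prints: 3 is greater
```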
Ladislav
31-Aug-2006
[1404]
besides, Tim was a REBOL 1.x user
Gabriele
1-Sep-2006
[1405]
didn't parse come with 2.0?
Ladislav
1-Sep-2006
[1406]
it did, so it looks, that Tim is still silently watching REBOL
Gabriele
2-Sep-2006
[1407]
most probably he didn't leave as soon as we think :) i think i remember 
him on the list in 2.0 times too.
Ladislav
2-Sep-2006
[1408]
seems you are right
Oldes
15-Sep-2006
[1409]
Maybe someone will find it useful:

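remove-tags: func [
	"Strip markup tags from HTML, optionally keeping the tags named in allowed-tags"
	html /except allowed-tags /local new x tag name tagchars
][
	new: make string! length? html
	tagchars: charset [#"a" - #"z" #"A" - #"Z"]
	parse/all html [
		any [
			; x = text before the next tag (none when empty), tag = the tag itself
			copy x to {<} copy tag thru {>}  (
				if not none? x [insert tail new x]
				; keep the tag only when /except is used and its name is allowed
				if all [
					except
					parse tag ["<" opt #"/" copy name some tagchars to end]
					find allowed-tags name
				][	insert tail new tag ]
			)
		]
		; append whatever text follows the last tag
		copy x to end (if not none? x [insert tail new x])
	]
	new
]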
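A short usage sketch, assuming the remove-tags function above has already been evaluated in a REBOL 2 console. The sample strings are made up, and allowed-tags is taken to be a block of tag names given as strings:

```rebol
>> remove-tags "<b>Hi</b> <i>there</i>"
== "Hi there"
>> remove-tags/except "<b>Hi</b> <i>there</i>" ["b"]
== "<b>Hi</b> there"
```

With /except, both the opening and the closing form of an allowed tag are kept, because the inner parse rule skips an optional "/" before copying the name.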
Geomol
24-Sep-2006
[1410]
Wouldn't it be nice, if the /case refinement for parse also worked 
with words, when parsing blocks?

>> parse [aBc] ['abc]
== true
>> parse/case [aBc] ['abc]
== true

It should work like when parsing strings, I think!

>> parse "aBc" ["abc"]
== true
>> parse/case "aBc" ["abc"]
== false
JaimeVargas
24-Sep-2006
[1411]
Why? words are not case sensitive by definition.
Geomol
25-Sep-2006
[1412]
I would like the functionality, when parsing things like TeX. There 
the greek letter gamma is called gamma, and the same in capital is 
called Gamma. Now I have to invent the word capgamma or something.
Gabriele
25-Sep-2006
[1413]
>> parse ["Gamma"] ["gamma"]
== true
>> parse/case ["Gamma"] ["gamma"]
== false
Gregg
25-Sep-2006
[1414]
If it were a safe and easy thing to change, I can see some value 
in it as an option but, since words--and REBOL--are case insensitive, 
I'm inclined to live with things as they are, and use string parsing 
if case sensitivity is needed. I think it's Oldes or Rebolek that 
sometimes requests the ability to parse non-loadable strings, using 
percentage values as an example. I think loading percentages would 
be awesome, but then there are other values we might want to load 
as well; where do you draw the line? I'm waiting to see what R3 holds 
with custom datatypes and such.
Oldes
25-Sep-2006
[1415]
Yes, it's me who keeps asking for the possibility to load anything that 
currently throws an invalid datatype error.
Gregg
25-Sep-2006
[1416x2]
And didn't you suggest that values throwing errors could be coerced 
to string! or another type? e.g. add an /any refinement to load, 
and any value in the string that can't be loaded would become a string 
(or maybe you could say you want them to be tags for easy identification).
I'm not sure how custom datatype lexing would work, unless it did 
something similar, calling custom lexers when running up against 
values the standard lexer doesn't understand. I can't remember how 
Gabriele's custom type mezz code works either; need to look at that.
Oldes
25-Sep-2006
[1418x3]
I think, load/next can be used to handle invalid datatypes now:
>> b: {1 2 3 'x' ,}
== "1 2 3 'x' ,"
>> while [v: load/next b not empty? second v][probe v b: v/2]
[1 " 2 3 'x' ,"]
[2 " 3 'x' ,"]
[3 " 'x' ,"]
['x' " ,"]
** Syntax Error: Invalid word -- ,
** Near: (line 1) ,

Just add some handler to convert the invalid datatype to something 
else that is loadable, and then parse as a block.
But such a preloader will slow things down :(
I would like to know whether string-based parsing which would handle all 
current REBOL datatypes can be faster than, or as fast as, block parsing.
Geomol
25-Sep-2006
[1421]
Gabriele, yes it works with strings. But I have words! Thing is, 
I parse the string input from the user and produce words in an internal 
format. Then I parse those words for the final output, which can 
be different formats. I would expect parse/case to be case-sensitive, 
when parsing words, but parse/case is only for strings, therefore 
my suggestion.
Gabriele
25-Sep-2006
[1422]
what i'd suggest is - if case is important, don't make them into 
words :)
Geomol
25-Sep-2006
[1423]
:D But it makes so much sense to work with words.
Gabriele
26-Sep-2006
[1424]
sure, but you can only have 8k of them (unless you make sure they 
never end up in system/words), so if you also counted case...
Maxim
26-Sep-2006
[1425]
another way to counter the word limit is to use #issue datatype.