World: r3wp

[All] except covered in other channels

Dockimbel
10-Jan-2009
[3343]
Right, I misunderstood LOAD/else proposal, I thought you could supply 
your own parsing rules to be used from the point where LOAD fails. 
But I still can't see the point of using LOAD on non-REBOL data, 
instead of string parsing or LOAD/next. LOADing a whole book would 
most probably blow out REBOL words limit. If it all boils down to 
splitting untrusted input into whitespace-separated tokens, why not 
just use PARSE data "" ? Is there a real benefit for the programmer 
of having LOAD not stop on syntax errors?
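For readers unfamiliar with the PARSE-based splitting mentioned in this post, here is a minimal sketch, assuming REBOL 2 semantics, where passing NONE as the rule splits a string on PARSE's default delimiters. The sample text and the word `sample` are made up for illustration. The post's `PARSE data ""` form is a variant of the same call and is not shown here.

```rebol
REBOL [Title: "Whitespace tokenizing sketch (REBOL 2 assumed)"]

; Hypothetical sample text standing in for untrusted, non-REBOL input.
sample: "here there,everywhere; ok"

; With NONE as the rule, PARSE returns a block of strings, split on its
; default delimiters (whitespace, and also commas and semicolons).
tokens: parse sample none

print ["token count:" length? tokens]
foreach token tokens [print token]
```

Nothing is LOADed in this sketch, so no words are created and no syntax error can be raised; the result is only a block of strings.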
btiffin
10-Jan-2009
[3344]
Doc; no not really a benefit to programmers and rebol-by-nature types; 
 The goal is non-rebol types and creating ease of use that attracts 
those types to become rebol-by-nature.  But I've sure wasted a lot 
of good rebol time already, so ... 

Go Doc Go and Do Gabriele Do    Ok ... and Yay Brian!    ;)
BrianH
10-Jan-2009
[3345]
Pekr, I have already proposed the TRANSCODE change, and I can start 
the implementation of LOAD/else asap (which will likely be after 
the release). LOAD/else will work without the TRANSCODE fix, but 
will work better with the fix.
Henrik
28-Mar-2009
[3346]
About VPS: Looks like it could be Linode for me. Slicehost is pretty 
on the surface, but it appears that customers are a bit happier with 
Linode. Also Slicehost was recently purchased by Rackspace, and some 
customers reported a lower service quality after that. That might 
be a bad sign.
Maxim
28-Mar-2009
[3347x2]
doc, it's faster to let rebol do what it does best in binary and let 
us take over on those tokens it can't recognise.  just ignoring commas 
would have allowed me to use scientific data directly more than once, 
without having to dare understand parse intricacies.


remember that parse IS NOT EASY.  I didn't use it for over 6 years, 
cause every time I dared, it just blew up in my face.  this would 
provide a simple entry point for more people to support DSL and leverage 
the rich datatype system in rebol, without the need to be a guru 
level reboler.
the point is not to replace REBOL's syntax, but allow REBOL to cope 
better with the rest of the world's data, IMHO.
Dockimbel
28-Mar-2009
[3349]
IMO, designing a good DSL is far more difficult than writing parse 
rules.


I think that ppl here are underestimating the complexity of implementing 
a jump-over-foreign-data feature that would work in the general case. 
REBOL syntax is not based only on whitespace, but on delimiters 
too: double quotes, curly and square brackets, parentheses, ... 


So it's perfectly valid to not use whitespace in some places, like 
in: "either condition [true][false]". So what should LOAD do in the case 
of, e.g.: "either condition [f,o,r,e,i,g,n,0,1][false]" ? IMHO, 
besides reporting a syntax error at "f," , there's not much point 
returning [either condition "[f,o,r,e,i,g,n,0,1][false]"]. And if 
you think that LOAD could just return [either condition ["f,o,r,e,i,g,n,0,1"][false]], 
then you just jumped over the complexity by creating some syntax 
rules in your mind, but LOAD can't do that. Once LOAD has passed 
the syntax error point, it has no sure way to determine where the 
foreign data ends and where REBOL correct syntax starts again. That 
would require at least, an AI engine (like Gabriele stated above). 
I'm not even sure that all possible cases could be covered that way.


Btw, does such a feature exist in any other programming language? 
I've personally never seen or read about such a feature elsewhere, 
maybe for a good reason.
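The contrast described in this post can be tried directly. This is a minimal sketch that uses only LOAD and TRY; the exact error text depends on the REBOL version, so none is shown.

```rebol
REBOL [Title: "LOAD and delimiters sketch"]

; Valid REBOL source: the brackets act as delimiters, so no whitespace
; is needed between the two blocks.
probe load "either condition [true][false]"

; Same shape with comma-separated foreign data in the first block.
; TRY traps any error so the script keeps running.
either error? try [load "either condition [f,o,r,e,i,g,n,0,1][false]"] [
    print "LOAD stopped with an error"
][
    print "LOAD accepted the input"
]
```

LOAD either returns a complete block or raises an error; it has no mode that returns the values scanned so far together with the unscanned remainder, which is the gap the LOAD/else and load-foreign proposals are trying to fill.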
Steeve
28-Mar-2009
[3350]
load-foreign works well for me in R3.

alpha: charset [#"a" - #"z" #"A" - #"Z"]
chars: union alpha charset [#"0" - #"9" "-_"]
word: [alpha any chars]


probe load-foreign "either condition [f,o,r,e,i,g,n,0,1][false;]" 
[
	;** keep words with commas as strings
	[word some [#"," any chars]]	[keep &]
	;** remove ";" at the end of words
	[word #";"]			[keep to-word head remove back tail &]
]

>> [either condition ["f,o,r,e,i,g,n,0,1"] [false]]
btiffin
28-Mar-2009
[3351x2]
I still vote for foreign!  as a datatype.  Sure there will still 
be reasons for throwing errors on evaluation on foreign! but I don't 
like the fact that the errors occur on load.

Give linguists a tool that out-of-the-box can load a poem and let 
them count characters, words, and transitions.


Let REBOL pros worry about the hassles of evaluation.   Re: where 
the foreign data ends; Skip AI, just go with delimiters.  Whitespace 
or quotes or any of the braces etc.  And if a missing quote causes 
a script to fail ... well bad on the coder for not testing before 
release.  If the English professor gets wonky counts, well let them 
figure it out, but give them the wonky counts not a LOAD failure. 
 IMHO it really would open the door to so many more uses and users. 
 


And with Steeve's example ... build something like that into the 
product.  Don't make poor Professor Keating try and figure out why 
most of the Dead Poet's prose can't be analysed with REBOL's nifty 
cool block data functions.


And as long as REBOL had foreign! as a native datatype, writing a 
LINT would be dirt easy.
And I could LOAD %cobol.cob  and do all kinds of super cool preprocessing. 
 Or at least wait for Sunanda to write a super cool preprocessor 
and live vicariously through rebol of the year candidates.  ;)


REBOL could be the JavaDoc, EpyDoc, Doxygen, ..., ReST besting documentation 
super tool of tomorrow.
Maxim
29-Mar-2009
[3353x2]
doc, the point is that I and others HAVE had many situations where 
this would have saved HOURS, even given new reasons to use rebol at 
some jobs.  the point is not to allow new rebol dialects, it's just 
that in MANY cases it would work.  we all know that this isn't a 
replacement for parse.  but allowing me to import external data through 
load is just faster than parsing.  You still have to tailor the extension 
to the dataset anyways.
We could load things like C code directly, and then parse it using 
block dialecting instead. way faster.
btiffin
29-Mar-2009
[3355]
Ditto Maxim.   Adding ...  I don't look at this as being an overly 
good thing for rebols (although I do think we'd all benefit, save 
time and write a whole new breed of utilities), I think of it as 
a good thing for non-rebols (and us rebols with smaller brains and 
bigger laze).
Pekr
29-Mar-2009
[3356]
ah, so we are at it once again ....
Gabriele
29-Mar-2009
[3357x3]
You guys write it, then we can talk about it.
it's incredible you think that "parse is not easy" and then you don't 
trust people telling you that what you ask for is silly at best.
parse *IS* easy. try to use regexps instead!
Steeve
29-Mar-2009
[3360x3]
ok i give you load-foreign, don't kick me Gabriele :-)
context [
	stack: make block! 10
	foreign-rules: make block! 10
	out: end: value: pos: none
	; open a nested block!/paren!, remembering the current output on the stack
	push: func [type][
		stack: change/only stack out 
		out: make type 1 
	]
	; close the nested series and append it to its parent
	pop:  to-paren [
		stack: back stack 
		out: append/only stack/1 out
	]
	; expose the text matched by a foreign rule as & (a string)
	set-&: [end: (misc/&: to string! copy/part pos end)]
	misc: context [
		&: none
		keep: func [&][append out &]
	]
	blanks: charset " ^-^M"
	rules: [
		  some [blanks | #"^/" (new-line out true)]
		| pos: foreign-rules
		| [#"]" | #")"] (print "missing [ or (" return stack/1)
		| #"[" (push block!) any [#"]" pop break | rules]
		| #"(" (push paren!) any [#")" pop break | rules]
		; otherwise let TRANSCODE scan the next REBOL value
		| (set [value pos] transcode/next/error pos append out value) :pos
	]
	
	set 'load-foreign func [
		text [string!] foreign [block!]
	][
		out: make block! 10
		stack: change/only clear head stack out
		clear foreign-rules
		; turn each [rule code] pair into a parse alternative
		foreach [rule code] foreign [
			append foreign-rules compose [(rule) set-& (to-paren bind code misc) |]
		]
		; final alternative that can never match
		append foreign-rules  [end skip]
		parse/all to-binary text [any rules]
		bind out 'system
		either 1 = length? out [first out][out]
	]
]
Dockimbel
29-Mar-2009
[3363]
Steeve, you're just proving my and Gabriele's point. There's no way 
to implement a LOAD-FOREIGN that would work for any data. You have 
to provide parsing rules that are specific for the input data. Max 
and others are asking for a LOAD-FOREIGN that just works on any input 
data without providing any additional rules.
Ammon
29-Mar-2009
[3364]
and if you have to provide the additional rules you might as well 
parse it from scratch.
Steeve
29-Mar-2009
[3365x5]
i can generate strings by default each time there are errors, 
so that there is no need to provide rules.
It's a tiny modification
But strings are not good types for those; it would be better to simply 
add errors as objects in the loaded stream.

But after that you steel need to reparse the block to convert the 
errors in proper values.
SO, in the end, y
*steal
Ok, in the following version (much simpler) i just add foreign data 
as error objects in the stream.
Then you can reparse the result to manage errors as you want.

context [
	stack: make block! 10
	out: value: pos: none
	push: func [type][
		stack: change/only stack out 
		out: make type 1 
	]
	pop:  to-paren [
		stack: back stack 
		out: append/only stack/1 out
	]
	blanks: charset " ^-^M"
	rules: [
		  some [blanks | #"^/" (new-line out true)]	  
		| [#"]" | #")"] (print "missing [ or (" return stack/1)
		| #"[" (push block!) any [#"]" pop break | rules]
		| #"(" (push paren!) any [#")" pop break | rules]

		| pos: (set [value pos] transcode/next/error pos append out value) :pos
	]
	set 'load-foreign func [
		text [string!] 
	][
		out: make block! 10
		stack: change/only clear head stack out
		parse/all to-binary text [any rules]
		bind out 'system
		either 1 = length? out [first out][out]
	]
]
probe load-foreign {
	either condition [
		f,o,r,e,i,g,n,0,1
	][
		false
	]
} 

>>[
    either condition [
        make error! [
            code: 200
            type: 'Syntax
            id: 'invalid
            arg1: "word"
            arg2: "f,o,r,e,i,g,n,0,1"
            arg3: none
            near: "(line 1) f,o,r,e,i,g,n,0,1"
            where: [transcode parse load-foreign ...]
        ]
    ] [
        false
    ]
]
Dockimbel
29-Mar-2009
[3370]
>> load-foreign "either condition ]f,o,r,e,i,g,n,0,1[]false["
missing [ or (
== none


Not very useful output. Remember that we are talking here about the 
general case where input can be *any* data.
Steeve
29-Mar-2009
[3371x2]
i gave the design, it's simple to enhance
what is good to return in that case ?
Dockimbel
29-Mar-2009
[3373]
It depends on the rules chosen to interpret the string content. That's 
the point, there's no simple way to implement a general case load-foreign 
that would return a block! of mixed REBOL values and "foreign" values 
as string. It's not possible to define a general set of rules because 
"foreign" can be just anything. LOAD just returns an error! when 
the syntax is incorrect and that's the best thing it can do.
Maxim
30-Mar-2009
[3374x3]
doc, never said it should accept anything...
just that I have seen it would work so many times as to not make 
it a useless feature.
it's also a question of speed.  load is blindingly fast.
Dockimbel
30-Mar-2009
[3377]
If you can define a set of rules for your specific needs, then you 
can just use parse or Steeve's load-foreign for that (or ask someone 
to write it for you if you don't want to implement it). Everyone 
can have their own specific needs, I can't see how to define a set 
of rules for "foreign" data that would be or could be useful to most 
users. That would probably end up defining a few additional datatypes 
(or alternative syntax for existing datatypes).
Gabriele
31-Mar-2009
[3378x2]
Max, did you ever consider WHY load is so fast?
What makes you think that it would be so fast if it was able to do 
what you ask?
btiffin
31-Mar-2009
[3380]
Because.  ;)   If REBOL had a foreign! datatype! could the lexer 
not just ... about to throw syntax error, make a foreign! and continue?


I still don't see why this is deemed a bad extension to REBOL.   
Sure it "stinks" to gurus but it opens doors to fishermen, contractors 
and linguists, imho.  And yes it could look like a load of gunk if 
a quote, paren, brace or bracket was misaligned.  So?


For code purity, I'm sure Steeve could spend a few minutes to write 
4 lines to scan a loaded series! and recursively highlight the foreign! 
data, providing a semblance of a lint procedure for those writing 
apps.  My push for foreign! has little to do with apps, and just 
about all to do with data, human real-world data.   For people using 
data for sums, control, etc... just use the lint scan to reject data 
blocks that have ANY foreign! substances??
Izkata
31-Mar-2009
[3381x2]
Why not just load the bad data as a string! ?
should be simple to parse out string!s when a numeric datatype is 
expected...
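If foreign data were loaded as string! values, as suggested in this post, rejecting blocks that contain any of it comes down to a recursive scan. Here is a minimal sketch, assuming the loader has already produced a nested block. `count-strings` is a hypothetical helper, and the sample block is written by hand, modelled on the load-foreign output shown earlier in the thread.

```rebol
REBOL [Title: "Scan a loaded block for string! placeholders (sketch)"]

; Hypothetical helper: counts string! values at any nesting depth.
count-strings: func [blk [block! paren!] /local n][
    n: 0
    foreach item blk [
        if string? :item [n: n + 1]
        ; recurse into nested blocks and parens
        if any [block? :item paren? :item] [n: n + count-strings :item]
    ]
    n
]

; Hand-written stand-in for what a lenient loader might return.
loaded: [either condition ["f,o,r,e,i,g,n,0,1"] [false]]

either zero? count-strings loaded [
    print "clean: no foreign data"
][
    print "rejected: block contains foreign data"
]
```

A genuine string! in the source would be counted too, which is one argument for a distinct marker such as the error! values or the proposed foreign! type discussed in the thread.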
Anton
31-Mar-2009
[3383]
Are you guys listening?
Steeve
31-Mar-2009
[3384]
yep
Anton
31-Mar-2009
[3385]
I'm just a bit amazed at the lack of comprehension.  I agree with 
DocKimbel and I advise to think on his words longer.
Steeve
31-Mar-2009
[3386]
what !?
Pekr
31-Mar-2009
[3387]
I have never met anything I could not handle. Gee, we are complaining 
about the REBOL parser not being able to handle ANY format, while even 
lamers like me are able to use 'parse, as opposed to regexp? 
Why this obsession? What is the exact deal breaker? Isn't it a bit naive? 
Brian, really - what fishermen are you talking about? For anything 
more complicated than a one-liner, you have to come up with a script 
anyway. And if you store code in a script, you can write a few parse 
rules, no? Well, maybe it is me, who never uses load (I don't like 
the fact that it mysteriously does decoding of a few things here or there 
- libraries, jpg, which is fixed in R3), but with REBOL string parsing, 
you can do many things. I don't even agree with Max's opinion that parse 
will let you down. Well, if you want to parse streamed mp3 binary 
content, maybe so, but for some general data format, specifically 
delimited? Come on :-)
Steeve
31-Mar-2009
[3388x2]
Parse is my beloved function in Rebol, i use it all the tuime.
*time
Pekr
31-Mar-2009
[3390]
btw - there is now a new blog post - encoders/decoders. We had 
better take care to get those things done right. You can write your decode-anything 
codec to input your mysterious data :-)
Steeve
31-Mar-2009
[3391x2]
eh ?
(i don't understand)