World: r3wp

[!REBOL3 Proposals] For discussion of feature proposals

Andreas
11-Nov-2010
[276x2]
No problem with that.
Only that parse does not have any bearing on BREAK for loops.
BrianH
11-Nov-2010
[278x2]
And I can use those same proofs to apply to other algorithms with 
similar characteristics, and *know* that you gain some abilities 
with definitional scope, and lose others. This is why I know that 
Ladislav's DO-ALL is a loop, and so not wanting BREAK to apply to 
it is more of an opinion than something inherent in its nature. But 
that doesn't mean that the need for that is less.
And yes, I would be satisfied with THROW being dynamic and the rest 
not. But my *bare minimum* requirement for accepting that is to fix 
THROW so it actually works properly, and in many ways it doesn't 
at the moment (all with tickets), and in one way it could work better 
(also with a ticket).
Maxim
11-Nov-2010
[280x2]
I've just reread a few things and I now truly understand the deeper 
intricacies of this whole discussion... (finally ;-)
I think that the terms dynamic and definitional aren't making comprehension 
easy, especially dynamic.
BrianH
11-Nov-2010
[282x3]
(phone call)
That is why I prefer lexical instead of definitional. Definitional 
is lexical + faked lexical.
(as terms, not as concepts)
Maxim
11-Nov-2010
[285x4]
also I think that the word unwind from the error document should 
be used, since that really is what happens afaict.
used in discussion I mean.
I also prefer lexical, though definitional is more *precise*... ironically, 
I didn't understand dynamic return until I grasped what definitional 
return really was.
in my mind they are *both* dynamic returns... you aren't just falling 
off the function's end, removing from the stack.  

you are *causing* that to happen.

the difference is in the how far they unwind...
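
The point about "how far they unwind" can be sketched outside REBOL. The following is a minimal Python model, not REBOL code and not how R3 implements anything: exceptions stand in for unwinds, and the names `Unwind`, `catch_dynamic` and `catch_definitional` are hypothetical. It assumes that "dynamic" means the nearest catcher on the call stack stops the unwind, while "definitional" means the escape is bound to one specific catcher and passes through any others.

```python
class Unwind(Exception):
    """Carries a value and, optionally, the catcher it is aimed at."""
    def __init__(self, value, target=None):
        super().__init__(value)
        self.value = value
        self.target = target


def catch_dynamic(body):
    """Dynamic style: the nearest enclosing catcher on the call stack wins."""
    try:
        return body()
    except Unwind as u:
        return u.value


def catch_definitional(body):
    """Definitional style: the body receives an escape bound to this catcher."""
    token = object()  # identifies this particular catcher

    def escape(value):
        raise Unwind(value, target=token)

    try:
        return body(escape)
    except Unwind as u:
        if u.target is not token:
            raise  # aimed at some other catcher, keep unwinding
        return u.value


def dynamic_demo():
    def inner():
        raise Unwind("from inner")
    # The inner catcher stops the unwind, so the outer body continues.
    return catch_dynamic(lambda: "outer saw " + catch_dynamic(inner))


def definitional_demo():
    def outer_body(escape_outer):
        # The inner catcher does not intercept an escape bound to the outer one.
        inner = catch_definitional(lambda escape_inner: escape_outer("from inner"))
        return "outer saw " + inner
    return catch_definitional(outer_body)
```

In both demos the stack is unwound on purpose; the only difference in this model is which catcher the unwind stops at.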
BrianH
11-Nov-2010
[289]
Please don't take my mentioning of downsides as being a statement 
of opinion or some kind of taking sides. I only mention them because 
they are real, and must be considered when picking a certain strategy. 
Both approaches have plusses and minuses. If you want to make a rational 
choice then you need to know the issues - otherwise you are just 
being a fanboy.


For instance, I picked the definitional side for returns, without 
the need for a fallback to dynamic, because of a rational evaluation 
of the algorithmic style of R3's functions. And I only did so once I 
remembered that the tasking issues had already removed the advantages 
that dynamic scoping has over lexical scoping - we just can't do 
that stuff as much anymore, so it doesn't matter if we don't try. 
The same goes for loops, but to a lesser extent - loops aren't affected 
as much by tasking issues so we can still do code that would benefit 
from dynamic breaks, but it still might be a worthy tradeoff to avoid 
needing an option (since we have no such option). But for THROW, 
especially THROW/name, there are things that you can do with dynamic 
throw that you *can't* do with definitional, and those things would 
have great value, so it's a rational choice to make the tradeoff 
in favor of dynamic.
Maxim
11-Nov-2010
[290]
can you explain to me how dynamic throw wouldn't be affected by tasking 
when dynamic return would?
BrianH
11-Nov-2010
[291]
Well it comes down to this: Functions are defined lexically. Though 
they are called dynamically, they aren't called until after they 
have already been bound, definitionally. But as a side effect of 
tasking, those bindings are stack-relative, and those stacks are 
task-local. But random blocks of code outside of functions are bound 
to object contexts, and those are *not* task-local. So that means 
that the old R2 practice of calling shared blocks of code is a really 
bad idea in R3 if any words are modified, unless there is some kind 
of locking or synchronization. This means that those blocks need 
to be moved into functions if their code is meant to be sharable, 
which means that at least as far as RETURN and EXIT are concerned, 
they can be considered lexically scoped. The advantage that we would 
get from being able to call a shared block of code and explicitly 
return in that block is moot, because we can't really do that much 
anymore. This means that we don't lose anything by switching to definitional 
code that we haven't already lost for other reasons. At least as 
far as functions are concerned, all task-safe code is definitional.


Loops are also defined lexically, more or less, and the rebinding 
ones are also task-safe because they are BIND/copy'd to a selfless 
object context that is only used for that one call and thrown away 
afterwards. And most calls to loops are task-safe anyways because 
they are contained in functions. However, the LOOP, FORALL, FORSKIP 
and WHILE loops do not rebind at the moment. We actually prefer to 
use those particular loops sometimes in R3 code because they can 
be more efficient than *EACH and REPEAT, because they don't have 
that BIND/copy overhead. Other times we prefer to use *EACH or REPEAT, 
in case a particular loop fits better, or has high-enough repetitions 
and enough word references that the 27% overhead for stack-local 
word reference is enough to be more than the once-per-loop BIND/copy 
overhead. Since you don't have to move blocks into *loops* to make 
them task-safe, you can use blocks referred to by word to hold code 
that would be shared between different bits of code in the same function. 
This is called manual common subexpression elimination (CSE), and 
is a common optimization trick in advanced REBOL code, because we 
have to hand-optimize REBOL using tricks that the compiler would 
do for us if we were using a compiled language. Also, PARSE rules 
are often called from loops, and they are frequently (and in specific 
cases necessarily) referred to by word instead of lexically nested; 
most of the time these rules can be quite large, maximizing BIND/copy 
overhead, so you definitely don't want to put the extensive ones 
in a FOREACH or a closure.

Switching to definitional break would have three real downsides:

* Every loop would need to BIND/copy, every time the loop is called, 
including the loops that we were explicitly using because they *don't* 
BIND/copy.

* Code that is not nested in the main loop block would not be able 
to break from that loop. And code that is nested in the main loop 
would BIND/copy.

* We can in theory catch unwinds, run some recovery code, and send 
them on their way (hopefully only in native code, see #1521). Definitional 
escapes might be hard or impossible to catch in this way, depending 
on how they are implemented, and that would mean that you couldn't 
recover from breaks anymore.


The upside to definitional break would be that you could skip past 
a loop or two if you wanted to, something you currently can't do. 
Another way to accomplish that would be to add /name options to all 
the loop functions, and that wouldn't have the BIND/copy overhead. 
Or to use THROW or THROW/name.


The situation with THROW is similar to that of the non-binding loops, 
but more so, still task-safe because of functions. But CATCH and 
THROW are typically the most useful in two scenarios:

* Escaping through a lot of levels that would catch dynamic breaks 
or returns.

* Premade custom escape functions that might need to enforce specific 
semantics.


Both of these uses can cause a great deal of difficulty if we switched 
to definitional throw. In the first case, the code is often either 
broken into different functions (and thus not nested), or all dumped 
into a very large set of nested code that we wouldn't want to BIND/copy. 
Remember, the more levels we want to throw past, the more code that 
goes into implementing those levels. In the second case definitional 
throw would usually not work at all because the CATCH and the THROW 
would be contained in different functions, and the code that calls the 
function wrapping the THROW would not be nested inside the CATCH. 
So you would either need to rebind every bit of code that called 
the THROW, or the definitional THROW would need to be passed to the 
code that wants to call it like a continuation (similar concept). 
Either way would be really awkward.


On the plus side of dynamic (whatever), at least it's easy to catch 
an unwind for debugging, testing or recovery purposes. For that matter, 
the main advantage of using THROW/name as the basic operation that 
developers can use to make custom dynamic escape functions is that 
we can build in a standard way to catch it and that will work for 
every custom escape that is built with it. The end to the arms race 
of break-through and catch.
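
The second scenario above (premade custom escape functions, with the CATCH and the THROW living in different functions) can be modelled in a few lines. This is a hedged Python sketch, not REBOL: `Throw`, `catch_name`, `abort`, `worker`, `run` and `traced` are all hypothetical names, and Python exceptions merely stand in for a dynamic THROW/name.

```python
class Throw(Exception):
    """Stand-in for a named dynamic throw."""
    def __init__(self, name, value):
        super().__init__(name)
        self.name = name
        self.value = value


def catch_name(name, body):
    """Catch only throws carrying the given name; let others pass."""
    try:
        return body()
    except Throw as t:
        if t.name != name:
            raise
        return t.value


def abort(value):
    """A premade escape function; it knows nothing about who will catch it."""
    raise Throw("abort", value)


def worker(items):
    """Calls the escape function without being nested inside any catcher."""
    total = 0
    for item in items:
        if item < 0:
            abort(("negative", item))
        total += item
    return total


def run(items):
    # The catcher lives in a different function from the throw.
    return catch_name("abort", lambda: worker(items))


trace = []


def traced(body):
    """Intercept any unwind for debugging, record it, and send it on its way."""
    try:
        return body()
    except Throw as t:
        trace.append(t.name)
        raise
```

Here `run([1, -2, 3])` escapes from `worker` through `abort` even though neither is written inside the catcher, which is the case a purely definitional throw would have to handle by rebinding or by passing the escape along explicitly. `traced` shows the one generic interception point that works for every escape built on the same primitive.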
Maxim
11-Nov-2010
[292]
wow... I was wondering why it took you so long to reply  ;-)
BrianH
11-Nov-2010
[293x3]
Yeah :) And I had another phone call.
#1521 is a critical issue btw. We use that facility in DO, for instance.
Yes, that 27% is a measured number, not made up. Not measured recently, 
but there haven't been any changes in R3 since then that would affect 
it.
Maxim
11-Nov-2010
[296]
thanks, all makes sense.
BrianH
11-Nov-2010
[297x2]
That is why I am in favor of definitional return, extremely skeptical 
of definitional break, and definitely opposed to definitional throw.
Wouldn't it be great if it was just a matter of opinion?
Maxim
11-Nov-2010
[299]
one thing I don't clearly understand in the above....


* Code that is not nested in the main loop block would not be able 
to break from that loop
BrianH
11-Nov-2010
[300x2]
Definitional (whatever) depends on a BIND to do its work, deep, usually 
BIND/copy. And that only works on words that are physically in the 
blocks that you bind, or in blocks that are nested in those blocks, 
etc. Another block that is outside the block you are binding and 
referred to by name won't be bound. That is the limit of the definitional 
approach.
Note that the definitional BIND that functions do when created is 
*not* a BIND/copy, it modifies. Same thing with closures when they 
are created, though they also do something like a BIND/copy every 
time they are called.
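
The limit described above (a deep bind only reaches words physically inside the block or its nested blocks) can be illustrated with a toy model. This is a Python sketch under stated assumptions, not REBOL's actual BIND: blocks are modelled as nested lists of strings, `bind_copy` is a hypothetical helper, and the names `shared` and `loop_body` are made up for the example.

```python
def bind_copy(block, word, binding):
    """Return a deep copy of `block` in which every occurrence of `word`
    is replaced by a (word, binding) pair. Only nested lists are visited;
    a block that is merely referred to by name is never reached."""
    out = []
    for item in block:
        if isinstance(item, list):
            out.append(bind_copy(item, word, binding))
        elif item == word:
            out.append((word, binding))
        else:
            out.append(item)
    return out


# A block held elsewhere and referred to by the name "shared".
shared = ["if", "done?", ["break"]]

# The loop body contains one nested "break" and one reference to "shared".
loop_body = ["print", "x", ["break"], "do", "shared"]

bound = bind_copy(loop_body, "break", "loop-1")
```

After the call, the nested `break` in `bound` carries the `"loop-1"` binding, while the `break` inside `shared` is untouched, because `bound` only contains the name `"shared"` and not the block itself. The original `loop_body` is also left unmodified, which is the copy half of the BIND/copy cost discussed earlier.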
Maxim
11-Nov-2010
[302]
oh yes, definitional * is complicated with references... hadn't realized 
that.
BrianH
11-Nov-2010
[303x3]
Ladislav, your definitional throw in the "Definitional CATCH/THROW 
mezzanine pair" section of your page isn't recursion-safe, because 
MAKE function! does a modifying BIND rather than a non-modifying 
BIND/copy. Otherwise, nice work :)
It's not task-safe either, but recursion-safety is more of an issue 
for now.
That's the one under "The state of R2 (dynamic return with optional 
transparency)".
Maxim
11-Nov-2010
[306]
btw, for throw/catch, I agree 100%, even now that I fully understand 
the topic. 
   

If we lost dynamic throws, trying to make it work *as* a dynamic 
system is not pragmatic and AFAIK prone to many strange problems, 
especially if we try to create our own code patterns and would need 
to decipher cryptic mezzanine code which does some magic.


the way I see it, definitional throw/catch really doesn't scale well 
and doesn't work especially well in collaborative programming if 
there are multiple trap points with different catch setups.


I can see ways this can be a problem, especially when code IS NOT 
bound on purpose like in liquid which uses a class-based model, *specifically* 
because it allows me to scale a system by at least three orders of 
magnitude. 


liquid builds nodes on the fly and generally re-organizes processing 
on the fly.  one system might be building the setup, while another, 
later will execute it.  with definitional throw, this is impossible 
to make work.
BrianH
11-Nov-2010
[307x2]
Definitional * has one advantage over dynamic: You can see it in 
the source. When the program runs the scope is actually dynamic, 
but you have to use your imagination or a debugger to see it.
Not only one advantage, but that is the most significant advantage 
for most programmers.
Ladislav
12-Nov-2010
[309]
Brian wrote: "BREAK also applies to PARSE, which relies on dynamic 
scope (and yes, I can point you to mathematical proofs of this, though 
there's no point)" - I *must* correct this! Parse break is:

- neither dynamic
- nor definitional

it is a third kind:

parse break is lexical

Here is why:


1) It is stated in the documentation, that "parse break is a keyword", 
i.e. it is lexically defined to be a keyword of the dialect

2) it is stated in the documentation, that it "breaks out from the 
nearest loop", which is true, but in the lexical sense again
Pekr
12-Nov-2010
[310]
So very novice question - parse [break] is not the same as parse 
[(break)] internally? :-)
BrianH
12-Nov-2010
[311x3]
Sorry, I meant PARSE [(break)]. BREAK in parens is completely different 
than BREAK in the rules :)
So the correction was not to the statement that I made, it was to 
which BREAK I was referring.
Nonetheless, for PARSE's BREAK operation, "breaks out from the nearest 
loop" means in dynamic scope, not lexical.
Ladislav
12-Nov-2010
[314x3]
Any case you find?
(you cannot)
parse is keyword-based
BrianH
12-Nov-2010
[317]
Yes, but lexical scope has nothing to do with lexical keywords.
Maxim
12-Nov-2010
[318]
AFAICT it's not lexical since it will properly return to any rule 
which uses a referenced sub rule via a word as well as a sub-block
Ladislav
12-Nov-2010
[319x3]
Not to mention, that (break) is not a parse construct, it is actually 
foreign to parse
No example whatsoever are you able to find
Any non-lexical behaviour?
BrianH
12-Nov-2010
[322x2]
Except (break) and (break/return) were explicitly added to what PARSE 
supports. It was one of the better Parse Proposals, and it was accepted 
and implemented.
Lexical scope means nested blocks. The blocks don't have to be nested, 
as Maxim said.
Ladislav
12-Nov-2010
[324x2]
The argument, that for parse (break) has to be dynamic does not hold 
any water. Why?
As already mentioned, it is foreign to parse