r3wp [groups: 83 posts: 189283]

World: r3wp

[!REBOL3 Proposals] For discussion of feature proposals

Maxim
11-Nov-2010
[290]
can you explain to me how dynamic throw wouldn't be affected by tasking 
when dynamic return would?
BrianH
11-Nov-2010
[291]
Well it comes down to this: Functions are defined lexically. Though 
they are called dynamically, they aren't called until after they 
have already been bound, definitionally. But as a side effect of 
tasking, those bindings are stack-relative, and those stacks are 
task-local. But random blocks of code outside of functions are bound 
to object contexts, and those are *not* task-local. So that means 
that the old R2 practice of calling shared blocks of code is a really 
bad idea in R3 if any words are modified, unless there is some kind 
of locking or synchronization. This means that those blocks need 
to be moved into functions if their code is meant to be sharable, 
which means that at least as far as RETURN and EXIT are concerned, 
they can be considered lexically scoped. The advantage that we would 
get from being able to call a shared block of code and explicitly 
return in that block is moot, because we can't really do that much 
anymore. This means that we don't lose anything by switching to definitional 
code that we haven't already lost for other reasons. At least as 
far as functions are concerned, all task-safe code is definitional.


Loops are also defined lexically, more or less, and the rebinding 
ones are also task-safe because they are BIND/copy'd to a selfless 
object context that is only used for that one call and thrown away 
afterwards. And most calls to loops are task-safe anyways because 
they are contained in functions. However, the LOOP, FORALL, FORSKIP 
and WHILE loops do not rebind at the moment. We actually prefer to 
use those particular loops sometimes in R3 code because they can 
be more efficient than *EACH and REPEAT, because they don't have 
that BIND/copy overhead. Other times we prefer to use *EACH or REPEAT, 
in case a particular loop fits better, or has high-enough repetitions 
and enough word references that the 27% overhead for stack-local 
word reference is enough to be more than the once-per-loop BIND/copy 
overhead. Since you don't have to move blocks into *loops* to make 
them task-safe, you can use blocks referred to by word to hold code 
that would be shared between different bits of code in the same function. 
This is called manual common subexpression elimination (CSE), and 
is a common optimization trick in advanced REBOL code, because we 
have to hand-optimize REBOL using tricks that the compiler would 
do for us if we were using a compiled language. Also, PARSE rules 
are often called from loops, and they are frequently (and in specific 
cases necessarily) referred to by word instead of lexically nested; 
most of the time these rules can be quite large, maximizing BIND/copy 
overhead, so you definitely don't want to put the extensive ones 
in a FOREACH or a closure.
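
A minimal sketch of the manual CSE trick described above, assuming a hypothetical function WEIGHTED-SUM (the name and the arithmetic are illustrative only, not from the discussion). The shared block is written once, referred to by word, and evaluated from two branches of the same function:

```rebol
REBOL []

; Hypothetical example: SHARED holds code that both branches need.
; It is nested in the function body, so A and B inside it are bound
; to the function when the function is created.
weighted-sum: func [a b /local shared] [
    shared: [(a * a) + (b * b)]
    either a > b [
        2 * (do shared)
    ][
        3 * (do shared)
    ]
]

print weighted-sum 3 4   ; 75
print weighted-sum 4 3   ; 50
```

Because the block lives inside the function, no extra BIND/copy is needed to share it between the two branches.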

Switching to definitional break would have three real downsides:

* Every loop would need to BIND/copy, every time the loop is called, 
including the loops that we were explicitly using because they *don't* 
BIND/copy.

* Code that is not nested in the main loop block would not be able 
to break from that loop. And code that is nested in the main loop 
would BIND/copy.

* We can in theory catch unwinds, run some recovery code, and send 
them on their way (hopefully only in native code, see #1521). Definitional 
escapes might be hard or impossible to catch in this way, depending 
on how they are implemented, and that would mean that you couldn't 
recover from breaks anymore.


The upside to definitional break would be that you could skip past 
a loop or two if you wanted to, something you currently can't do. 
Another way to accomplish that would be to add /name options to all 
the loop functions, and that wouldn't have the BIND/copy overhead. 
Or to use THROW or THROW/name.
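
As a rough illustration of the last option, here is a sketch of escaping two nested loops at once with CATCH/name and THROW/name. The name 'found and the sample data are assumptions made for the example:

```rebol
REBOL []

; Search nested blocks and leave both loops as soon as the value is found.
result: catch/name [
    foreach row [[1 2] [3 4] [5 6]] [
        foreach cell row [
            if cell = 4 [throw/name cell 'found]
        ]
    ]
    none   ; reached only if nothing was thrown
] 'found

print result   ; 4
```

A plain BREAK in the inner loop would only leave the inner FOREACH; the named throw unwinds past both loops to the matching CATCH.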


The situation with THROW is similar to that of the non-binding loops, 
but more so, still task-safe because of functions. But CATCH and 
THROW are typically the most useful in two scenarios:

* Escaping through a lot of levels that would catch dynamic breaks 
or returns.

* Premade custom escape functions that might need to enforce specific 
semantics.


Both of these uses can cause a great deal of difficulty if we switched 
to definitional throw. In the first case, the code is often either 
broken into different functions (and thus not nested), or all dumped 
into a very large set of nested code that we wouldn't want to BIND/copy. 
Remember, the more levels we want to throw past, the more code that 
goes into implementing those levels. In the second case definitional 
throw would usually not work at all because the CATCH and the THROW 
would be contained in different functions, and the code that calls the 
function wrapping the THROW would not be nested inside the CATCH. 
So you would either need to rebind every bit of code that called 
the THROW, or the definitional THROW would need to be passed to the 
code that wants to call it like a continuation (similar concept). 
Either way would be really awkward.
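
The second scenario can be sketched as follows, assuming hypothetical functions BAIL-OUT and PROCESS (neither is from the discussion). The THROW sits in one function, its caller in another, and the CATCH somewhere else entirely, which works today only because the throw is dynamic:

```rebol
REBOL []

; Hypothetical premade escape function built on THROW/name.
bail-out: func [value] [throw/name value 'bail]

; The code calling BAIL-OUT is not lexically nested inside any CATCH.
process: func [items] [
    foreach item items [
        if item < 0 [bail-out item]
    ]
    'ok
]

print catch/name [process [1 2 -3 4]] 'bail   ; -3
print catch/name [process [1 2 3]] 'bail      ; ok
```

With a definitional throw, the body of PROCESS would have to be rebound to each CATCH, or the escape would have to be passed in as an argument.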


On the plus side of dynamic (whatever), at least it's easy to catch 
an unwind for debugging, testing or recovery purposes. For that matter, 
the main advantage of using THROW/name as the basic operation that 
developers can use to make custom dynamic escape functions is that 
we can build in a standard way to catch it and that will work for 
every custom escape that is built with it. The end to the arms race 
of break-through and catch.
Maxim
11-Nov-2010
[292]
wow... I was wondering why it took you so long to reply  ;-)
BrianH
11-Nov-2010
[293x3]
Yeah :) And I had another phone call.
#1521 is a critical issue btw. We use that facility in DO, for instance.
Yes, that 27% is a measured number, not made up. Not measured recently, 
but there haven't been any changes in R3 since then that would affect 
it.
Maxim
11-Nov-2010
[296]
thanks, all makes sense.
BrianH
11-Nov-2010
[297x2]
That is why I am in favor of definitional return, extremely skeptical 
of definitional break, and definitely opposed to definitional throw.
Wouldn't it be great if it was just a matter of opinion?
Maxim
11-Nov-2010
[299]
one thing I don't clearly understand in the above....


* Code that is not nested in the main loop block would not be able 
to break from that loop
BrianH
11-Nov-2010
[300x2]
Definitional (whatever) depends on a BIND to do its work, deep, usually 
BIND/copy. And that only works on words that are physically in the 
blocks that you bind, or in blocks that are nested in those blocks, 
etc. Another block that is outside the block you are binding and 
referred to by name won't be bound. That is the limit of the definitional 
approach.
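
A small sketch of that limit, using made-up names (CTX, INNER, OUTER). BIND reaches the words physically inside the block it is given, but not the contents of a block that is only referred to by word:

```rebol
REBOL []

x: 1
ctx: make object! [x: 10]

inner: [x]            ; separate block, referred to by word below
outer: [x do inner]   ; INNER is not physically nested in OUTER

bind outer ctx        ; rebinds the X in OUTER; INNER's contents are untouched

probe reduce outer    ; [10 1]
```

The X written directly in OUTER now refers to the object's X, while the X inside INNER still refers to the original one.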
Note that the definitional BIND that functions do when created is 
*not* a BIND/copy, it modifies. Same thing with closures when they 
are created, though they also do something like a BIND/copy every 
time they are called.
Maxim
11-Nov-2010
[302]
oh yes, definitional * is complicated with references... hadn't realized 
that.
BrianH
11-Nov-2010
[303x3]
Ladislav, your definitional throw in the "Definitional CATCH/THROW 
mezzanine pair" section of your page isn't recursion-safe, because 
MAKE function! does a modifying BIND rather than a non-modifying 
BIND/copy. Otherwise, nice work :)
It's not task-safe either, but recursion-safety is more of an issue 
for now.
That's the one under "The state of R2 (dynamic return with optional 
transparency)".
Maxim
11-Nov-2010
[306]
btw, for throw/catch, I agree 100%, even now that I fully understand 
the topic. 
   

If we lost dynamic throws, trying to make it work *as* a dynamic 
system is not pragmatic and AFAIK prone to many strange problems, 
especially if we try to create our own code patterns and would need 
to decipher cryptic mezzanine code which does some magic.


the way I see it, definitional throw/catch really doesn't scale well 
and doesn't work especially well in collaborative programming if 
there are multiple trap points with different catch setups.


I can see ways this can be a problem, especially when code IS NOT 
bound on purpose like in liquid which uses a class-based model, *specifically* 
because it allows me to scale a system by at least three orders of 
magnitude. 


liquid builds nodes on the fly and generally re-organizes processing 
on the fly.  one system might be building the setup, while another, 
later will execute it.  with definitional throw, this is impossible 
to make work.
BrianH
11-Nov-2010
[307x2]
Definitional * has one advantage over dynamic: You can see it in 
the source. When the program runs the scope is actually dynamic, 
but you have to use your imagination or a debugger to see it.
Not only one advantage, but that is the most significant advantage 
for most programmers.
Ladislav
12-Nov-2010
[309]
Brian wrote: "BREAK also applies to PARSE, which relies on dynamic 
scope (and yes, I can point you to mathematical proofs of this, though 
there's no point)" - I *must* correct this! Parse break is:

- neither dynamic
- nor definitional

it is a third kind:

parse break is lexical

Here is why:


1) It is stated in the documentation, that "parse break is a keyword", 
i.e. it is lexically defined to be a keyword of the dialect

2) it is stated in the documentation, that it "breaks out from the 
nearest loop", which is true, but in the lexical sense again
Pekr
12-Nov-2010
[310]
So very novice question - parse [break] is not the same as parse 
[(break)] internally? :-)
BrianH
12-Nov-2010
[311x3]
Sorry, I meant PARSE [(break)]. BREAK in parens is completely different 
than BREAK in the rules :)
So the correction was not to the statement that I made, it was to 
which BREAK I was referring.
Nonetheless, for PARSE's BREAK operation, "breaks out from the nearest 
loop" means in dynamic scope, not lexical.
Ladislav
12-Nov-2010
[314x3]
Any case you find?
(you cannot)
parse is keyword-based
BrianH
12-Nov-2010
[317]
Yes, but lexical scope has nothing to do with lexical keywords.
Maxim
12-Nov-2010
[318]
AFAICT it's not lexical since it will properly return to any rule 
which uses a referenced sub rule via a word as well as a sub-block
Ladislav
12-Nov-2010
[319x3]
Not to mention, that (break) is not a parse construct, it is actually 
foreign to parse
No example whatsoever are you able to find
Any non-lexical behaviour?
BrianH
12-Nov-2010
[322x2]
Except (break) and (break/return) were explicitly added to what PARSE 
supports. It was one of the better Parse Proposals, and it was accepted 
and implemented.
Lexical scope means nested blocks. The blocks don't have to be nested, 
as Maxim said.
Ladislav
12-Nov-2010
[324x3]
The argument, that for parse (break) has to be dynamic does not hold 
any water. Why?
As already mentioned, it is foreign to parse
Blocks don't have to be nested

 - does it make any sense to you? It surely does not make any sense 
 to me.
BrianH
12-Nov-2010
[327x2]
Maybe you missed it, but there are better arguments against dynamic 
break. PARSE [(break)] is minor in comparison to the other problems.
Blocks don't have to be nested
  a: [] b: [a]
Not nested, dynamic scope.
Ladislav
12-Nov-2010
[329x2]
I do understand that "blocks don't have to be nested", but that does 
not relate to the fact, that break in parse behaves lexically
(and make no mistake, I mean the parse keyword, not the foreign (break) 
construct)
BrianH
12-Nov-2010
[331]
The BREAK keyword does not break out of the nearest loop lexically, 
it breaks out of the nearest in the (PARSE equivalent of the) call 
chain. It is dynamic in scope, which can easily be demonstrated with 
ANY, SOME or WHILE with a named rule with a BREAK in it, instead 
of an inline block.
Ladislav
12-Nov-2010
[332]
{very novice question - parse [break] is not the same as parse [(break)] 
internally?} - correct, they are two completely unrelated constructs, 
the latter being "foreign" to parse, related to the do dialect, in 
fact
BrianH
12-Nov-2010
[333]
If you call a rule through a name, you are using dynamic scope. If 
the rule is inline then it is lexical scope.
Ladislav
12-Nov-2010
[334x2]
Are you suggesting, that you cannot name things in lexically scoped 
constructs?
I simply don't understand, how it relates to the subject
BrianH
12-Nov-2010
[336]
You can not refer to structures by name in lexically scoped constructs, 
when those names are resolved at runtime. Well, you can, but then 
that becomes a dynamically scoped flow construct.
For instance:
a: [break]
b: [while a]


The word 'a is in the lexical scope of b, but the contents of a are 
in its dynamic scope only if b is used as a parse rule at runtime 
and a is still assigned that value. So even though the break is a 
keyword, the scope to which it breaks is the while, which is in b.
Ladislav
12-Nov-2010
[337]
... I know that Ladislav's DO-ALL is a loop, and so not wanting BREAK 
to apply to it is more of an opinion than something inherent in its 
nature.

 - I was afraid, that the DO-ALL was not a fortunate choice! My original 
 problem with the property illustrated by DO-ALL occurred when I implemented 
 my PIF (now called CASE for the newcomers), which was not meant to 
 catch any breaks, as is immediately obvious.
BrianH
12-Nov-2010
[338x2]
Yup. Hence the "we need this" comment (paraphrased).
It was just a caveat, not a criticism.