r3wp [groups: 83 posts: 189283]

World: r3wp

[!REBOL3]

Andreas
4-Mar-2010
[1216x5]
Wherever "non-local code" is used in those cases, adjustments in 
binding of the non-local code will _already_ be made (i.e. for non-local 
code to be able to use the function arguments or the loop variable 
as if it was local code). This binding manipulation will have to 
be extended to also cater for the binding of control transfer functions.
But this may certainly make it too big a semantical change even for 
R3.
---
This said, and leaving the discussion of improved semantics aside, 
I think that the implementation of improved error causation as described 
by Brian in bug#1506 would be very worthwhile and should be pursued.
Carl or anyone else with access to the R3 source is certainly the 
final authority on implementation overhead, but I'd like to add to 
the speculations: I think improved error causation should be possible 
with only adding minimal overhead, mostly to the non-local exit functions 
themselves.


If we have a task-local error handler stack (of setjmp/longjmp buffers, 
for example) it's mostly a matter of adding a few bits to those handlers 
describing whether they are valid targets for each class of non-local 
exit. Those bits must be correctly propagated/manipulated in functions 
which add their own handlers (the bit manipulation adds only negligible 
overhead). The non-local exit functions themselves check those bits, 
raising different errors if the use of the non-local exit function 
was invalid (as described in detail by Brian in the bug report). 
Those checks add a minimal overhead to each non-local exit function 
(RETURN, EXIT, BREAK, CONTINUE, THROW).
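
A minimal sketch of the handler-stack idea described above, modeled in Python rather than in the R3 C source (which is not shown in this log). The names push_handler, pop_handler, check_exit and the bit constants are hypothetical, and a plain list stands in for the task-local stack of setjmp/longjmp buffers. Each handler inherits its parent's validity bits and adjusts them, and the exit function checks the bits before unwinding:

```python
# Hypothetical model: one validity bit per class of non-local exit.
RETURN, BREAK, THROW = 1, 2, 4
NAMES = {RETURN: "RETURN", BREAK: "BREAK", THROW: "THROW"}


class InvalidExit(Exception):
    """Raised when a non-local exit has no valid target."""


# Bottom entry: at top level no non-local exit has a valid target.
handlers = [0]


def push_handler(enable=0, disable=0):
    # A new handler inherits its parent's bits, then adjusts them.
    handlers.append((handlers[-1] | enable) & ~disable)


def pop_handler():
    handlers.pop()


def check_exit(kind):
    # Called by the exit function itself before it unwinds.
    if not handlers[-1] & kind:
        raise InvalidExit(NAMES[kind] + " used without a valid target")


push_handler(enable=BREAK)   # e.g. entering a loop body
check_exit(BREAK)            # passes: a loop handler is on the stack
pop_handler()
```

The only per-handler cost in this model is one OR and one AND on push, plus one AND in each exit function.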
Paul
4-Mar-2010
[1221x4]
Thanks Cyphre!
Andreas, I still think that issue should be resolved by introducing 
another function, since the cases where we would need such assurance 
are minimal in comparison to the overall application of ATTEMPT. 
Why add more overhead to 98 percent of the cases when you can adapt 
to the 2 percent with another function, or by just implementing your 
own check and parsing?
I think dismissal is a good call on that one.
I'm referring to cc# 1506 btw.
Andreas
4-Mar-2010
[1225x2]
I don't think there would be any noticeable overhead.
I take it you are thinking about introducing an additional function 
similar to ATTEMPT that does what Sunanda desires in bug#1596?
BrianH
4-Mar-2010
[1227x5]
Andreas, I don't have a problem with that solution in principle. 
It's just that it wouldn't work, and wouldn't be task-safe. The handlers 
for those functions would be task-local, the code blocks not. Plus 
it would break code that uses code block references rather than nested 
blocks, code that uses those functions through function values, and 
any function with the [throw] attribute (which we will be getting 
back in R3 with different syntax), and all of those exist in R3 mezzanine 
code. Plus there's all the extra BIND/copy overhead added to every 
call to loop functions, startup code, etc., and don't think that 
you won't notice that because that can double the memory usage and 
execution time, at least.


The solution I proposed in the ticket comments is to have DO, CATCH 
and the loops set a task-local flag in the interpreter state when 
the relevant functions become valid, and unset it when they become 
invalid, then have the functions check the flag at runtime before 
they do their work (which they could because they're all native). 
This would be task-safe, only add a byte of task-local memory overhead, 
plus the execution overhead of setting and getting bits in that byte 
in a task-local way. It's the execution overhead that we don't know 
about, whether it would be too much. It would certainly be less than 
your proposal though.
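
A rough sketch of the flag approach from the ticket comments, modeled in Python with a thread standing in for an R3 task. The bit names, the valid context manager and the require check are hypothetical, not R3 internals. A loop or CATCH sets its bit on entry and restores the previous byte on exit, and the native checks the bit before doing its work:

```python
import threading
from contextlib import contextmanager

BREAK_OK = 0x01   # hypothetical bit: BREAK/CONTINUE currently valid
THROW_OK = 0x02   # hypothetical bit: THROW currently valid

_state = threading.local()   # one flag byte per task (a thread, here)


class ExitError(Exception):
    """Raised when an exit function is used where it is not valid."""


def flags():
    return getattr(_state, "flags", 0)


@contextmanager
def valid(bit):
    # Set the bit on entry and restore the previous byte on exit,
    # including when the body is left by an exception.
    saved = flags()
    _state.flags = saved | bit
    try:
        yield
    finally:
        _state.flags = saved


def require(bit, name):
    # The check a native exit function would run before doing its work.
    if not flags() & bit:
        raise ExitError(name + " used where it is not valid")


with valid(BREAK_OK):            # stands in for entering a loop
    require(BREAK_OK, "BREAK")   # passes inside the loop
```

Because the byte lives in thread-local storage, another task entering a loop does not make BREAK valid in this one.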
Carl is the authority on subtle implementation overhead, but for 
gross implementation overhead anyone can tell by just using the profiling 
tools and extrapolating. And what you are proposing is definitely 
in the gross overhead category.
However, CATCH/name and THROW/name would need the additional memory 
overhead of a single block of words per task in the dynamic solution 
to store the currently handled names.
It might be hard to believe, but R3 has gotten so efficient that 
BIND/copy overhead is really noticeable now in comparison. In R2 
there were mezzanine loop functions like FORALL and FORSKIP that 
people often avoided using in favor of natives, even reorganizing 
their algorithms to allow using different loop functions like FOREACH 
or WHILE. Now that all loop functions in R3 are native speed, the 
FORALL and FORSKIP functions are preferred over FOREACH or FOR sometimes 
because FOREACH and FOR have BIND/copy overhead, and FORALL and FORSKIP 
don't. The functions without the BIND/copy overhead are much faster, 
particularly for small datasets and large amounts of code.
It's funny: While regular R3 code looks a lot like regular R2 code, 
optimized code looks a lot different because the balance of what 
is fast and what isn't has shifted. At least regular R3 code looks 
a lot more like optimized R3 code than regular R2 code looks like 
optimized R2 code. This is because we have been focusing on making 
the common, naive code patterns more optimized in R3, so that people 
don't have to do as much hand-optimization. The goal is to make it 
so that only writers of mezzanine and library code need to hand-optimize, 
and regular app developers can just use the optimized code without 
worrying about such things.
Andreas
4-Mar-2010
[1232x3]
Brian, please notice that I am talking about two things in the past 
few messages. I separated those discussions with "---".
The first is the proposal for a change of semantics, which I'm mainly 
interested in as a though experiment.
thought*
BrianH
4-Mar-2010
[1235]
Ah, cool. Glad to continue the thought experiment then :)
Andreas
4-Mar-2010
[1236x4]
Great :)
But actually I wanted to leave that experiment for now.
After the "---" I discussed the overhead of the solution you proposed 
on the bug tracker.
And if you re-read that, you will notice that it's precisely what 
you later describe.
BrianH
4-Mar-2010
[1240x2]
Yup.
Except the THROW/name block-of-words thing.
Andreas
4-Mar-2010
[1242x3]
Precisely.
The overhead of which would be more noticeable, but not too severe. 
Some simple heuristics should do fine.
As words are interned anyway, you only need an array of integers 
to store the names.
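
A small sketch of that point, in Python: if every word already maps to a unique integer, the currently handled CATCH/name names fit in a flat integer array. The intern table here is a stand-in for the interpreter's word interning, and intern_word, enter_catch, leave_catch and throw_is_valid are hypothetical names. Folding case in intern_word assumes words compare case-insensitively:

```python
from array import array

_symbols = {}   # spelling -> integer id (stand-in for word interning)


def intern_word(spelling):
    # Return the existing id, or assign the next free one.
    return _symbols.setdefault(spelling.lower(), len(_symbols))


# Ids of the names that currently have an active CATCH/name.
active_names = array("i")


def enter_catch(name):
    active_names.append(intern_word(name))


def leave_catch():
    active_names.pop()


def throw_is_valid(name):
    # Linear scan over integers; cheap when few names are active.
    return intern_word(name) in active_names


enter_catch("done")
ok_inside = throw_is_valid("done")
leave_catch()
```

With only a handful of distinct names active at a time, a linear scan over the array is likely enough, without a hash lookup.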
BrianH
4-Mar-2010
[1245]
Btw, non-local code blocks are a common optimization trick in mezzanine 
code, one which shows up a lot in Carl's code. It's probably the 
reason why REBOL supports the concept in the first place. And I've 
written code in REPLACE that uses the BREAK function as a function 
value, though I haven't checked whether other people use this trick 
:)
Andreas
4-Mar-2010
[1246x2]
Let's finish the performance discussion first :)
Typical code will have only very few distinct named catches.
BrianH
4-Mar-2010
[1248]
That's for sure - I haven't seen it yet in mezzanine code.
Andreas
4-Mar-2010
[1249]
So I think the vast majority of cases can be handled by very efficient 
code.
BrianH
4-Mar-2010
[1250]
It seems so. I've asked Carl to look at those tickets and chime in, 
so we'll see what he thinks.
Andreas
4-Mar-2010
[1251x2]
Great, I'd really like to see this improved, even if it's only a 
rare corner case.
That said, we can go back to the thought experiment, if you like 
:)
BrianH
4-Mar-2010
[1253]
If you want to see how weird really optimized R3 code can get, take 
a look at the source of LOAD and IMPORT - they are probably the most 
heavily optimized mezzanine functions. For the most part the rest 
of the mezzanine code is written for readability and maintainability, 
and the language optimized to make readable code fast. It's a good 
tradeoff :)
Andreas
4-Mar-2010
[1254]
I'm well aware of the value of foreign code blocks as such. The interesting 
question, I guess, is how often foreign code is used without re-binding 
it.
BrianH
4-Mar-2010
[1255]
Most of the time, actually, otherwise the BIND/copy overhead would 
make it a poor optimization.
Andreas
4-Mar-2010
[1256x4]
Optimization is only one use case, though.
Do you have a succinct example of such a use for optimization purposes?
The nice BREAK trick used in your REPLACE would mostly be unaffected 
by this change, for example.
You'd just use the function value of the (not globally bound) function 
implementing break.
BrianH
4-Mar-2010
[1260x4]
IMPORT uses code blocks as a way of reusing duplicate code, though 
it might not be affected either. And REPLACE would be affected because 
'break wouldn't be bound at the point it is used: Being in a function 
isn't enough, it's outside of the loop. BREAK is used to break out 
of loops, not functions.
That means that the BIND/copy overhead for BREAK and CONTINUE would 
happen at every call to a loop function, not just FOR, FOREACH and 
REPEAT. And 'break and 'continue would become keywords rather than 
function names, unable to be used for loop-local variables.
LOOP, WHILE, FORALL and FORSKIP don't currently have BIND/copy overhead. 
Which is why they are used a lot in R3 :)
Sorry, I don't mean to go on about that.
Andreas
4-Mar-2010
[1264x2]
Huh?
I certainly enjoyed the discussion, then :)