
World: r3wp

[!REBOL3]

BrianH
3-Mar-2010
[1182]
CONTEXT does chained assignment too, btw.
Paul
3-Mar-2010
[1183]
Yes, so does almost everything in REBOL since day 1.  Surely you're 
not thinking I just stumbled upon that discovery.
BrianH
3-Mar-2010
[1184]
Do you need to access the elements as if they are in an object, or 
elements that are actually in an object?
Paul
3-Mar-2010
[1185]
I'm just seeing room for improvement here to make REBOL even more 
safe.
BrianH
3-Mar-2010
[1186]
But there are no safety issues with chained assignment.
Paul
3-Mar-2010
[1187x5]
I could simply just use a block if I want to access elements but 
that isn't ideal for conversion to an object when I decide I want 
that flexibility.
Not so my friend.
That is where you're wrong.
Ideally you would use a construct for assignments of values passed 
via cgi.
While those items are string values it could still potentially be 
exploited in my opinion.
BrianH
3-Mar-2010
[1192]
It sounds like you need a PARSE wrapper.
Paul
3-Mar-2010
[1193x2]
Or we just make sure that such assignment is not chained ;-)
bbl
BrianH
3-Mar-2010
[1195x4]
If the CGI data is parsed properly it should never generate anything 
other than pairs. If it ever generates chained assignments that is 
an error in your parse rules, not CONSTRUCT.
By "pairs" I mean name-value pairs, not pair! values.
On the other hand, you can APPEND to objects and maps in R3, so you 
don't need the block at all.
And appending to objects doesn't chain assignments.
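
To illustrate the point about name-value pairs in a language-neutral way: a parser that only ever emits (name, value) pairs of strings cannot produce a chained assignment, whatever the client sends. Below is a minimal Python sketch using the standard urllib.parse.parse_qsl. The names cgi_pairs and to_record and the sample query are made up for illustration, and this is not how REBOL's own CGI decoding is implemented.

```python
from urllib.parse import parse_qsl


def cgi_pairs(query):
    """Decode a query string into a list of (name, value) string pairs."""
    # keep_blank_values=True so "name=" still yields a pair instead of vanishing.
    return parse_qsl(query, keep_blank_values=True)


def to_record(pairs):
    """Build a plain dict from the pairs; later duplicates overwrite earlier ones."""
    record = {}
    for name, value in pairs:
        # The value is stored as an inert string and is never evaluated.
        record[name] = value
    return record


if __name__ == "__main__":
    # A value that merely *looks* like a chained assignment stays a string.
    print(to_record(cgi_pairs("user=paul&note=a%3A+b%3A+1")))
```

Because the record is built one pair at a time, a value such as "a: b: 1" is just data attached to the name "note".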
Paul
3-Mar-2010
[1199x3]
I'll create my own function.
problem solved.
is this really a valid email?

 >> email? load "carl@.com"
== true
BrianH
3-Mar-2010
[1202]
It doesn't check whether the email address is valid, just that the 
syntax is valid.
Paul
3-Mar-2010
[1203]
I suppose that syntax could be valid if my ISP was a root 
server.
Gregg
4-Mar-2010
[1204]
Paul, creating your own func is the way to go for that. And emails 
can be very complex. REBOL doesn't try to validate against the true 
grammar, it just looks for @.
Steeve
4-Mar-2010
[1205]
actually the smallest valid email in Rebol is 1 char + @
>> type? 1@
== email!
Gregg
4-Mar-2010
[1206]
:-)
Sunanda
4-Mar-2010
[1207]
cc#1509 -- Thanks Brian for consolidating a whole handful of bug 
reports [and/or misunderstandings].
That should help focus the core RT crew into clarifying / fixing.
Cyphre
4-Mar-2010
[1208]
Paul, I usually test email entry using code similar to this:
if any [
    not email? try [email: first load/next cgi/email]
    all [
        not read join dns:// email/host
        not read rejoin [dns:// "smtp." email/host]
        not read rejoin [dns:// "pop." email/host]
        not read rejoin [dns:// "mail." email/host]
        not read rejoin [dns:// "www." email/host]
        ...
    ]
][
    print "email is not valid or your email server is unavailable"
]


Worked well in most cases for me but I bet others might have more 
sophisticated solutions.
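
The same two-step idea (check the syntax first, then see whether any plausible host name resolves) can be sketched outside REBOL. The following is a rough Python illustration, not a translation of the code above. The names looks_like_email, host_resolves, check_email and the PREFIXES list are hypothetical, and the resolver is passed in as a parameter so the logic can be exercised without network access.

```python
import socket

# Hypothetical prefix list mirroring the host names tried in the REBOL code.
PREFIXES = ("", "smtp.", "pop.", "mail.", "www.")


def looks_like_email(text):
    """Very loose syntax check: exactly one '@' with text on both sides."""
    user, sep, host = text.partition("@")
    return bool(user) and bool(sep) and bool(host) and "@" not in host


def host_resolves(name):
    """Return True if a DNS lookup for `name` succeeds."""
    try:
        socket.gethostbyname(name)
        return True
    except (OSError, UnicodeError):
        # OSError covers lookup failures; UnicodeError covers malformed names.
        return False


def check_email(text, resolves=host_resolves):
    """Accept `text` if it looks like an email and any candidate host resolves."""
    if not looks_like_email(text):
        return False
    host = text.partition("@")[2]
    return any(resolves(prefix + host) for prefix in PREFIXES)
```

Unlike REBOL's email! syntax, this sketch rejects 1@ because it requires something after the @. As with the original, a successful lookup only shows that a host exists, not that the mailbox does.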
BrianH
4-Mar-2010
[1209]
Thanks, Sunanda, never would have figured it out without your help.
Gabriele
4-Mar-2010
[1210]
Sunanda, you're still seeing this the wrong way. ATTEMPT never returns 
an error. the "error" you are talking about happens outside of ATTEMPT, 
so it can't guard against it.
Andreas
4-Mar-2010
[1211x10]
Yes, Gabriele, that's how things are currently implemented.
The question is, should the error occur inside the attempt instead?
I think that RETURN and EXIT, and BREAK and CONTINUE, should only be 
available in their respective contexts (functions and loops). CATCH/THROW 
will need special treatment. QUIT, Q, HALT can be left as is.
I am well aware that this would be a significant change in semantics.
Brian claims that "most of the code we call in REBOL is non-local, 
meaning not directly nested in blocks". I on the other hand think 
that for the specific functions under consideration (RETURN, EXIT, 
BREAK, CONTINUE) this is mostly not the case.
Wherever "non-local code" is used in those cases, adjustments in 
binding of the non-local code will _already_ be made (i.e. for non-local 
code to be able to use the function arguments or the loop variable 
as if it was local code). This binding manipulation will have to 
be extended to also cater for the binding of control transfer functions.
But this may certainly make it too big a semantic change even for 
R3.
---
This said, and leaving the discussion of improved semantics aside, 
I think that the implementation of improved error causation as described 
by Brian in bug#1506 would be very worthwhile and should be pursued.
Carl or anyone else with access to the R3 source is certainly the 
final authority on implementation overhead, but I'd like to add to 
the speculations: I think improved error causation should be possible 
with only adding minimal overhead, mostly to the non-local exit functions 
themselves.


If we have a task-local error handler stack (of setjmp/longjmp buffers, 
for example) it's mostly a matter of adding a few bits to those handlers 
describing whether they are valid targets for each class of non-local 
exit. Those bits must be correctly propagated/manipulated in functions 
which add their own handlers (the bit manipulation adds only negligible 
overhead). The non-local exit functions themselves check those bits, 
raising different errors if the use of the non-local exit function 
was invalid (as described in detail by Brian in the bug report). 
Those checks add a minimal overhead to each non-local exit function 
(RETURN, EXIT, BREAK, CONTINUE, THROW).
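
To make the handler-stack idea concrete, here is a small Python model. It is only a sketch of the bookkeeping and has nothing to do with the actual R3 source. The names Exit, HandlerStack and InvalidExit are hypothetical, and the barrier parameter is an assumption added to show how a handler could hide the bits below it.

```python
from enum import Flag, auto


class Exit(Flag):
    """Classes of non-local exit a handler can be a valid target for."""
    NONE = 0
    RETURN = auto()   # RETURN and EXIT
    BREAK = auto()    # BREAK and CONTINUE
    THROW = auto()


class InvalidExit(Exception):
    """Raised when an exit function is used with no valid target."""


class HandlerStack:
    """Model of a task-local handler stack with per-handler validity bits."""

    def __init__(self):
        self._stack = []

    def push(self, accepts, barrier=False):
        # A new handler inherits the bits of the handler below it, unless it
        # is a barrier (an assumption of this sketch), and adds its own.
        inherited = Exit.NONE if barrier or not self._stack else self._stack[-1]
        self._stack.append(inherited | accepts)

    def pop(self):
        self._stack.pop()

    def is_valid(self, kind):
        """True if some handler currently in scope accepts `kind`."""
        return bool(self._stack) and bool(self._stack[-1] & kind)

    def check(self, kind):
        """What an exit function would call before unwinding."""
        if not self.is_valid(kind):
            raise InvalidExit(f"{kind.name} used outside a valid context")
```

A function call would push Exit.RETURN, a loop would push Exit.BREAK, and an ATTEMPT-like handler would push Exit.NONE, so it stays transparent to the exits already in scope. The per-call cost in this model is one OR on push and one AND on check.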
Paul
4-Mar-2010
[1221x4]
Thanks Cyphre!
Andreas, I still think that issue should be resolved by introducing 
another function since the cases where we would need such assurance 
are minimal in comparison to the overall application of ATTEMPT. 
 Why add more overhead to 98 percent of the cases when you can adapt 
to the 2 percent with another function, or by just implementing your 
own check and parsing?
I think dismissal is a good call on that one.
I'm referring to cc# 1506 btw.
Andreas
4-Mar-2010
[1225x2]
I don't think there would be any noticeable overhead.
I take it you are thinking about introducing an additional function 
similar to ATTEMPT that does what Sunanda desires in bug#1596?
BrianH
4-Mar-2010
[1227x5]
Andreas, I don't have a problem with that solution in principle. 
It's just that it wouldn't work, and wouldn't be task-safe. The handlers 
for those functions would be task-local, the code blocks not. Plus 
it would break code that uses code block references rather than nested 
blocks, code that uses those functions through function values, and 
any function with the [throw] attribute (which we will be getting 
back in R3 with different syntax), and all of those exist in R3 mezzanine 
code. Plus there's all the extra BIND/copy overhead added to every 
call to loop functions, startup code, etc., and don't think that 
you won't notice that because that can double the memory usage and 
execution time, at least.


The solution I proposed in the ticket comments is to have DO, CATCH 
and the loops set a task-local flag in the interpreter state when 
the relevant functions become valid, and unset it when they become 
invalid, then have the functions check the flag at runtime before 
they do their work (which they could because they're all native). 
This would be task-safe, only add a byte of task-local memory overhead, 
plus the execution overhead of setting and getting bits in that byte 
in a task-local way. It's the execution overhead that we don't know 
about, whether it would be too much. It would certainly be less than 
your proposal though.
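
As a rough picture of the flag approach, the following Python model keeps one small integer of per-thread state, sets bits on entry to a construct, restores them on exit, and has the "native" check its bit before doing any work. This is a sketch only and not R3 code. The constants, the allowing helper and native_break are hypothetical names, and threading.local merely stands in for task-local interpreter state.

```python
import threading
from contextlib import contextmanager

# One bit per family of exit function; the whole state fits in a byte.
RETURN_OK = 0x01   # RETURN, EXIT
BREAK_OK = 0x02    # BREAK, CONTINUE
THROW_OK = 0x04    # THROW

_state = threading.local()   # stands in for per-task interpreter state


def current_flags():
    """Read this thread's flag byte (0 if nothing has been set yet)."""
    return getattr(_state, "flags", 0)


@contextmanager
def allowing(bits):
    """Set `bits` on entry, as DO, CATCH or a loop would, and restore on exit."""
    saved = current_flags()
    _state.flags = saved | bits
    try:
        yield
    finally:
        _state.flags = saved


def native_break():
    """Model of a native that checks the flag before doing its work."""
    if not (current_flags() & BREAK_OK):
        raise RuntimeError("BREAK used outside of a loop")
    return "break"
```

A loop body would run inside allowing(BREAK_OK). Since the byte lives in thread-local storage, shared code blocks need no rebinding and other threads never see the flag.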
Carl is the authority on subtle implementation overhead, but for 
gross implementation overhead anyone can tell by just using the profiling 
tools and extrapolating. And what you are proposing is definitely 
in the gross overhead category.
However, CATCH/name and THROW/name would need the additional memory 
overhead of a single block of words per task in the dynamic solution 
to store the currently handled names.
It might be hard to believe, but R3 has gotten so efficient that 
BIND/copy overhead is really noticeable now in comparison. In R2 
there were mezzanine loop functions like FORALL and FORSKIP that 
people often avoided using in favor of natives, even reorganizing 
their algorithms to allow using different loop functions like FOREACH 
or WHILE. Now that all loop functions in R3 are native speed, the 
FORALL and FORSKIP functions are preferred over FOREACH or FOR sometimes 
because FOREACH and FOR have BIND/copy overhead, and FORALL and FORSKIP 
don't. The functions without the BIND/copy overhead are much faster, 
particularly for small datasets and large amounts of code.
It's funny: While regular R3 code looks a lot like regular R2 code, 
optimized code looks a lot different because the balance of what 
is fast and what isn't has shifted. At least regular R3 code looks 
a lot more like optimized R3 code than regular R2 code looks like 
optimized R2 code. This is because we have been focusing on making 
the common, naive code patterns more optimized in R3, so that people 
don't have to do as much hand-optimization. The goal is to make it 
so that only writers of mezzanine and library code need to hand-optimize, 
and regular app developers can just use the optimized code without 
worrying about such things.