World: r3wp

[!REBOL3]

BrianH
4-May-2011
[8362x2]
There isn't much of a security page right now, though it would be 
a good idea to make one if only to document the stuff that doesn't 
currently work (like SECURE in the last 4 versions). I don't know 
if anyone else has made a concerted effort to attack REBOL and then 
fix the security problems found.
I would love it if we as a community were to really think through 
the (UN)PROTECT model, because the current model is incomplete (even 
for the stuff that works) and the proposed model is starting to look 
a bit awkward to use. Keep in mind that PROTECT may also be used 
to make series sharable among tasks, but that this isn't implemented 
and there is likely a better way to do this. I would love it if there 
was a good security model that can integrate well with REBOL semantics.
Kaj
4-May-2011
[8364]
Capabilities
BrianH
4-May-2011
[8365]
Won't work within a process, only on a process boundary.
Kaj
4-May-2011
[8366]
Depends on if you make it work
BrianH
4-May-2011
[8367]
It's inherent in the semantics of REBOL, a side effect of the code-vs-data 
thing.
Kaj
4-May-2011
[8368]
Do you know the Genode architecture?
BrianH
4-May-2011
[8369]
That might work for SECURE but not for (UN)PROTECT.
Kaj
4-May-2011
[8370]
Why not?
BrianH
4-May-2011
[8371]
(I am trying to write a long *starting* message here and have to 
put it in the clipboard to answer these questions, sorry.)
Kaj
4-May-2011
[8372]
That's OK, I'm interested in your opinion. I haven't formulated an 
answer for myself yet
BrianH
4-May-2011
[8373]
Some factors to consider about the REBOL semantic limitations:


- There is no such thing as trusted-vs-untrusted code in a REBOL 
process, nor can there be, really. Levels of trust need to be on 
a process boundary. You can't (even hypothetically) do LOAD/secure 
level or DO/secure level, but you can do LAUNCH/secure level.


- If you want to make something readable or writeable to only certain 
code within a process, binding visibility tricks are the only way 
to do it. The only way to ensure that your code has access to something 
and other code doesn't is to make sure that other code can't even 
see yours. This is why BODY-OF function returns an unbound copy of 
the body in R3, not the original.


- We need a way to make protection stick so you can't unprotect things 
that are protected, or protect things that need to stay unprotected, 
but still allow changes to the protection status of other stuff. 
The currently proposed model does this through a chain of PROTECT 
and UNPROTECT calls, then a PROTECT/lock, not allowing unlocking 
if there is a SECURE 'protect. However, the proposed model seems 
too difficult to use, and as the pre-110 module system demonstrated, 
people won't use something that is too complex to use, or will use 
it badly. We need a better way of specifying this stuff.
Kaj
4-May-2011
[8374x3]
OK, that's the current REBOL model, but you asked about alternative 
models. Capabilities are not about trust levels, but about capability 
tokens. They're meant to take trust out of the equation
Trying to hammer every hole shut with SECURE and PROTECT is the classic 
method of sticking all your fingers in the dike. When you run out 
of fingers for all the holes, the water comes gushing in. Capabilities 
are about making it impossible to get through the next dike. It's 
a different way of compartmentalising
An E language fan once visited Carl to explain to him that true capabilities 
can be implemented in REBOL very well. Carl apparently rejected it 
based on complexity, but if the problem with the current new R3 method 
is rising complexity, maybe this decision is worth reviewing
BrianH
4-May-2011
[8377x4]
Now, for your questions, Kaj.


Mezzanines execute arbitrary code with DO. You can't even know if 
something is code or not until you pass it to a dialect interpreter 
like DO or PARSE - code is data. Blocks don't have bindings, only 
their any-word contents do, so the code blocks of functions are not 
bound to functions, only their contents are. The same goes for functions 
in modules or objects - they aren't bound to their objects or modules, 
only referenced by them.


(making this up on the fly) It could be possible to make the binding 
visibility of words be defined as a set of capability tokens as part 
of the object spec (in the SPEC-OF sense), and have the function 
spec dialect be extended to contain such tokens. This would have 
to be checked with every word access, and we would have to be careful 
to make the model in such a way to avoid unauthorized privilege escalation. 
Then changes in capabilities would happen on the function call stack, 
which is task-specific.


The problem with this is making sure code can't make functions with 
more capabilities than the code making them currently possesses. 
Though R3 doesn't really have a user model, it does have a task model 
and we could make the capability level task-specific. Code could 
constrain capabilities for code it calls, but we don't want privilege 
escalation at function creation time. It would be possible to have 
privilege escalation at function call time if the function called 
was created by something with the necessary capabilities.

Drawbacks:

- If we do this for binding visibility, this means a capabilities 
check would go into every word access. Word access would be SLOW.

- This doesn't add anything to the PROTECT/hide model, which handles 
binding visibility without the slowdown.


Capabilities would be like the SECURE model, but more flexible, so 
that's something to consider there. What SECURE protects is heavy 
enough that a capabilities check wouldn't add much to the overhead.
Remember, R3 currently has three separate security models: SECURE, 
(UN)PROTECT, and PROTECT/hide.
Of the 3, SECURE seems like the most likely to be enhanceable with 
capabilities. Functions could be enhanced by capabilities specs, 
where the function code could only create other functions with capabilities 
equal to or lesser than those currently available in the call 
stack. Once a function is created, it could run code with the capabilities 
that it was created with (with the exception of that function creation 
limitation earlier). There could be a function like DO that reduces 
capabilities and then does a block of code, and maybe MAKE module! 
could be made to use that function based on capabilities in the module 
spec.
Since MAKE object! isn't a hybrid function like MAKE module! (which 
calls sys/make-module*), we probably don't want to reduce capabilities 
on a per-object basis.
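
A minimal sketch of the rule described above, written in Python rather than REBOL and not reflecting anything R3 implements. It assumes capabilities are plain strings held in a per-task stack; `make_function`, `do_reduced`, `current_caps` and the capability names are all hypothetical. A function can only be created with capabilities the creator holds at that moment, it later runs with its creation-time capabilities, and a DO-like helper can only narrow what is available.

```python
# Hypothetical per-task stack of capability sets (illustration only).
_cap_stack = [frozenset({"file", "net", "protect"})]


def current_caps():
    """Capabilities available at the top of the call stack."""
    return _cap_stack[-1]


def make_function(body, caps):
    """Create a function holding `caps`; refuse escalation at creation time."""
    requested = frozenset(caps)
    missing = requested - current_caps()
    if missing:
        raise PermissionError(f"creator lacks: {sorted(missing)}")

    def wrapped():
        # At call time the function runs with the capabilities it was
        # created with, whatever the caller currently holds.
        _cap_stack.append(requested)
        try:
            return body()
        finally:
            _cap_stack.pop()

    return wrapped


def do_reduced(body, caps):
    """DO-like helper: run `body` with capabilities narrowed to `caps`."""
    _cap_stack.append(current_caps() & frozenset(caps))
    try:
        return body()
    finally:
        _cap_stack.pop()
```

In this sketch, code running under `do_reduced(..., {"file"})` cannot create a function holding `"net"`, but it can still call one that was created earlier by code that did hold it.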
Kaj
4-May-2011
[8381x3]
It seems to me that you are still talking in terms of plugging all 
the holes in the myriad of capability that would supposedly be around. 
This is not how true capabilities work. They implement POLA: there 
is no capability unless it is needed, and in that case, it needs 
to be handed down as a token by the assigner of the work. If the 
boss doesn't have the token, the employee will by definition not 
be able to do the work
REBOL is a virtual machine with strong typing (as long as extensions 
are protected well enough). You have complete control over the world 
the code executes in, so the potential is there to make the process/thread 
separation irrelevant for security
I don't see why capabilities would need to be checked on every word 
access. The critical point is the binding, and REBOL uses this well 
to optimise word access. Capabilities would need to be determined 
at binding time, so that binding will fail if the required capability 
token isn't available
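
A rough Python sketch of the binding-time check Kaj describes, under the assumption that a capability token is just an unforgeable object reference. `GuardedContext` and `BoundWord` are hypothetical names, and Python's underscore attributes are a convention, not real protection. The token is checked once, when the binding is made; access through the resulting bound word involves no further check.

```python
class BoundWord:
    """A word already bound to a slot; access involves no capability check."""

    def __init__(self, slots, name):
        self._slots = slots
        self._name = name

    def get(self):
        return self._slots[self._name]

    def set(self, value):
        self._slots[self._name] = value


class GuardedContext:
    """Hypothetical context whose fields can only be bound with a token."""

    def __init__(self, fields, token):
        self._slots = dict(fields)
        self._token = token

    def bind(self, name, token=None):
        # The capability check happens here, once, at binding time.
        if token is not self._token:
            raise PermissionError(f"cannot bind {name!r}: missing token")
        if name not in self._slots:
            raise KeyError(name)
        return BoundWord(self._slots, name)
```

Code that was never handed the token cannot obtain a `BoundWord` at all, which is the sense in which the binding itself fails.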
BrianH
4-May-2011
[8384]
Three security models:
- SECURE protects access to external resources.
- (UN)PROTECT protects changeability of internal structures.
- PROTECT/hide manages binding visibility.

We don't just need to protect files, we need to protect things like 
passwords in memory, access to capability tokens, etc.
Kaj
4-May-2011
[8385x2]
Which can all be done in a capabilities model
Have you studied the E language, and Genode for that matter?
BrianH
4-May-2011
[8387]
If you use capability tokens to protect binding visibility, then 
every word access would need to check against a capability token.
Kaj
4-May-2011
[8388]
I still don't see that. Binding doesn't change on every access; that's 
the point of this optimisation
BrianH
4-May-2011
[8389]
Binding visibility, not binding change.
Kaj
4-May-2011
[8390]
First you have visibility, then binding, then access. Why go through 
all those stages on each access?
BrianH
4-May-2011
[8391]
OK, let's work this through for only PROTECT/hide to see how the 
concept would affect things. PROTECT/hide works by making it so you 
can't make new bindings to a word - that way words that are already 
bound can be accessed without extra overhead. Adding capabilities 
to this means that you could create new bindings to the word if you 
had the token, but not if you didn't. However, with PROTECT/hide 
(currently) the already bound words don't get unbound when they are 
hidden, just new bindings to that word, and if you have access to 
such a prebound word value then you can make new words with that 
binding using TO, which effectively makes prebound words into their 
own capability tokens. So PROTECT/hide *as it is now* could be the 
basis of a capability system.
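
The behaviour described above can be modelled loosely in Python. This is an analogy under stated assumptions, not R3's implementation: `HidingContext`, `Word`, `hide` and `copy_binding` are hypothetical names standing in for an object, a bound word value, PROTECT/hide, and making a new word from a prebound one with TO. Hiding blocks new bindings by name, existing bound words keep working, and holding one is enough to mint another with the same binding.

```python
class Word:
    """A word value carrying a static binding (context plus name)."""

    def __init__(self, context, name):
        self._context = context
        self._name = name

    def get(self):
        return self._context._values[self._name]


class HidingContext:
    """Hypothetical object whose fields can be hidden after creation."""

    def __init__(self, **fields):
        self._values = dict(fields)
        self._hidden = set()

    def bind(self, name):
        """Make a new binding by name; refused once the field is hidden."""
        if name in self._hidden or name not in self._values:
            raise LookupError(f"{name!r} is not visible in this context")
        return Word(self, name)

    def hide(self, name):
        """Block new bindings to `name`; existing bound words are untouched."""
        self._hidden.add(name)


def copy_binding(prebound):
    """Mint a new word with the same binding as one you already hold.

    The prebound word acts as the capability token here.
    """
    return Word(prebound._context, prebound._name)
```

Code that obtained a `Word` before `hide` was called can keep using it and can pass copies along; code that arrives afterwards has no way to bind by name.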
Kaj
4-May-2011
[8392]
Cool :-)
BrianH
4-May-2011
[8393]
The problem that a capability system has of making sure capability 
tokens don't leak is pretty comparable to the problem with leaking 
bindings that we already have to take into account with the PROTECT/hide 
model, so switching to a capability system for that model gains us 
nothing that we don't have already. And we've already solved many 
leaking binding problems by doing things like having BODY-OF function 
return an unbound copy of its code block rather than the original. 
The PROTECT/hide model works pretty well for that, so it's just a 
matter of closing any remaining holes and making sure things are 
stable.
Kaj
4-May-2011
[8394]
The fundamental gain is that you switch to a POLA model from the 
current model where all code in a REBOL process has all capabilities 
unless you manage to stop some of them
BrianH
4-May-2011
[8395]
For PROTECT/hide we already have that. So let's move on to the other 
security models.
Kaj
4-May-2011
[8396]
Does all code get created PROTECT/hidden?
BrianH
4-May-2011
[8397]
No, but all code created after the word is hidden doesn't get access, 
and only code created before the hiding has access to a token (bound 
word) that will let it create new code with access. You get the same 
sharp separation between code with access and code without.
Kaj
4-May-2011
[8398]
A POLA model is where you start out with no access. If you have to 
PROTECT/HIDE afterwards, that's the reverse of POLA
BrianH
4-May-2011
[8399]
Basically, the code that creates the token is the only code that 
has access to the token, and it can pass that token along to other 
code if it is safe to do so. The only difference is that code isn't 
protected unless it needs to be.
Kaj
4-May-2011
[8400]
Yes, the reverse of POLA. Capabilities is about building a POLA system
BrianH
4-May-2011
[8401]
For the benefit of those trying to follow this discussion without 
having read the articles, could you at least once expand the POLA 
acronym?
Kaj
4-May-2011
[8402]
Principle Of Least Access
BrianH
4-May-2011
[8403]
Thanks.
Kaj
4-May-2011
[8404]
Again, did you study true capabilities, especially in the E language, 
but also in Genode and the ground-breaking KeyKOS and EROS systems? 
If you didn't, I can understand why we don't understand each other. 
By the way, POLA is not a capabilities term, but a generic security 
term
BrianH
4-May-2011
[8405]
I got the overview, but there are some limitations when talking about 
a language like REBOL.
Kaj
4-May-2011
[8406x2]
There are limitations in the current implementation, but not in the 
concept
Please study http://erights.org and http://genode.org. Without that, 
this discussion probably won't go anywhere
BrianH
4-May-2011
[8408x3]
Not so.
OK, the problem with that model *in this case* (PROTECT/hide) is 
that we are talking about object fields here, bindings of word values. 
REBOL objects bind their code blocks before executing them. If there 
is going to be any blocking of bindings, at least the object's own 
code needs to be bound first. This means that if you are going to 
make word bindings hidden, you need to do so after the object itself 
has been made, or at least after its code block has been bound. You 
can do this binding with PROTECT/hide, or with some setting in an 
object header, it doesn't matter. Since words are values and their 
bindings are static, being able to create a new word with the same 
binding means that you need access to a word with that binding, with 
*exactly* the same visibility issues as token access. The difference 
in this case between POLA and PROTECT/hide is whether object fields 
are hidden by default or not, not a matter of when they are hidden.
We can't directly use the E model because E is compiled, so there 
are things that happen at runtime in REBOL that happen in the compiler 
in E.
Kaj
4-May-2011
[8411]
It's a VM, so you still have control over them