r3wp [groups: 83 posts: 189283]

World: r3wp

[!REBOL3 Extensions] REBOL 3 Extensions discussions

Maxim
21-Jul-2010
[1228x2]
the main disadvantage of these is that they block, but usually that's 
because the data has to be locked anyway while some code relinquishes 
control
Brian, my callbacks are thread safe in regards to tasks and threads 
(I read extensively on the subject when I did my hack).


but I talked with Carl, and in fact they are not safe within a single 
process to begin with
BrianH
21-Jul-2010
[1230]
Be sure to talk it over with Carl tomorrow. We will be coordinating 
on module stuff.
Maxim
21-Jul-2010
[1231x2]
the way I see it is that for callbacks to be safe, the GC has to 
be momentarily locked.  otherwise memory swapping might occur within 
the core which isn't synchronised in the command (you get some pointers 
from the core) and those pointers might not exist anymore when the 
callback returns.
so you end up changing memory which has been recycled and asking 
for trouble.
BrianH
21-Jul-2010
[1233]
Plus there are changes that could be happening in other tasks in 
R3 as well.
Maxim
21-Jul-2010
[1234x2]
to make it work, Carl would have to make a purpose-built C function 
which makes sure that memory used by a command is safely preserved 
for the length of the command, or something like that.
yes, shared memory with threads (if this is how R3 currently does 
it) always requires some form of locking.


again, the GC would need to cooperate in this process... maybe having 
explicit locking or semaphores, just like the Amiga did it.
BrianH
21-Jul-2010
[1236x2]
We have a task! type now, and have had it for a while. It doesn't 
work well but when last I checked it does work. Of course none of 
the system supports its use, and error handling in a task will crash 
R3, but that all is fixable.
I am hoping for a certain level of non-shared memory and intertask 
communication, but we'll have to see how much we can get away with.
Maxim
21-Jul-2010
[1238x3]
yes this has to be there for safe multithreading.
if the core were based on copy on set then it would be pretty safe, 
but alas, we have mutable series.
I'd build a device model for the tasks, this way it would be async. 
 just don't include stupid OS-based limitations like in Python, where 
the thread which creates a thread has no way of forcefully closing 
it.  so if a thread hangs, your whole app can hang with it.  :-(
BrianH
21-Jul-2010
[1241x3]
Tssk-local on set?
Task-local
Kind of a linear thing. If a task takes ownership, it becomes copy-on-write 
for other tasks.
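
A minimal sketch of that ownership idea, written in Python because R3 tasks cannot run it yet. The `SharedSeries` class and its method names are hypothetical and not part of any REBOL API; it only illustrates "the owner mutates in place, everyone else gets a copy on write".

```python
import threading

class SharedSeries:
    """Hypothetical series: shared for reading, copy-on-write for non-owners."""

    def __init__(self, items):
        self._items = list(items)
        self._owner = None
        self._lock = threading.Lock()

    def take_ownership(self, task_id):
        """Claim the series if nobody owns it; report whether task_id owns it."""
        with self._lock:
            if self._owner is None:
                self._owner = task_id
            return self._owner == task_id

    def append(self, task_id, value):
        """Owner mutates in place; any other task gets a private copy back."""
        with self._lock:
            if self._owner == task_id:
                self._items.append(value)
                return self
            private = SharedSeries(self._items)  # copy made under the lock
            private._owner = task_id
        private._items.append(value)
        return private

    def snapshot(self):
        """Return an immutable view of the current contents."""
        with self._lock:
            return tuple(self._items)

series = SharedSeries([1, 2, 3])
series.take_ownership("task-a")
same = series.append("task-a", 4)    # owner: in-place write
other = series.append("task-b", 5)   # non-owner: write lands on a copy
```

Here the copy and the write happen as one step from the caller's point of view, which sidesteps the index-update race only because a single lock covers the whole series; a lock-free core would need something more careful.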
Maxim
21-Jul-2010
[1244x2]
internally that can work, but the issue arises when two threads share 
a series. if they write at the same time, even with a copy, things 
like indexes might not update atomically to the copy operation.
it's the same issue we have with commands and callbacks... there is 
no data "ownership" AFAICT.
BrianH
21-Jul-2010
[1246]
That is the kind of thing that we should avoid. As it is, system/contexts/user 
is supposed to be task-local. Getting the rest of the functions so 
they aren't self-modifying shouldn't be difficult, especially since 
I have been keeping that in mind when I work over the mezzanines.
Maxim
21-Jul-2010
[1247x2]
the way I see it, when a thread is launched, it should copy the whole 
environment and become completely independent.  data exchange between 
tasks is done explicitly, and no GC allocated data should be able 
to cross thread boundaries.  obviously using external libs you may 
have some fun, but then it's OS allocated and not managed by the core, 
so it's your call.
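
A rough Python sketch of that "copy the environment, exchange data explicitly" model. `launch_task` and the dictionary standing in for the environment are assumptions made for illustration; nothing here reflects how R3 actually launches a task.

```python
import copy
import queue
import threading

def launch_task(env, body):
    """Run body on a private deep copy of env; results come back via a queue."""
    private_env = copy.deepcopy(env)   # the task never sees the original
    outbox = queue.Queue()             # the only channel back to the caller
    thread = threading.Thread(target=lambda: outbox.put(body(private_env)))
    thread.start()
    return thread, outbox

def body(local_env):
    # Mutations stay inside the task's own copy.
    local_env["counter"] += 1
    local_env["words"].append("c")
    return local_env["counter"]

env = {"counter": 0, "words": ["a", "b"]}
thread, outbox = launch_task(env, body)
thread.join()
result = outbox.get()
```

The parent's `env` is untouched after the task finishes, and the only thing that crosses the boundary is the value explicitly put on the queue.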
utility functions like  synchronise would be nice. but they would 
be something you call manually.   I find it's always better when the 
code actually implements the paradigm, rather than trying to hide 
it.   a bit like Apple's GCD.  you explicitly create little tasks 
which you launch and wait for completion.  it's simple, obvious and 
highly scalable.
BrianH
21-Jul-2010
[1249x3]
I don't like the thread model, but I think that it is the one we 
have. We should have fast support for interprocess communication, 
but it should also seamlessly work for intertask communication. Fortunately 
most of our standard datastructures are implemented through the datatype 
model, which means that their implementations can more easily be 
switched to lockless methods, or at least reduce locking.
The advantage to the thread-like tasking model is that the built-in 
functions and data structures can be made copy-on-write, or copy-on-UNPROTECT. 
Protected series can be sharable.
A model with some shared data can be more memory-friendly on small 
devices too. While the native code can just use shared pages, the 
mezzanine code needs to be per-process. And from a memory standpoint, 
mezzanine code takes up a lot more room than native code. Shared-nothing 
would kill R3 on portable devices.
Maxim
21-Jul-2010
[1252]
native is shared, but not the data which is being fed into it.
BrianH
21-Jul-2010
[1253]
Most of REBOL's memory usage is that data. The mezzanine code is 
that data. For that matter, I am being asked to implement delayed 
modules so that non-native code can be shared in its source form 
for as long as possible, delaying loading it into regular REBOL data 
unless absolutely necessary.
Maxim
21-Jul-2010
[1254]
but the loaded function is now immutable in R3, so it can be shared, 
as long as there are no "static" or global variables in the code. 
as opposed to a string being hacked away within that function.
BrianH
21-Jul-2010
[1255]
Only if the series and other data structures in its code block are 
marked copy-on-write.
Maxim
21-Jul-2010
[1256]
yep. that's what I meant by static variables.
BrianH
21-Jul-2010
[1257]
Cool, this sounds doable, with some tweaks we haven't thought of, 
of course. Getting back to the new module importer now.
Maxim
21-Jul-2010
[1258x2]
it would also be nice to have something like GCD in the thread API. 
 something specifically designed for short-running bursts.  something 
like....

draw this image, read this web page, etc. this way, an app could 
divide a process into parallel tasks, linked in a port or something 
so that you just need to wait on completion of some or all tasks.
break up rendering of a sequence into several threads and instead 
of messaging long running threads and having to build a complex application... 
you just do something like

render-sequence: func [data frames] [ 
	repeat frame frames [
		dispatch 'pic-renders img: image-render data frame
	]  
	wait pic-renders 
	dispatch 'encode-movie do-encode %movie.mp4 img
]


the thing here is that the dispatch setup is external to the source 
code using bursts, and could be configured to use more or less threads.

you've got two independent bursts with a dependency
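
For comparison, a minimal runnable sketch of the same two bursts in Python, using the standard `concurrent.futures` thread pools. The `BURST_CONFIG` table, `dispatch`, `image_render` and `do_encode` are hypothetical stand-ins named after the pseudocode above; the point is only that the thread counts live outside the code that dispatches work.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical burst configuration: label -> number of worker threads.
BURST_CONFIG = {"pic-renders": 4, "encode-movie": 1}
_pools = {label: ThreadPoolExecutor(max_workers=n)
          for label, n in BURST_CONFIG.items()}

def dispatch(label, fn, *args):
    """Queue fn(*args) on the pool configured for this label; returns a Future."""
    return _pools[label].submit(fn, *args)

def image_render(data, frame):
    # Stand-in for real rendering work.
    return f"{data}-frame-{frame}"

def do_encode(path, images):
    # Stand-in for a real encoder.
    return (path, len(images))

def render_sequence(data, frames):
    renders = [dispatch("pic-renders", image_render, data, frame)
               for frame in range(1, frames + 1)]
    wait(renders)                      # block until every frame is rendered
    images = [r.result() for r in renders]
    # The encode burst is dispatched but not waited on here.
    return dispatch("encode-movie", do_encode, "movie.mp4", images)

encode_future = render_sequence("shot1", 8)
```

Changing `BURST_CONFIG` changes how many threads each label may use without touching `render_sequence`.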
Pekr
21-Jul-2010
[1260x2]
As for callbacks - I surely don't understand the issue properly, 
but overall - we are a messaging language. We should aim for a message 
passing based kernel, as Amiga was, and as QNX is. Let's use events, 
ports for IPC, when appropriate ...


As for tasks - my understanding is that Carl still plans on threads 
internally. Didn't BrianH say yesterday that in modern/future multicore 
architectures OS tasks are a better way to go?
Carl - if you do tasks, we might get Cheyenne ported to r3 :-)
Maxim
21-Jul-2010
[1262]
note here that we DON'T wait for the second dispatch to finish, we 
just make sure it's got all the frames before  dispatching it.


then you could have a burst config which allocates a different number 
of threads based on dispatch labels... maybe there are 4 rendering 
threads and only one encoding thread.

but you could dispatch that  function  too...

render-scene: func [shots][
	foreach shot shots [
		dispatch render-sequence shot/data shot/length
	]
]


so here you just send the whole scene to render, and it would only 
use the allocated number of 'pic-render threads allowed across all 
shots.  :-)
Pekr
21-Jul-2010
[1263]
Max - we should "copy" Amiga/QNX, and not Python - simply put - 
REBOL uses its own OS-like advanced mechanisms - ports, devices. 
Codecs are turning back to ports too. And as for tasks and IPC, we 
should not just wrap the OS, but use the REBOL way of doing things once 
again ....
Maxim
21-Jul-2010
[1264x3]
with a few polling functions it could be really nice:


completed-bursts 'pic-renders  ; reports overall progress of pic-renders for all shots.

cancel-bursts 'pic-renders  ; stops all pic-renders and bursts which depend on them.

interrupt-bursts 'pic-renders  ; puts bursts (and all dependencies) into a wait list, to be restarted later.
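
A small Python sketch of what the first two polling calls could look like over a thread pool. `BurstGroup` and its methods are hypothetical names; it does not attempt interrupt-bursts, and cancelling here only affects bursts that have not started yet.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class BurstGroup:
    """Hypothetical tracker for all bursts dispatched under one label."""

    def __init__(self, workers):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._futures = []

    def dispatch(self, fn, *args):
        future = self._pool.submit(fn, *args)
        self._futures.append(future)
        return future

    def completed(self):
        """Return (finished, total), similar in spirit to completed-bursts."""
        done = sum(1 for f in self._futures if f.done() and not f.cancelled())
        return done, len(self._futures)

    def cancel(self):
        """Cancel queued bursts; returns how many were cancelled."""
        return sum(1 for f in self._futures if f.cancel())

started, gate = threading.Event(), threading.Event()

def slow_render():
    started.set()      # signal that this burst is now running
    gate.wait()        # hold the single worker until released
    return "frame-1"

group = BurstGroup(workers=1)
first = group.dispatch(slow_render)
for n in range(2, 5):
    group.dispatch(lambda n=n: f"frame-{n}")

started.wait()                 # make sure the first burst is running
cancelled = group.cancel()     # the three queued bursts are cancelled
gate.set()
first.result()
progress = group.completed()
```

With one worker, the running burst cannot be cancelled and finishes normally, while the three queued ones never start.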
this is NOT the OS way, it's an API over a thread model which doesn't require 
the programmer to know about tasks and deal with messaging, IPC, bla 
bla.


in the vast majority of cases, threads don't need to be persistent, 
and are in fact a burden to have to manage.
using OS threads is the only way we can use multiple hardware threads 
in the same application.  otherwise, they will be completely different 
processes, and that makes a BIG difference in memory consumption, 
especially on GUI OSes which open up a lot of libs on startup.
Pekr
21-Jul-2010
[1267]
Max - I was not talking about your proposal, that sounds OK. I was 
just referring to your Python example, where it seems to be just 
a simple OS wrapper ....
Gregg
21-Jul-2010
[1268]
What we probably should do is designate a PORT type for intertask 
messaging, and make it easy to use.

Yes please. :-)
Graham
21-Jul-2010
[1269]
Max, where is this parser for C header files that you're releasing/released 
today?
Maxim
21-Jul-2010
[1270]
I'm hip deep in it right now...  I'm implementing the last "feature" 
which is the ability to format command arguments differently than 
the original C function parameters.


this will allow templating for extensions, just like in C++, and 
will also allow us to put literals in the spec, so that one doesn't 
need to provide ALL parameters from the REBOL side.
Graham
21-Jul-2010
[1271]
hip deep = 50% done?
Maxim
21-Jul-2010
[1272]
this C header file:

//---------------------------------
// r3t_integer_add
//
// test: print [ r3t-integer-add 1 0 " > expecting: " 1]
// test: print [ r3t-integer-add 2 2 " > expecting: " 4]
// test: print [ r3t-integer-add 2 3 " > expecting: " 5]
// test: print [ r3t-integer-add 0 0 " > expecting: " 0]
// command-format: [object!]
extern int r3t_integer_add(int a, int b);


will tell the tool to provide an object interface to the function 
rather than to expect two integers.
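
To make the directive idea concrete, here is a hedged Python sketch of how such comment directives might be collected. It is not the actual tool (which has not been released); the regular expressions, the `parse_header` function, the output dictionary and the underscore-to-hyphen naming rule are all assumptions based only on the header shown above.

```python
import re

HEADER = '''\
//---------------------------------
// r3t_integer_add
//
// test: print [ r3t-integer-add 1 0 " > expecting: " 1]
// test: print [ r3t-integer-add 2 2 " > expecting: " 4]
// command-format: [object!]
extern int r3t_integer_add(int a, int b);
'''

# Assumed directive syntax: "// name: value" inside line comments.
DIRECTIVE = re.compile(r"^//\s*(test|command-format):\s*(.*)$")
PROTOTYPE = re.compile(r"^extern\s+(\w+)\s+(\w+)\s*\(([^)]*)\)\s*;?\s*$")

def parse_header(text):
    """Collect directives from // comments and attach them to the next prototype."""
    commands = []
    pending = {"test": [], "command-format": []}
    for line in text.splitlines():
        line = line.strip()
        directive = DIRECTIVE.match(line)
        if directive:
            pending[directive.group(1)].append(directive.group(2).strip())
            continue
        proto = PROTOTYPE.match(line)
        if proto:
            return_type, name, params = proto.groups()
            formats = pending["command-format"]
            commands.append({
                "c_name": name,
                "rebol_name": name.replace("_", "-"),  # assumed naming rule
                "return_type": return_type,
                "params": [p.strip() for p in params.split(",") if p.strip()],
                "tests": pending["test"],
                "format": formats[-1] if formats else None,
            })
            pending = {"test": [], "command-format": []}
    return commands

commands = parse_header(HEADER)
```

A generator could then use `format` to decide whether the command takes the original two integers or a single object, and emit the `tests` lines as a self-check script.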
Graham
21-Jul-2010
[1273]
or do you have really long legs?
Maxim
21-Jul-2010
[1274x4]
the engine, without this feature, was working 100%.  but it's now 
undergoing open-heart surgery so it's not currently working... no 
point in releasing just right now.
there is also a lot of work to do in order to provide support for 
more datatypes, but that can be done concurrently by a few of us, 
once the tool is public.
I didn't release the other version, because this formatting changes 
the generator algorithm, so that any work people would do to extend 
the tool would have to be re-coded.
I also want to add A LOT of comments to make the source more literate. 
 in fact I expect the source to contain more comment bytes than code.