
World: r3wp

[Core] Discuss core issues

eFishAnt
25-Apr-2006
[4067]
(also, you might find ways to structure your data in REBOL which 
reduce the number of blocks... like making them into structured objects 
... just a wild guess, maybe that would compact it more?)
Maxim
25-Apr-2006
[4068x3]
objects take way more memory... hehe, have you ever even tried to allocate 
1 million objects which have 40 methods?  REBOL won't even be able 
to allocate 20 thousand of them.
I had to use shared methods (like face/feel)
and can still allocate 20000 nodes a second on my trusty 1.5GHz laptop.
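(a minimal sketch of the shared-methods idea - node-feel and make-node 
are made-up names here, not liquid's actual code:)

; one object holds the methods; every node just keeps a reference
; to it, the same way faces share a single feel object
node-feel: context [
	report: func [node] [print ["node holds:" node/value]]
]
make-node: func [val] [
	context [
		value: val
		feel: node-feel  ; shared reference, not a per-node copy
	]
]
n: make-node 42
n/feel/report n  ; == node holds: 42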
eFishAnt
25-Apr-2006
[4071]
does an array help?
Maxim
25-Apr-2006
[4072x2]
do you mean the array function?
cause it allocates blocks ;-)
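(to illustrate - it just builds a pre-filled block:)

>> array 5
== [none none none none none]
>> array/initial 3 0
== [0 0 0]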
eFishAnt
25-Apr-2006
[4074]
seems like some improvements could be made by storing more things 
in each block.
Maxim
25-Apr-2006
[4075]
each node is an atom.
eFishAnt
25-Apr-2006
[4076]
but it seems it would be a speed-size tradeoff.
Maxim
25-Apr-2006
[4077x3]
and needs its own block to determine its dependencies.
note that I am allocating 1 million objects... we are talking enterprise 
solutions here.
with more RAM I can go to 10 million... that's without data...
eFishAnt
25-Apr-2006
[4080]
the fallacy of db-centric programming ... is having data centralized, 
rather than distributed.  If you split it apart into separate messaging 
agents, then you might be able to reduce the bottlenecks created 
by having the data centralized.
Maxim
25-Apr-2006
[4081x5]
with bedrock, each cell is its own individual atom. there are no 
tables.
each cell is a store of associations this cell has with other things 
in the db.  being bi-directional nodes, I can create sets or groups 
of nodes, and nodes can backtrack them to query related information.
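(a purely illustrative sketch of such bi-directional cells - invented 
names, not bedrock's actual implementation:)

make-cell: func [val] [
	context [value: val links: copy []]
]
associate: func [a b] [
	append a/links b
	append b/links a  ; both ends record the association
]
apple: make-cell "apple"
fruit: make-cell "fruit"
associate apple fruit
f: first apple/links
print f/value  ; == fruit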
so in theory, when I add offloading of nodes (to disk or remote computers) 
within liquid, we'll be able to scale the db infinitely.
well, limited by 64 bits of addressing at a time.
(that's if R3 has a 64-bit mode)
eFishAnt
25-Apr-2006
[4086x2]
bits or bytes?
oh, is that what you are worried about?  I thought you were saying 
an empty block takes 64 bytes.
Maxim
25-Apr-2006
[4088x6]
two different issues.
64-bit-wide addressing, block lengths, etc. would allow liquid to 
scale more easily to petabyte-sized management
where you have n distributed/remote machines storing infinitely 
large amounts of tidbits of data, decentralised.
is anyone aware that inserting at the head of a block is EXTREMELY 
slow?
just paste the following (inserting seems to be exponentially slow!) 
do [
	;----- appending -----
	print "^/^/^/==============="
	print "START APPENDING: one million times"
	blk: copy []  ; COPY, so re-running the test doesn't reuse the same literal series
	s: now/precise
	loop 1000000 [
		insert tail blk none ; append
	]
	print ["completed: " difference now/precise s]
	
	;----- inserting -----
	print "^/==============="
	print "START INSERTING: ten thousand times"
	blk: copy []
	s: now/precise
	loop 10000 [
		insert blk none
	]
	print ["completed: " difference now/precise s]
	;----- inserting -----
	print "^/==============="
	print "START INSERTING: twenty thousand times"
	blk: [] 
	s: now/precise
	loop 20000 [
		insert blk none
	]
	print ["completed: " difference now/precise s]
]
shows: 
===============
START APPENDING: one million times
completed:  0:00:00.942
 
===============
START INSERTING: ten thousand times
completed:  0:00:00.47

===============
START INSERTING: twenty thousand times
completed:  0:00:01.863
Gabriele
25-Apr-2006
[4094x2]
insert at the head should be O(n), so inserting n times should be 
O(n^2)
the GC and reallocation may complicate things though. but that happens 
on appending too.
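(A quick check of the O(n^2) claim against the numbers above: each 
head-insert shifts every element already in the block, so n inserts 
move roughly n(n-1)/2 cells in total. For n = 10,000 that is about 
50 million moves; for n = 20,000 about 200 million - four times the 
work, which matches the measured jump from 0.47s to 1.863s.)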
Maxim
25-Apr-2006
[4096]
why is inserting slower than appending?  aren't blocks internally 
represented as linked lists?
BrianH
25-Apr-2006
[4097]
No, that's the list! type. Blocks are arrays internally.
Maxim
25-Apr-2006
[4098x2]
but why are they faster at the tail?
I guess it's because the alloc engine uses pools, and pre-allocates 
data larger than the empty block?
BrianH
25-Apr-2006
[4100]
Blocks are allocated in larger chunks than a single cell. That means 
that appends are usually just writes to a preallocated memory chunk.
Maxim
25-Apr-2006
[4101x2]
hehe
ok, makes sense now.  still, I was surprised that preallocating a block 
of 1 million items was not that much quicker than starting with an 
empty block.
BrianH
25-Apr-2006
[4103]
The way you speed up appends is to force the block to preallocate 
at least as much memory as you think you will need right away by 
specifying that number as the second argument to MAKE.
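For instance, a minimal sketch:

; preallocate room for a million values up front,
; so the series never has to be reallocated mid-loop
blk: make block! 1000000
loop 1000000 [insert tail blk none]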
Maxim
25-Apr-2006
[4104x2]
haha seems I'm reading your mind  ;-)
thanks for confirming this though  :-)
BrianH
25-Apr-2006
[4106x2]
If you really need to insert at the beginning, you can either use 
list! or think of your block in reverse and consider appends to be 
inserts at the beginning - that is the way stacks are typically implemented.
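For example, something like:

stack: copy []
append stack 1
append stack 2
append stack 3          ; "push"
top: last stack         ; peek -> 3
remove back tail stack  ; "pop"
probe stack             ; == [1 2]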
Maxim
25-Apr-2006
[4108]
yess.
BrianH
25-Apr-2006
[4109]
No point in preallocating a list! though, unless it is to make sure 
you will have enough RAM :)
Maxim
25-Apr-2006
[4110]
just for the record, I tried using lists, but in this stress test 
they were quite memory intensive (compared to blocks) and eventually 
the GC got slower at a rate quicker than the speed improvement I 
did notice in the beginning.  so, as always: speed vs flexibility 
vs RAM still applies.
BrianH
25-Apr-2006
[4111]
Well, the links in the list! type are overhead, and the chunks of 
memory taken up by list nodes are smaller than those taken up by 
blocks, leading to greater memory fragmentation. REBOL doesn't have 
a compacting collector.
Maxim
25-Apr-2006
[4112]
I guess it at least impletment bottom and top allocation optimisation, 
based on chunk size?
BrianH
25-Apr-2006
[4113x2]
On the other hand, list nodes are allocated one at a time rather 
than in groups, so if you have a lot of small lists they may take 
less RAM than a lot of small blocks. I don't know how many cells 
are allocated when the runtime (re)allocates a block - I think it 
is powers of two up to multiples of 128.
I may be wrong about that multiple value though - it might be 64.
Maxim
25-Apr-2006
[4115]
by my testing I'd say it's 64, since creating 1 million empty blocks 
takes 64 MB.
BrianH
25-Apr-2006
[4116]
No, that's the element size that causes that. Each block element 
has to hold a 64-bit value (of various types) plus some typing overhead. 
Plus, I would bet that every block has a one-element prefix to store 
the block length. Plus, there is the overhead of your reference 
to the block, which would be a value in another block.
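(Rough arithmetic on that: 64 MB across 1 million blocks is 64 bytes 
per empty block. Assuming a value cell is 16 bytes - a 64-bit payload 
plus typing overhead, which is a guess - a length prefix, some series 
bookkeeping, and the referencing value in the outer block would 
already account for most of those 64 bytes.)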