r3wp [groups: 83 posts: 189283]
World: r3wp

[Core] Discuss core issues

eFishAnt
25-Apr-2006
[4080]
the fallacy of db-centric programming ... is having data centralized, 
rather than distributed.  If you split it apart into separate messaging 
agents, then you might be able to reduce the bottlenecks created 
by having the data centralized.
Maxim
25-Apr-2006
[4081x5]
with bedrock, each cell is its own individual atom. there are no 
tables.
each cell is a store of the associations it has with other things 
in the db.  being bi-directional nodes, I can create sets or groups 
of nodes, and nodes can backtrack them to query related information.
so in theory, when I add offloading of nodes (to disk or remote computers) 
within liquid, we'll be able to scale the db infinitely.
well, limited by 64 bits of addressing at a time.
(that's if R3 has a 64-bit mode)
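A rough sketch of the kind of bi-directional node described above; the 
names cell and link are hypothetical, not the actual bedrock/liquid API:

; hypothetical sketch only - not the real bedrock/liquid code
; each cell holds a value plus bi-directional links to related cells,
; so an association can be followed from either end
cell: func [value][
	make object! [
		data: value
		links: copy []   ; references to associated cells
	]
]
link: func [a b][
	append a/links b
	append b/links a     ; both cells record the association
]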
eFishAnt
25-Apr-2006
[4086x2]
bits or bytes?
oh, is that what you are worried about?  I thought you were saying 
an empty block takes 64 bytes.
Maxim
25-Apr-2006
[4088x6]
two different issues.
64-bit wide addressing/block lengths, etc., would allow liquid to 
scale more easily to petabyte-sized management,
where you have n distributed/remote machines storing infinitely 
large amounts of tidbits of data, decentralised.
is anyone aware that inserting at the head of a block is EXTREMELY 
slow?
just paste the following (inserting seems to be exponentially slow!) 
do [
	;----- appending -----
	print "^/^/^/==============="
	print "START APPENDING: one million times"
	blk: [] 
	s: now/precise
	loop 1000000 [
		insert tail blk none ; append
	]
	print ["completed: " difference now/precise s]
	
	;----- inserting -----
	print "^/==============="
	print "START INSERTING: ten thousand times"
	blk: [] 
	s: now/precise
	loop 10000 [
		insert blk none
	]
	print ["completed: " difference now/precise s]
	;----- inserting -----
	print "^/==============="
	print "START INSERTING: twenty thousand times"
	blk: [] 
	s: now/precise
	loop 20000 [
		insert blk none
	]
	print ["completed: " difference now/precise s]
]
shows: 
===============
START APPENDING: one million times
completed:  0:00:00.942
 
===============
START INSERTING: ten thousand times
completed:  0:00:00.47

===============
START INSERTING: twenty thousand times
completed:  0:00:01.863
Gabriele
25-Apr-2006
[4094x2]
insert at the head should be O(n), so inserting n times should be 
O(n^2)
the GC and reallocation may complicate things though. but that happens 
on appending too.
Maxim
25-Apr-2006
[4096]
why is inserting slower than appending?  aren't blocks internally 
represented as linked lists?
BrianH
25-Apr-2006
[4097]
No, that's the list! type. Blocks are arrays internally.
Maxim
25-Apr-2006
[4098x2]
but why are they faster at the tail?
I guess it's because the alloc engine uses pools, and pre-allocates 
data larger than the empty block?
BrianH
25-Apr-2006
[4100]
Blocks are allocated in larger chunks than a single cell. That means 
that appends are usually just writes to a preallocated memory chunk.
Maxim
25-Apr-2006
[4101x2]
hehe
ok, makes sense now.  still, I was surprised that preallocating a block 
of 1 million items was not that much quicker than starting with an 
empty block.
BrianH
25-Apr-2006
[4103]
The way you speed up appends is to force the block to preallocate 
at least as much memory as you think you will need right away by 
specifying that number as the second argument to MAKE.
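For instance, a minimal sketch of that suggestion, preallocating room 
for the one million appends from the test above:

; preallocate capacity for one million values up front
blk: make block! 1000000
s: now/precise
loop 1000000 [insert tail blk none]
print ["preallocated append completed: " difference now/precise s]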
Maxim
25-Apr-2006
[4104x2]
haha seems I'm reading your mind  ;-)
thanks for confirming this though  :-)
BrianH
25-Apr-2006
[4106]
If you really need to insert at the beginning, you can either use 
list! or think of your block in reverse and treat appends as 
inserts at the beginning - that is the way stacks are typically implemented.
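A small sketch of that stack-style trick: append to the block and treat 
the tail as the logical head of the sequence (the values are just for 
illustration):

; appends play the role of inserts at the head
stack: copy []
append stack 1
append stack 2
append stack 3
print last stack          ; 3 - the most recently "inserted" value
print reverse copy stack  ; 3 2 1 - the sequence read from its logical head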
Maxim
25-Apr-2006
[4108]
yess.
BrianH
25-Apr-2006
[4109]
No point in preallocating a list! though, unless it is to make sure 
you will have enough RAM :)
Maxim
25-Apr-2006
[4110]
just for the record, I tried using lists, but in this stress test 
they were quite memory intensive (compared to blocks) and eventually 
the GC got slower at a rate quicker than the speed improvement I 
noticed in the beginning.  so, as always, the speed vs flexibility 
vs RAM trade-off still applies.
BrianH
25-Apr-2006
[4111]
Well, the links in the list! type are overhead, and the chunks of 
memory taken up by list nodes are smaller than those taken up by 
blocks, leading to greater memory fragmentation. REBOL doesn't have 
a compacting collector.
Maxim
25-Apr-2006
[4112]
I guess it at least implements bottom and top allocation optimisation, 
based on chunk size?
BrianH
25-Apr-2006
[4113x2]
On the other hand, list nodes are allocated one at a time rather 
than in groups, so if you have a lot of small lists they may take 
less ram than a lot of small blocks. I don't know how many cells 
are allocated when the runtime (re)allocates a block - I think it 
is powers of two up to multiples of 128.
I may be wrong about that multiple value though - it might be 64.
Maxim
25-Apr-2006
[4115]
by my testing I'd say it's 64, since creating 1 million empty blocks 
takes 64 MB.
BrianH
25-Apr-2006
[4116]
No, that's the element size that causes that. Each block element 
has to hold a 64-bit value (of various types) plus some typing overhead. 
Plus, I would bet that every block has a one-element prefix to store 
the block length. Plus, there is the overhead of your reference 
to the block, which would be a value in another block.
Maxim
25-Apr-2006
[4117]
right... I'm a bit tired ;-)
BrianH
25-Apr-2006
[4118x3]
What I was talking about earlier was the allocation quanta. Blocks 
are allocated with the assumption that you will want to insert stuff 
into them without having to reallocate them every time. So by that 
powers-of-two-up-to-multiples-of-128 scheme, when you want a block of 
length 10, you get 16. When you want 129, you get 256, and so on. On 
that note, when you want 1000000, you would get 1000064.
A list may be twice as big, but if you want memory predictability 
you can't beat it, because the allocation quantum is 1 node and 
insert/delete is O(1). Indexing and length calculations are O(n) 
though.
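A quick sketch to contrast with the block timings above - inserting at 
the head of a list! should stay roughly linear overall, since each 
insert is O(1) (the timing printed is illustrative, not a measured 
result):

; head inserts into a list! - allocation quantum is one node
lst: make list! []
s: now/precise
loop 20000 [insert lst none]
print ["list! head inserts completed: " difference now/precise s]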
Later...
Maxim
25-Apr-2006
[4121]
ciao, thanks for all that info  :-)
Henrik
26-Apr-2006
[4122]
wouldn't it make sense for SKIP to support hex values? I'm trying 
to locate a specific position in a binary and it's tedious having 
to convert to number! every time.
Volker
26-Apr-2006
[4123]
what do you convert to number?
Gregg
26-Apr-2006
[4124]
What is a hex value in REBOL?
Henrik
26-Apr-2006
[4125]
volker, a specific location in the binary taken from a hex editor
Volker
26-Apr-2006
[4126x2]
but that conversion could be put in a little function by yourself?
I think non-system-coders rarely use hex.
Maxim
26-Apr-2006
[4128]
skip can be redefined to something which supports hex numbers...
Henrik
26-Apr-2006
[4129]
I just think it's too obvious and useful to leave out
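A small helper along the lines Maxim suggests; hex-skip is a hypothetical 
name, and it assumes the offset arrives as an even-length hex string (as 
copied from a hex editor):

; hypothetical wrapper - not a built-in
hex-skip: func [
	series [series!]
	hex [string!] "offset as an even-length hex string, e.g. {1A}"
][
	skip series to integer! debase/base hex 16
]

; usage: position a binary at offset 1Ah (26 decimal)
data: #{000102030405060708090A0B0C0D0E0F101112131415161718191A1B}
probe hex-skip data "1A"   ; == #{1A1B}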