r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[Core] Discuss core issues

Maxim
25-Apr-2006
[4089x5]
64 bit wide addressing/block lengths, etc, would allow liquid to 
scale more easily to petabyte sized management
where you have n number  of distributed/remote machines storing infinitely 
large amounts of tidbits of data, de centralised.
is anyone aware that inserting at the head of a block is EXTREMELY 
slow ?
just paste the following (inserting seems to be exponentially slow!) 
do [
	;----- appending -----
	print "^/^/^/==============="
	print "START APPENDING: one million times"
	blk: [] 
	s: now/precise
	loop 1000000 [
		insert tail blk none ; append
	]
	print ["completed: " difference now/precise s]
	
	;----- inserting -----
	print "^/==============="
	print "START INSERTING: ten thousand times"
	blk: [] 
	s: now/precise
	loop 10000 [
		insert blk none
	]
	print ["completed: " difference now/precise s]
	;----- inserting -----
	print "^/==============="
	print "START INSERTING: twenty thousand times"
	blk: [] 
	s: now/precise
	loop 20000 [
		insert blk none
	]
	print ["completed: " difference now/precise s]
]
shows: 
===============
START APPENDING: one million times
completed:  0:00:00.942
 
===============
START INSERTING: ten thousand times
completed:  0:00:00.47

===============
START INSERTING: twenty thousand times
completed:  0:00:01.863
Gabriele
25-Apr-2006
[4094x2]
insert at the head should be O(n), so inserting n times should be 
O(n^2)
the GC and reallocation may complicate things though. but that happens 
on appending too.
Maxim
25-Apr-2006
[4096]
why is inserting slower than appending?  arent blocks internally 
represented as linked-lists?
BrianH
25-Apr-2006
[4097]
No, that's the list! type. Blocks are arrays internally.
Maxim
25-Apr-2006
[4098x2]
but why are they faster at the tail?
I guess it because the alloc engine uses pools, and pre-allocates 
data larger than the empty block?
BrianH
25-Apr-2006
[4100]
Blocks are allocated in larger chunks that a single cell. That means 
that appends are usually just writes to a preallocated memory chunk.
Maxim
25-Apr-2006
[4101x2]
hehe
ok makes sense now.  still, I was surprise that preallocating a block 
of 1 million items was not that much quicker than starting with an 
empty block.
BrianH
25-Apr-2006
[4103]
The way you speed up appends is to force the block to preallocate 
at least as much memory as you think you will need right away by 
specifying that number as the second argument to MAKE.
Maxim
25-Apr-2006
[4104x2]
haha seems I'm reading your mind  ;-)
thanks for confirming this though  :-)
BrianH
25-Apr-2006
[4106x2]
If you really need to insert at the beginning, you can either use 
list! or think of your block in reverse and consider appends to by 
inserts at the beginning - that is the way stacks are typically implemented.
to by -> to be
Maxim
25-Apr-2006
[4108]
yess.
BrianH
25-Apr-2006
[4109]
No point to preallocating a list! though, unless it is to make sure 
you will have enough ram :)
Maxim
25-Apr-2006
[4110]
just for the record, I tried using lists, but in this stress test, 
they were quite memory intensive (compared to blocks) and eventually, 
the GC got slower at a rate quicker  than the speed improvement I 
did notice in the begining.  sooo as always, speed vs flexibility 
vs RAM still applies as always.
BrianH
25-Apr-2006
[4111]
Well, the links in the list! type are overhead, and the chunks of 
memory taken up by list nodes are smaller than those taken up by 
blocks, leading to greater memory fragmentation. REBOL doesn't have 
a compacting collector.
Maxim
25-Apr-2006
[4112]
I guess it at least impletment bottom and top allocation optimisation, 
based on chunk size?
BrianH
25-Apr-2006
[4113x2]
On the other hand, list nodes are allocated one at a time rather 
than in groups, so if you have a lot of small lists they may take 
less ram than a lot of small blocks. I don't know how many cells 
are allocated when the runtime (re)allocates a block - I think it 
is powers of two up to multiples of 128.
I may be wrong about that multiple value though - it might be 64.
Maxim
25-Apr-2006
[4115]
by my testing I'd say its 64, since creating 1 million empty blocks 
takes 64MBs.
BrianH
25-Apr-2006
[4116]
No, that's the element size that causes that. Each block element 
has to hold a 64bit value (of various types) plus some typing overhead. 
Plus I would bet that every block has a one element prefix to store 
the block lengths. Plus, there is the overhead of your reference 
to the block which would be a value in another block.
Maxim
25-Apr-2006
[4117]
right... I'm a bit tired ;-)
BrianH
25-Apr-2006
[4118x3]
What I was talking about earlier was the allocation quanta. Blocks 
are allocated with the assumption that you will want to insert stuff 
into them without having to reallocate them every time. So by that 
powers of two to multiples of 128, when you want a block length 10, 
you get 16. When you want 129, you get 256, and so on. On that note, 
when you want 1000000, you would get 1000064.
A list may be twice as big, but if you want memory predictability 
you can't beat it because the allocation quanta is 1 node and the 
insert/delete is O(1). Indexing and length calculations are O(n) 
though.
Later...
Maxim
25-Apr-2006
[4121]
ciao, thanks for all that info  :-)
Henrik
26-Apr-2006
[4122]
wouldn't it make sense for SKIP to support hex values? I'm trying 
to locate a specific position in a binary and it's tedious having 
to convert to number! every time.
Volker
26-Apr-2006
[4123]
what do you convert to number?
Gregg
26-Apr-2006
[4124]
What is a hex value in REBOL?
Henrik
26-Apr-2006
[4125]
volker, a specific location in the binary taken from a hex editor
Volker
26-Apr-2006
[4126x2]
but that conversion could be put in a little function by yourself?
i think non-system-coders rarely use hex.
Maxim
26-Apr-2006
[4128]
skip can be redefined to something which supports hex numbers...
Henrik
26-Apr-2006
[4129]
I just think it's too obvious and useful to leave out
Gregg
26-Apr-2006
[4130]
But what is the proper representation of a hex value in REBOL? An 
issue? An ENBASEd string?
Henrik
26-Apr-2006
[4131]
it's an issue
Volker
26-Apr-2006
[4132]
why not write a converter and then
  skip bin &h #c4d9
Gregg
26-Apr-2006
[4133]
Ah, but an issue is really a string, not a number. The TO-HEX function 
makes it look like this is the recommended approach (it's what I 
would say, too), but I think it's a type mismatch (unfortunately).
Henrik
26-Apr-2006
[4134]
I see
Gregg
26-Apr-2006
[4135x3]
That is, I don't think standard functions should take issue! params 
for hex values unless RT says, officially, that it's how you do hex 
values in REBOL.
Is hex notation something we should propose for R3? I'd bet money 
Carl thought about it long and hard in the original design, and may 
even have left some lexical wiggle-room to add it later.
propose = request.
Henrik
26-Apr-2006
[4138]
well, I can understand it if there is no official way to represent 
hex values. I think it should be proposed for R3