World: r3wp
[Core] Discuss core issues
older newer | first last |
Maxim 25-Apr-2006 [4089x5] | 64 bit wide addressing/block lengths, etc, would allow liquid to scale more easily to petabyte sized management |
where you have n number of distributed/remote machines storing infinitely large amounts of tidbits of data, de centralised. | |
is anyone aware that inserting at the head of a block is EXTREMELY slow ? | |
just paste the following (inserting seems to be exponentially slow!) do [ ;----- appending ----- print "^/^/^/===============" print "START APPENDING: one million times" blk: [] s: now/precise loop 1000000 [ insert tail blk none ; append ] print ["completed: " difference now/precise s] ;----- inserting ----- print "^/===============" print "START INSERTING: ten thousand times" blk: [] s: now/precise loop 10000 [ insert blk none ] print ["completed: " difference now/precise s] ;----- inserting ----- print "^/===============" print "START INSERTING: twenty thousand times" blk: [] s: now/precise loop 20000 [ insert blk none ] print ["completed: " difference now/precise s] ] | |
shows: =============== START APPENDING: one million times completed: 0:00:00.942 =============== START INSERTING: ten thousand times completed: 0:00:00.47 =============== START INSERTING: twenty thousand times completed: 0:00:01.863 | |
Gabriele 25-Apr-2006 [4094x2] | insert at the head should be O(n), so inserting n times should be O(n^2) |
the GC and reallocation may complicate things though. but that happens on appending too. | |
Maxim 25-Apr-2006 [4096] | why is inserting slower than appending? arent blocks internally represented as linked-lists? |
BrianH 25-Apr-2006 [4097] | No, that's the list! type. Blocks are arrays internally. |
Maxim 25-Apr-2006 [4098x2] | but why are they faster at the tail? |
I guess it because the alloc engine uses pools, and pre-allocates data larger than the empty block? | |
BrianH 25-Apr-2006 [4100] | Blocks are allocated in larger chunks that a single cell. That means that appends are usually just writes to a preallocated memory chunk. |
Maxim 25-Apr-2006 [4101x2] | hehe |
ok makes sense now. still, I was surprise that preallocating a block of 1 million items was not that much quicker than starting with an empty block. | |
BrianH 25-Apr-2006 [4103] | The way you speed up appends is to force the block to preallocate at least as much memory as you think you will need right away by specifying that number as the second argument to MAKE. |
Maxim 25-Apr-2006 [4104x2] | haha seems I'm reading your mind ;-) |
thanks for confirming this though :-) | |
BrianH 25-Apr-2006 [4106x2] | If you really need to insert at the beginning, you can either use list! or think of your block in reverse and consider appends to by inserts at the beginning - that is the way stacks are typically implemented. |
to by -> to be | |
Maxim 25-Apr-2006 [4108] | yess. |
BrianH 25-Apr-2006 [4109] | No point to preallocating a list! though, unless it is to make sure you will have enough ram :) |
Maxim 25-Apr-2006 [4110] | just for the record, I tried using lists, but in this stress test, they were quite memory intensive (compared to blocks) and eventually, the GC got slower at a rate quicker than the speed improvement I did notice in the begining. sooo as always, speed vs flexibility vs RAM still applies as always. |
BrianH 25-Apr-2006 [4111] | Well, the links in the list! type are overhead, and the chunks of memory taken up by list nodes are smaller than those taken up by blocks, leading to greater memory fragmentation. REBOL doesn't have a compacting collector. |
Maxim 25-Apr-2006 [4112] | I guess it at least impletment bottom and top allocation optimisation, based on chunk size? |
BrianH 25-Apr-2006 [4113x2] | On the other hand, list nodes are allocated one at a time rather than in groups, so if you have a lot of small lists they may take less ram than a lot of small blocks. I don't know how many cells are allocated when the runtime (re)allocates a block - I think it is powers of two up to multiples of 128. |
I may be wrong about that multiple value though - it might be 64. | |
Maxim 25-Apr-2006 [4115] | by my testing I'd say its 64, since creating 1 million empty blocks takes 64MBs. |
BrianH 25-Apr-2006 [4116] | No, that's the element size that causes that. Each block element has to hold a 64bit value (of various types) plus some typing overhead. Plus I would bet that every block has a one element prefix to store the block lengths. Plus, there is the overhead of your reference to the block which would be a value in another block. |
Maxim 25-Apr-2006 [4117] | right... I'm a bit tired ;-) |
BrianH 25-Apr-2006 [4118x3] | What I was talking about earlier was the allocation quanta. Blocks are allocated with the assumption that you will want to insert stuff into them without having to reallocate them every time. So by that powers of two to multiples of 128, when you want a block length 10, you get 16. When you want 129, you get 256, and so on. On that note, when you want 1000000, you would get 1000064. |
A list may be twice as big, but if you want memory predictability you can't beat it because the allocation quanta is 1 node and the insert/delete is O(1). Indexing and length calculations are O(n) though. | |
Later... | |
Maxim 25-Apr-2006 [4121] | ciao, thanks for all that info :-) |
Henrik 26-Apr-2006 [4122] | wouldn't it make sense for SKIP to support hex values? I'm trying to locate a specific position in a binary and it's tedious having to convert to number! every time. |
Volker 26-Apr-2006 [4123] | what do you convert to number? |
Gregg 26-Apr-2006 [4124] | What is a hex value in REBOL? |
Henrik 26-Apr-2006 [4125] | volker, a specific location in the binary taken from a hex editor |
Volker 26-Apr-2006 [4126x2] | but that conversion could be put in a little function by yourself? |
i think non-system-coders rarely use hex. | |
Maxim 26-Apr-2006 [4128] | skip can be redefined to something which supports hex numbers... |
Henrik 26-Apr-2006 [4129] | I just think it's too obvious and useful to leave out |
Gregg 26-Apr-2006 [4130] | But what is the proper representation of a hex value in REBOL? An issue? An ENBASEd string? |
Henrik 26-Apr-2006 [4131] | it's an issue |
Volker 26-Apr-2006 [4132] | why not write a converter and then skip bin &h #c4d9 |
Gregg 26-Apr-2006 [4133] | Ah, but an issue is really a string, not a number. The TO-HEX function makes it look like this is the recommended approach (it's what I would say, too), but I think it's a type mismatch (unfortunately). |
Henrik 26-Apr-2006 [4134] | I see |
Gregg 26-Apr-2006 [4135x3] | That is, I don't think standard functions should take issue! params for hex values unless RT says, officially, that it's how you do hex values in REBOL. |
Is hex notation something we should propose for R3? I'd bet money Carl thought about it long and hard in the original design, and may even have left some lexical wiggle-room to add it later. | |
propose = request. | |
Henrik 26-Apr-2006 [4138] | well, I can understand it if there is no official way to represent hex values. I think it should be proposed for R3 |
older newer | first last |