r3wp [groups: 83 posts: 189283]

World: r3wp

[Core] Discuss core issues

Terry
20-May-2007
[8057x2]
so.. is it fair to conclude that integers take up the same amount 
of memory as strings?
>> a: 1
== 1
>> to-binary a
== #{31}
>> a
== 1
>> a: "1"
== "1"
>> to-binary a
== #{31}
>> a
== "1"
Sunanda
20-May-2007
[8059]
It may be fairer and more accurate to say REBOL has some subtle optimizations 
in its memory allocation.

Watch, for example, the memory drop here.....you'll see that as the 
block 'x grows, REBOL allocates space in anticipation, so a single 
insert into 'x does not always change the stats.
x: copy []
st: system/stats
loop 500 [
    print [(st - system/stats) length? x]
    append x length? x
    recycle
]
Terry
20-May-2007
[8060x3]
>> r: does [recycle set 'a [#{7FFFFFFF}] system/stats]
>> r
== 4559204

>> r: does [recycle set 'a [6442450943] system/stats]
>> r
== 4562852
the first is the binary version of the second integer..  safe to 
say that a binary uses more mem than the integer?
I'm trying to determine the most memory-efficient method of storing 
a large block of numbers (can be integers, bits.. i don't care, as 
long as i can convert it to an integer in the end)

Also considering the best storage for finding a particular binary 
or integer in the block.
Anton
20-May-2007
[8063]
A binary does use more than an integer, but the above doesn't prove 
it. You're only checking one value. As Sunanda wrote, REBOL's memory 
allocations are not obvious. It uses pools of memory, which allows 
reuse of memory.
Terry
20-May-2007
[8064]
why would a binary use more mem.. shouldn't it be the other way around?
Sunanda
20-May-2007
[8065x2]
Gabriele has explained it previously, something like this:
Values live in "value slots". Words point to value slots.

Some values live entirely in their value slot -- chars, integers: 
ie the short ones with a determinate maximum length.

Other values live in memory pointed to by the value slot -- such 
as strings.
There is memory allocation optimisation both for value slots (as 
seen in the growing block of integers example above), and elsewhere.

So a single allocation is not enough to derive the underlying algorithm.
Anton
20-May-2007
[8067]
And a binary is just a type of string, so yes, the value slot contains 
a pointer to the actual data.
Sunanda
20-May-2007
[8068]
Value slots are always 16 bytes long in current REBOL versions (says 
Gabriele).

It seems reasonable to assume this will increase if REBOL goes 64-bit 
(speculates Sunanda)
Terry
20-May-2007
[8069]
so then a block of integers  ie: [1234432   345   45345   5435   
2345  5435353]  .. .
is the most efficient way to store?
Anton
20-May-2007
[8070]
Storing a bunch of integers in a binary should be more efficient: 
only one value slot is used, and each integer takes only 32 bits.
Terry
20-May-2007
[8071]
I guess that's my point.. i need to use 32 bits to store a single 
integer??
Anton
20-May-2007
[8072x2]
What precision do you need in your integers ?
You could make 8-bit or 16-bit, or ... 13-bit integers if you wanted. 
It's more work, but possible.
Terry
20-May-2007
[8074]
i can use bits.. where 00= 0   01 = 1  10 = 2 .. 11 = 3  etc.
Anton
20-May-2007
[8075]
Yes, how many bits per integer are needed ? What's the highest number 
? Any negative numbers allowed ?
Terry
20-May-2007
[8076]
( 0 = 0 rather)
Anton
20-May-2007
[8077]
(ie. how many unique integer values are needed ?)
Terry
20-May-2007
[8078]
no negatives..  and a MAX of 32 bits is more than enough for the 
largest number
Sunanda
20-May-2007
[8079]
A value slot is 16 bytes, so  a single integer takes 16 bytes -- 
it has to live in a value slot.
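A rough back-of-the-envelope comparison of the two layouts discussed so far, sketched in Python rather than REBOL purely for illustration. It assumes the 16-byte value slot quoted above and ignores allocator overhead and memory pooling, so the figures are estimates, not measurements. The names SLOT_BYTES, block_of_integers_bytes and packed_binary_bytes are hypothetical.

```python
SLOT_BYTES = 16  # value-slot size quoted in the discussion (assumed, not measured)


def block_of_integers_bytes(count):
    """Approximate size of a block of `count` integers: one value slot each."""
    return count * SLOT_BYTES


def packed_binary_bytes(count, bytes_per_int=4):
    """Approximate size of one binary holding `count` fixed-width integers:
    a single value slot plus the packed data itself."""
    return SLOT_BYTES + count * bytes_per_int


n = 1_000_000
print(block_of_integers_bytes(n))  # 16000000
print(packed_binary_bytes(n))      # 4000016
```

Under these assumptions the packed form is roughly a quarter of the size, which is the trade-off being weighed here: less memory in exchange for writing your own access code.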
Terry
20-May-2007
[8080x2]
16 bytes.. that seems large
I guess that's where the number of words limitation kicks in?
Anton
20-May-2007
[8082]
No, the slot size is fixed so it can fit a whole lot of different 
types in it, eg. time! values.
Sunanda
20-May-2007
[8083]
word limit is something else -- to do with the number of unique (across 
all contexts) words. Not related to value slots as far as I know.
Terry
20-May-2007
[8084]
I'm thinking I should do this in Assembly  ;)
Anton
20-May-2007
[8085]
Well, what is your largest number ?
Terry
20-May-2007
[8086]
ahh .. ok..  that's a fairly heavy price to pay (relatively speaking) 
just so I can store non binary data types
Anton
20-May-2007
[8087]
That dictates the number of bits needed to represent all your numbers.
Sunanda
20-May-2007
[8088]
For compact representation of large integer sets, I often use the 
format:
    [10x4 50 76x1000] 
i.e. REBOL pair! values (each taking 1 value slot) meaning
     [10 11 12 13  50  76 77 78 ..... 1075]
This works for me for large blocks of semi-sparse integers.

There is a script for handling this format at REBOL.org -- look for 
rse-ids.r
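A minimal sketch of how such a compact list could be expanded, written in Python for illustration only. It is not the rse-ids.r implementation. It assumes each pair means start x count, as the 10x4 example suggests, and models a pair as a (start, count) tuple; the function name expand is hypothetical.

```python
def expand(compact):
    """Expand a list whose items are either a plain integer or a
    (start, count) tuple standing for `count` consecutive integers."""
    out = []
    for item in compact:
        if isinstance(item, tuple):
            start, count = item
            out.extend(range(start, start + count))
        else:
            out.append(item)
    return out


print(expand([(10, 4), 50, (76, 3)]))  # [10, 11, 12, 13, 50, 76, 77, 78]
```

The saving depends entirely on how many consecutive runs the data contains: a run of any length costs one item, while an isolated integer still costs one item.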
Terry
20-May-2007
[8089]
I'm guessing my largest number is probably around 2 million... approx
Anton
20-May-2007
[8090]
You have to know what your largest number is. Otherwise your software 
will break with overflow errors.
Terry
20-May-2007
[8091]
well..  say 21 bits
Anton
20-May-2007
[8092x3]
Well, 24 bits (3 bytes) --> 2^24 = 16777216 unique values.
Ok, so you could pack each 21 (or 24) bits into a binary.
No wastage.
Terry
20-May-2007
[8095]
so is that a block of binaries then?
Anton
20-May-2007
[8096]
No, a single binary.
Terry
20-May-2007
[8097]
yeah for each... but i need to group them
Anton
20-May-2007
[8098x3]
You have to write the access methods to index into your binary though.
(So for that I recommend just using 3 bytes.)
Ok, well you can have several binaries, why not ?
Sunanda
20-May-2007
[8101x2]
Anton, I think, is suggesting you do something like this:
my-list: make binary! 25000

Then use insert or append to store 3-byte values to the binary string 
you have created.

That way you need only 1 value slot plus the length of the binary 
string.
oops -- Anton typed faster than me :-)
Terry
20-May-2007
[8103]
ahh ok..    ie: [a4b   4°¢  ÑAg]   etc ?
Sunanda
20-May-2007
[8104]
That would be a block of binaries... each binary would take a slot.
Anton is suggesting this sort of approach:
x: make binary! 25000
>> loop 18 [insert x to-char random 255]
== #{5E218289FC8B65B86A1C597C232F9E8C79}
Just one binary string to hold the data.
Terry
20-May-2007
[8105x2]
is there a function to convert an integer to 3 bytes?
ahh.. nice
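One way the 3-byte packing and the matching access method might look, sketched in Python for illustration rather than as REBOL code. It assumes unsigned values, big-endian byte order and zero-based indexing; WIDTH, append_int and get_int are hypothetical names, and a bytearray stands in for the single binary.

```python
WIDTH = 3  # bytes per integer; holds values 0 .. 2**24 - 1


def append_int(buf, n):
    """Append `n` to the buffer as WIDTH big-endian bytes."""
    if not 0 <= n < 1 << (8 * WIDTH):
        raise ValueError("value does not fit in 3 bytes")
    buf.extend(n.to_bytes(WIDTH, "big"))


def get_int(buf, index):
    """Read back the integer stored at zero-based position `index`."""
    start = index * WIDTH
    return int.from_bytes(buf[start:start + WIDTH], "big")


store = bytearray()
for value in (0, 345, 2_000_000):
    append_int(store, value)

print(len(store))         # 9
print(get_int(store, 2))  # 2000000
```

The range check is what guards against the overflow problem raised earlier: a value above 16777215 is rejected instead of being silently truncated.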