
World: r3wp

[!REBOL3]

Geomol
20-May-2011
[8757]
REBOL uses garbage collection, but in the case of counters, the same 
things that would decrement a string's counter: if a word stops pointing 
to it, decrement. If it's in a block and the block is destroyed, decrement 
the block's contents.
BrianH
20-May-2011
[8758]
R3 doesn't have anything like R2's system/user. For all we know symbols 
could be garbage collected. In 32-bit R3 though, afaik you will not 
reach the total number of possible words until you have already hit 
the limits of the memory address space first. Does someone have a 
computer with enough RAM to test this? I only have 4 GB.
Andreas
20-May-2011
[8759]
If you write the test, I can certainly run it :)
BrianH
20-May-2011
[8760x2]
for x 0 to-integer #7fffffffffffffff 1 [to-word ajoin ["a" x]]


Then watch the memory usage of the process, and tell us what happens 
and which error is triggered when it fails.
>> to-integer #7fffffffffffffff
== 9223372036854775807

That will be many more words than could be held in memory in a 32bit 
address space.
Geomol
20-May-2011
[8762]
"R3 doesn't have anything like R2's system/user."

I don't think anybody mentioned R2's system/user. Do you mean system/words?
BrianH
20-May-2011
[8763x2]
Sorry, yes, that's what I meant.
Though technically all R3 objects are more like system/words than 
they are like R2's objects.
Geomol
20-May-2011
[8765]
R3 has system/contexts/user, which seems to work like R2's system/words. 
Try

? system/contexts/user

before and after making some random words, e.g. in a block like [some 
random words]
BrianH
20-May-2011
[8766x2]
No, it really doesn't. In R2, all loaded words are added to system/words. 
Only words that are referenced directly in user scripts, or added 
explicitly with INTERN, are added to system/contexts/user. I had 
to add a bit of really careful code to make sure that system/contexts/user 
doesn't get words added to it until absolutely necessary, because 
it is an isolated context from the runtime library system/contexts/lib.
Symbols in R3 are stored in an internal symbols table, a btree(?) 
or some other unknown data structure that you can't reference externally.
Geomol
20-May-2011
[8768]
Is the internal symbol table there to save memory? Maybe like Lua's 
internal structure to hold strings?
BrianH
20-May-2011
[8769x3]
The internal symbol table is there to make symbols work at all. In 
R2, system/words was the symbol table. However, it does save memory 
relative to strings because there are no duplicates, and because 
the symbol data for the words is stored in UTF8 instead of 16-bit 
characters.
They aren't added to any context until you add them explicitly. The 
R3 interpreter would not presume to know what a word would mean to 
you until you tell it what it means, by binding the word to a context 
or by just using the word as data.
bbl
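
A rough Python model of the symbol table BrianH describes above, offered as a sketch only. R3's real structure is internal and unknown (he guesses a btree), so the class, its method names, and the use of a dict are all assumptions. It shows the two properties he mentions: each spelling is stored once, and it is stored as UTF-8 bytes rather than 16-bit characters. REBOL's case-insensitive word matching is left out.

```python
class SymbolTable:
    """Toy intern table (hypothetical): one integer ID per distinct spelling."""

    def __init__(self):
        self._ids = {}    # UTF-8 spelling -> symbol ID
        self._names = []  # symbol ID -> UTF-8 spelling

    def intern(self, spelling: str) -> int:
        """Return the ID for a spelling, adding it only if it is new."""
        key = spelling.encode("utf-8")
        sym = self._ids.get(key)
        if sym is None:
            sym = len(self._names)
            self._ids[key] = sym
            self._names.append(key)
        return sym

    def spelling(self, sym: int) -> str:
        """Look the spelling back up from its ID."""
        return self._names[sym].decode("utf-8")

    def __len__(self) -> int:
        return len(self._names)


table = SymbolTable()
first = table.intern("random")
second = table.intern("random")  # same spelling, so no new entry
other = table.intern("words")

# Bytes needed for one spelling in each encoding (ASCII-only word).
utf8_size = len("random".encode("utf-8"))
utf16_size = len("random".encode("utf-16-le"))
```

In this model a word value only needs to carry the small integer ID, which is where the saving relative to separate string copies comes from.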
Andreas
20-May-2011
[8772x3]
>> repeat i to-integer #7fffffffffffffff [to word! ajoin ["a" i]]
** Internal error: not enough memory
** Where: to repeat
** Near: to word! ajoin ["a" i]
at ~1700MB resident
and it took 2h50m cpu time to get there :)
Geomol
20-May-2011
[8775]
:) Cool test!
BrianH
20-May-2011
[8776]
This code might be a better test:

repeat i to-integer #7fffffffffffffff [if zero? i // 1'000'000 [recycle] to-hex i]

It should have less memory usage overall and if words are recycled 
then it won't run out. I'll run it now.
Geomol
20-May-2011
[8777]
Where are the words coming into the picture?
BrianH
20-May-2011
[8778]
TO-HEX generates an issue!, which is a word type in R3. Yes, you 
can even bind them.
Geomol
20-May-2011
[8779]
Spooky! :)
BrianH
20-May-2011
[8780]
I figure that not creating the temporary string, and running recycle 
every once in a while, might make the memory problems go away. So 
far the R3 process is staying at exactly the same memory, less than 
a gig. I also tossed an assignment in there so I can know which number 
it fails on.
Andreas
20-May-2011
[8781x4]
the jump from ~900M to ~1.2G took ages
then another aeon for 1.2G to 1.4G
and hours for 1.4G to 1.7G and fail
(jfyi)
BrianH
20-May-2011
[8785]
The time on mine won't be comparable because it's only running on 
one of 4 cores, and the others will be mildly occupied.
Andreas
20-May-2011
[8786]
R3 won't utilise more than 1 core either way :)
BrianH
20-May-2011
[8787x2]
Of course :)
Ooo, tasks! But error handling is still broken in tasks, so this 
test won't work :(
Andreas
20-May-2011
[8789x3]
Ah, here's the jump :)
From 1.0G to 1.9G, after 14m cpu time
But it's still running
BrianH
20-May-2011
[8792x2]
Mine just jumped up a hundred megs, still running :)
It failed a lot earlier for me (some time in the last half hour), 
never getting over 1.2GB, and now there isn't enough memory to figure 
out what the number it failed at was. Looks like you run out of memory 
before you run into the word limit, and that the symbol table isn't 
cleaned up by the recycler. Good to know.
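
The conclusion above can be mimicked with a toy Python model, assuming a symbol table that holds a strong reference to every spelling it has ever seen. The names `symbol_table` and `to_word` are hypothetical, and Python's `gc.collect()` only stands in for RECYCLE; nothing here reflects how R3 is actually written.

```python
import gc

symbol_table = {}  # hypothetical: spelling -> integer ID, never pruned


def to_word(spelling):
    """Intern a spelling and return its ID; the table keeps the spelling alive."""
    return symbol_table.setdefault(spelling, len(symbol_table))


# Make many distinct words and throw the results away, as the test loops do.
for i in range(1, 100_001):
    to_word(f"a{i}")

gc.collect()  # the collector cannot free entries the table still references
entries_after_collect = len(symbol_table)
```

Every distinct spelling leaves a permanent entry behind in this model, so memory grows with the number of unique words no matter how often the collector runs.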
onetom
21-May-2011
[8794]
these experiments remind me how painful it was to figure out why 
we were getting those funky out-of-memory messages, saying something 
about PermGen in Java
Geomol
21-May-2011
[8795x2]
"The internal symbol table is there to make symbols work at all."

I don't think I understand this fully. To me, it's like needing 
an internal string table to make strings work. Or an internal integer 
table to make integers work. Why not just have contexts, and not 
put words in any context if the word is just data? And if a word 
goes from being data to holding more meaning (having a value attached), 
then put it in a context.
If you take a book written in Finnish, you see a lot of words, but 
they have no meaning to you. When you close the book, the Finnish 
words shouldn't take up any space in your brain.
Gabriele
21-May-2011
[8797]
Geomol, I think you are confusing design with implementation. The 
implementation can be improved; however, it's a compromise between 
the complexity of it and how common your "millions of words" scenario 
is in practice.
Geomol
21-May-2011
[8798]
Makes sense.
Kaj
21-May-2011
[8799x4]
Putting words in a context is binding. That's very different from the 
symbol table, which you could say "binds" symbol strings to integer IDs
If you can't read Finnish, it means the Finnish symbols are not bound 
to values in contexts in your head. Closing the book and erasing 
the symbol table is equivalent to quitting the REBOL process, which 
does release the memory
I'm sure some Finnish words are also words in Danish, so forgetting 
the symbol table wouldn't help your situation :-)
I'm sure because my name is one of them...
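
Kaj's distinction can be sketched in Python as two separate mappings. This is a toy model: `symbols`, `intern`, and the `Context` class are made-up names for illustration, not REBOL internals.

```python
symbols = {}  # hypothetical symbol table: spelling -> integer ID


def intern(spelling):
    """Give a spelling its integer ID, creating one if needed."""
    return symbols.setdefault(spelling, len(symbols))


class Context:
    """Toy context: maps symbol IDs to values."""

    def __init__(self):
        self.values = {}

    def bind(self, sym, value):
        self.values[sym] = value

    def get(self, sym):
        return self.values[sym]


# Reading the words only creates symbols; nothing is bound yet.
kaj = intern("kaj")
vodka = intern("vodka")

# Binding is a separate, explicit step that attaches a value in a context.
user = Context()
user.bind(kaj, "a name")
```

Both words have an ID in the symbol table, but only one of them has a value in the context, which matches the picture of Finnish words you can see but not understand.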
Pekr
21-May-2011
[8803]
I prefer Finnish Vodka, and Finnish music :-)
Ladislav
21-May-2011
[8804]
John, you are missing some things others know and find obvious. For 
example, do you know the answer to the following question?

    What is the ratio between

        stringn: func [n] [head insert/dup copy "" #"a" n]

        word: to word! stringn 1
        t1: time-block [equal? word word] 0,05
        word: to word! stringn 1000
        t2: time-block [equal? word word] 0,05
        t2 / t1

compared to

        string: stringn 1
        t1: time-block [equal? string string] 0,05
        string: stringn 1000
        t2: time-block [equal? string string] 0,05
        t2 / t1
?
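
The excerpt does not give measured ratios, so the following is only a guess at the point of the question, in a toy Python model: if words are compared by symbol ID, the cost does not depend on the spelling length, while a string comparison may have to look at every character. `string_equal` and `intern` are hypothetical helpers, and REBOL's actual EQUAL? (which also handles case and aliases) may behave differently.

```python
def string_equal(a, b):
    """Compare character by character; also return how many characters were examined."""
    if len(a) != len(b):
        return False, 0
    steps = 0
    for x, y in zip(a, b):
        steps += 1
        if x != y:
            return False, steps
    return True, steps


symbols = {}  # hypothetical symbol table: spelling -> integer ID


def intern(spelling):
    return symbols.setdefault(spelling, len(symbols))


# String comparison work grows with the length of the strings.
short_steps = string_equal("a", "a")[1]
long_steps = string_equal("a" * 1000, "a" * 1000)[1]

# Word comparison is a single integer comparison, whatever the spelling length.
word_short = intern("a")
word_long = intern("a" * 1000)
same_word = intern("a" * 1000) == word_long
```

In this model the string case does 1000 times more work for the longer input, while the word case does the same single comparison both times.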
onetom
21-May-2011
[8805x2]
time-block? u mean delta-time?
(your naming is more intuitive though... it's also 'time in bash...)