World: r3wp
[!REBOL3]
Kaj 20-May-2011 [8747x2] | Yes |
John, you seem to want to break down the basic principles REBOL is built on | |
Geomol 20-May-2011 [8749] | I'm trying to figure out the basic principles to understand better, what REBOL is. What's good design and what isn't. And what the consequences are of different design. |
Maxim 20-May-2011 [8750] | john, you are in error when you say: "exhaust the global context". the number of words in the global context is irrelevant to exhausting the number of usable words in rebol. the reason is that binding is not what reserves words in the master word table. it's anything that creates a new word value, bound or not. here is an example, using your own to block! example: >> repeat i 100000 [to-block join "random-word" i] >> probe length? first system/words == 2616 pump up the number to 500000 (in 2.7.8) and it crashes. IIRC this was as low as 32k in older versions! with each increment of 100000 you will see the rebol process gobble up a few MBs more as it assigns new word-ids to those words. |
Kaj 20-May-2011 [8751x2] | Yes, REBOL is symbolic, so there is an internal table of numeric IDs for every word it ever encountered in the session |
This is an indirection, but different from binding. To abolish that table would mean to keep strings everywhere internally instead of simple numbers. If you want a language to work that way, you should use shell scripting. It's very slow | |
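The symbol table Kaj describes can be sketched roughly as follows. This is a minimal illustration in Python, not REBOL's actual implementation: the class `SymbolTable` and its methods `intern` and `name` are hypothetical names chosen for this example. It shows the two properties discussed above: every distinct spelling gets one numeric ID, and the table only ever grows.

```python
class SymbolTable:
    """Hypothetical sketch: maps each distinct word spelling to one integer ID."""

    def __init__(self):
        self._ids = {}     # spelling -> numeric ID
        self._names = []   # numeric ID -> spelling

    def intern(self, spelling):
        """Return the ID for a spelling, allocating a new one on first sight."""
        sym_id = self._ids.get(spelling)
        if sym_id is None:
            sym_id = len(self._names)
            self._ids[spelling] = sym_id
            self._names.append(spelling)
        return sym_id

    def name(self, sym_id):
        """Look up the spelling behind an ID."""
        return self._names[sym_id]

    def __len__(self):
        return len(self._names)


table = SymbolTable()
first_id = table.intern("print")
second_id = table.intern("print")         # same spelling, same ID
other_id = table.intern("random-word1")   # new spelling, new ID

# Nothing ever removes an entry, so generating fresh spellings grows the table.
for i in range(1000):
    table.intern("generated-" + str(i))
```

Comparing two words then reduces to comparing two integers, which is the speed argument made above; the cost is that entries for words no longer in use stay in the table.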
Maxim 20-May-2011 [8753] | yep |
Geomol 20-May-2011 [8754] | I think there is a third alternative. When we deal with strings, a data structure is made, and we just have a pointer to that. var: "a string" var: none When no one is using the string anymore, its memory can be completely cleaned up. If I do the same with a word: var: 'word var: none I don't see why memory can't be cleaned up just the same. Is it a design flaw, the way it is? |
onetom 20-May-2011 [8755x2] | which suggest using some kind of reference counter for the words, but what would decrement such a reference counter? |
*suggests | |
Geomol 20-May-2011 [8757] | REBOL uses garbage collection, but in the case of counters, the same thing that would decrement a string counter. If a word stops pointing to it, decrement. If it's in a block, and the block is destroyed, decrement the block's contents. |
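The counter idea raised here could look something like the sketch below. It is written in Python purely for illustration and does not describe how REBOL manages symbols; `CountedSymbolTable`, `acquire` and `release` are hypothetical names. Each live reference to a word increments a counter, each dropped reference decrements it, and the entry is removed when the counter reaches zero.

```python
class CountedSymbolTable:
    """Hypothetical sketch: a symbol table whose entries are reference counted."""

    def __init__(self):
        self._ids = {}       # spelling -> numeric ID
        self._entries = {}   # numeric ID -> [spelling, reference count]
        self._next_id = 0

    def acquire(self, spelling):
        """Register one more reference to a spelling and return its ID."""
        sym_id = self._ids.get(spelling)
        if sym_id is None:
            sym_id = self._next_id
            self._next_id += 1
            self._ids[spelling] = sym_id
            self._entries[sym_id] = [spelling, 0]
        self._entries[sym_id][1] += 1
        return sym_id

    def release(self, sym_id):
        """Drop one reference; free the entry when no references remain."""
        entry = self._entries[sym_id]
        entry[1] -= 1
        if entry[1] == 0:
            del self._ids[entry[0]]
            del self._entries[sym_id]

    def __len__(self):
        return len(self._entries)


counted = CountedSymbolTable()
ref_a = counted.acquire("word")   # e.g. var: 'word
ref_b = counted.acquire("word")   # the same word stored in a block
counted.release(ref_a)            # var: none
size_while_block_alive = len(counted)
counted.release(ref_b)            # the block is destroyed
size_after_release = len(counted)
```

The open question from the discussion remains visible in the sketch: something has to call `release` at exactly the right moments, which is the bookkeeping a reference-counting scheme would add to every word value.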
BrianH 20-May-2011 [8758] | R3 doesn't have anything like R2's system/user. For all we know symbols could be garbage collected. In 32-bit R3 though, afaik you will not reach the total number of possible words until you have already hit the limits of the memory address space first. Does someone have a computer with enough RAM to test this? I only have 4 GB. |
Andreas 20-May-2011 [8759] | If you write the test, I can certainly run it :) |
BrianH 20-May-2011 [8760x2] | for x 0 to-integer #7fffffffffffffff 1 [to-word ajoin ["a" x]] Then watch the memory usage of the process, and tell us what happens and which error is triggered when it fails. |
>> to-integer #7fffffffffffffff == 9223372036854775807 That will be many more words than could be held in memory in a 32bit address space. | |
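The claim about the address space can be checked with a back-of-the-envelope calculation, sketched here in Python. The one-byte-per-word figure is an assumption made only to get a generous lower bound; real symbols need more than that, so the actual gap is larger.

```python
# Upper bound of the test loop, taken from the message above.
LOOP_UPPER_BOUND = 0x7FFFFFFFFFFFFFFF

# A 32-bit process can address at most 2**32 bytes.
ADDRESS_SPACE_BYTES = 2 ** 32

# Assumption for illustration only: each word costs a single byte.
MIN_BYTES_PER_WORD = 1

max_words_that_fit = ADDRESS_SPACE_BYTES // MIN_BYTES_PER_WORD
shortfall_factor = LOOP_UPPER_BOUND // max_words_that_fit

print(LOOP_UPPER_BOUND)     # 9223372036854775807
print(max_words_that_fit)   # 4294967296
print(shortfall_factor)     # 2147483647
```

Even under that unrealistically small per-word cost, the loop asks for about two billion times more words than a 32-bit address space could hold, so the test is bound to end in an out-of-memory condition unless symbols get reclaimed.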
Geomol 20-May-2011 [8762] | "R3 doesn't have anything like R2's system/user." I don't think anybody mentioned R2's system/user. Do you mean system/words? |
BrianH 20-May-2011 [8763x2] | Sorry, yes, that's what I meant. |
Though technically all R3 objects are more like system/words than they are like R2's objects. | |
Geomol 20-May-2011 [8765] | R3 has system/contexts/user, which seems to work like R2's system/words. Try ? system/contexts/user before and after making some random words, e.g. in a block like [some random words] |
BrianH 20-May-2011 [8766x2] | No, it really doesn't. All loaded words are added to system/words. Only words that are referenced directly in user scripts, or added explicitly with INTERN, are added to system/contexts/user. I had to add a bit of really careful code to make sure that system/contexts/user doesn't get words added to it until absolutely necessary, because it is an isolated context from the runtime library system/contexts/lib. |
Symbols in R3 are stored in an internal symbols table, a btree(?) or some other unknown data structure that you can't reference externally. | |
Geomol 20-May-2011 [8768] | Is the internal symbol table there to save memory? Maybe like Lua's internal structure to hold strings? |
BrianH 20-May-2011 [8769x3] | The internal symbol table is there to make symbols work at all. In R2, system/words was the symbol table. However, it does save memory relative to strings because there are no duplicates, and because the symbol data for the words is stored in UTF8 instead of 16-bit characters. |
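The two memory effects mentioned here, no duplicates and UTF-8 instead of 16-bit characters, can be illustrated with a small Python calculation. The word list is made up for the example, and the sketch ignores the cost of the numeric ID that each occurrence still has to carry, so it only shows the direction of the saving, not real figures for R3.

```python
# Made-up sample: five occurrences of two distinct ASCII spellings.
words = ["print", "append", "print", "print", "append"]

# Storing every occurrence as its own string of 16-bit characters.
per_occurrence_utf16 = sum(len(w.encode("utf-16-le")) for w in words)

# Storing each distinct spelling once, encoded as UTF-8.
deduplicated_utf8 = sum(len(w.encode("utf-8")) for w in set(words))

print(per_occurrence_utf16)   # 54
print(deduplicated_utf8)      # 11
```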
They aren't added to any context until you add them explicitly. The R3 interpreter would not presume to know what a word would mean to you until you tell it what it means, by binding the word to a context or by just using the word as data. | |
bbl | |
Andreas 20-May-2011 [8772x3] | >> repeat i to-integer #7fffffffffffffff [to word! ajoin ["a" i]] ** Internal error: not enough memory ** Where: to repeat ** Near: to word! ajoin ["a" i] |
at ~1700MB resident | |
and it took 2h50m cpu time to get there :) | |
Geomol 20-May-2011 [8775] | :) Cool test! |
BrianH 20-May-2011 [8776] | This code might be a better test: repeat i to-integer #7fffffffffffffff [if zero? i // 1'000'000 [recycle] to-hex i] It should have less memory usage overall and if words are recycled then it won't run out. I'll run it now. |
Geomol 20-May-2011 [8777] | Where are the words coming into the picture? |
BrianH 20-May-2011 [8778] | TO-HEX generates an issue!, which is a word type in R3. Yes, you can even bind them. |
Geomol 20-May-2011 [8779] | Spooky! :) |
BrianH 20-May-2011 [8780] | I figure that not creating the temporary string, and running recycle every once in a while, might make the memory problems go away. So far the R3 process is staying at exactly the same memory, less than a gig. I also tossed an assignment in there so I can know which number it fails on. |
Andreas 20-May-2011 [8781x4] | the jump from ~900M to ~1.2G took ages |
then another aeon for 1.2G to 1.4G | |
and hours for 1.4G to 1.7G and fail | |
(jfyi) | |
BrianH 20-May-2011 [8785] | The time on mine won't be comparable because it's only running on one of 4 cores, and the others will be mildly occupied. |
Andreas 20-May-2011 [8786] | R3 won't utilise more than 1 core either way :) |
BrianH 20-May-2011 [8787x2] | Of course :) |
Ooo, tasks! But error handling is still broken in tasks, so this test won't work :( | |
Andreas 20-May-2011 [8789x3] | Ah, here's the jump :) |
From 1.0G to 1.9G, after 14m cpu time | |
But it's still running | |
BrianH 20-May-2011 [8792x2] | Mine just jumped up a hundred megs, still running :) |
It failed a lot earlier for me (some time in the last half hour), never getting over 1.2GB, and now there isn't enough memory to figure out what the number it failed at was. Looks like you run out of memory before you run into the word limit, and that the symbol table isn't cleaned up by the recycler. Good to know. | |
onetom 21-May-2011 [8794] | these experiments remind me how painful it was to figure out why we were getting those funky out of memory messages, saying something about PermGen in Java |
Geomol 21-May-2011 [8795x2] | "The internal symbol table is there to make symbols work at all." I don't think I understand this fully. To me, it's like needing an internal string table to make strings work. Or an internal integer table to make integers work. Why not just have contexts, and not put words in any context if the word is just data? And if a word goes from being data to holding more meaning (having a value attached), then put it in a context. |
If you take a book written in Finnish, you see a lot of words, but they have no meaning to you. When you close the book, the Finnish words shouldn't take up any space in your brain. | |