World: r3wp
[!REBOL3]
Geomol 20-May-2011 [8768] | Is the internal symbol table there to save memory? Maybe like Lua's internal structure to hold strings? |
BrianH 20-May-2011 [8769x3] | The internal symbol table is there to make symbols work at all. In R2, system/words was the symbol table. However, it does save memory relative to strings because there are no duplicates, and because the symbol data for the words is stored in UTF8 instead of 16-bit characters. |
They aren't added to any context until you add them explicitly. The R3 interpreter would not presume to know what a word would mean to you until you tell it what it means, by binding the word to a context or by just using the word as data. | |
bbl | |
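The two memory savings BrianH describes (no duplicate spellings, UTF-8 rather than 16-bit characters) can be modelled in a few lines of Python. This is only a sketch of the general technique, not R3's actual data structure; the class name `SymbolTable` and its methods are hypothetical.

```python
class SymbolTable:
    """Toy symbol table: each distinct spelling is stored once, as UTF-8 bytes."""

    def __init__(self):
        self._ids = {}     # UTF-8 spelling -> integer id
        self._names = []   # integer id -> UTF-8 spelling

    def intern(self, spelling: str) -> int:
        """Return the id for a spelling, adding it only if it is new."""
        key = spelling.encode("utf-8")
        sym = self._ids.get(key)
        if sym is None:
            sym = len(self._names)
            self._ids[key] = sym
            self._names.append(key)
        return sym

    def spelling(self, sym: int) -> str:
        return self._names[sym].decode("utf-8")

    def __len__(self) -> int:
        return len(self._names)


table = SymbolTable()
a = table.intern("print")
b = table.intern("print")   # same spelling: no second entry is created
c = table.intern("probe")

# ASCII spellings need half the bytes in UTF-8 compared with 16-bit characters.
utf8_size = len("print".encode("utf-8"))
utf16_size = len("print".encode("utf-16-le"))
```

Interning the same spelling twice returns the same id, so the table only grows with the number of distinct words, which is also why generating millions of unique words (as in the test that follows) makes it grow without bound.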
Andreas 20-May-2011 [8772x3] |
    >> repeat i to-integer #7fffffffffffffff [to word! ajoin ["a" i]]
    ** Internal error: not enough memory
    ** Where: to repeat
    ** Near: to word! ajoin ["a" i] |
at ~1700MB resident | |
and it took 2h50m cpu time to get there :) | |
Geomol 20-May-2011 [8775] | :) Cool test! |
BrianH 20-May-2011 [8776] | This code might be a better test:
    repeat i to-integer #7fffffffffffffff [if zero? i // 1'000'000 [recycle] to-hex i]
It should have less memory usage overall and if words are recycled then it won't run out. I'll run it now. |
Geomol 20-May-2011 [8777] | Where are the words coming into the picture? |
BrianH 20-May-2011 [8778] | TO-HEX generates an issue!, which is a word type in R3. Yes, you can even bind them. |
Geomol 20-May-2011 [8779] | Spooky! :) |
BrianH 20-May-2011 [8780] | I figure that not creating the temporary string, and running recycle every once in a while, might make the memory problems go away. So far the R3 process is staying at exactly the same memory, less than a gig. I also tossed an assignment in there so I can know which number it fails on. |
Andreas 20-May-2011 [8781x4] | the jump from ~900M to ~1.2G took ages |
then another aeon for 1.2G to 1.4G | |
and hours for 1.4G to 1.7G and fail | |
(jfyi) | |
BrianH 20-May-2011 [8785] | The time on mine won't be comparable because it's only running on one of 4 cores, and the others will be mildly occupied. |
Andreas 20-May-2011 [8786] | R3 won't utilise more than 1 core either way :) |
BrianH 20-May-2011 [8787x2] | Of course :) |
Ooo, tasks! But error handling is still broken in tasks, so this test won't work :( | |
Andreas 20-May-2011 [8789x3] | Ah, here's the jump :) |
From 1.0G to 1.9G, after 14m cpu time | |
But it's still running | |
BrianH 20-May-2011 [8792x2] | Mine just jumped up a hundred megs, still running :) |
It failed a lot earlier for me (some time in the last half hour), never getting over 1.2GB, and now there isn't enough memory to figure out what the number it failed at was. Looks like you run out of memory before you run into the word limit, and that the symbol table isn't cleaned up by the recycler. Good to know. | |
onetom 21-May-2011 [8794] | these experiments remind me how painful it was to figure out why we were getting those funky out of memory messages, saying something about PermGen in java |
Geomol 21-May-2011 [8795x2] | "The internal symbol table is there to make symbols work at all." I don't think I understand this fully. To me, it's like needing an internal string table to make strings work. Or an internal integer table to make integers work. Why not just have contexts, and not put words in any context if the word is just data? And if a word goes from being data to holding more meaning (having a value attached), then put it in a context. |
If you take a book written in finnish, you see a lot of words, but they have no meaning to you. When you close the book, the finnish words shouldn't take up any space in your brain. | |
Gabriele 21-May-2011 [8797] | Geomol, I think you are confusing design with implementation. The implementation can be improved; however, it's a compromise between the complexity of it and how common your "millions of words" scenario is in practice. |
Geomol 21-May-2011 [8798] | Makes sense. |
Kaj 21-May-2011 [8799x4] | Putting words in a context is binding. That's very different from the symbol table, which you could say "binds" symbol strings to integer IDs |
If you can't read Finnish, it means the Finnish symbols are not bound to values in contexts in your head. Closing the book and erasing the symbol table is equivalent to quitting the REBOL process, which does release the memory | |
I'm sure some Finnish words are also words in Danish, so forgetting the symbol table wouldn't help your situation :-) | |
I'm sure because my name is one of them... | |
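Kaj's distinction between the symbol table and binding can be sketched as two separate mappings. The Python below is a hypothetical model, not REBOL's implementation: `symbols`, `intern`, and `Word` are names invented for the illustration, and a context is modelled as a plain dict from symbol id to value.

```python
symbols = {}   # spelling -> integer id (the "symbol table")


def intern(spelling):
    """Map a spelling to its integer id, creating the id on first use."""
    return symbols.setdefault(spelling, len(symbols))


class Word:
    """A word is a symbol id plus an optional context (None means plain data)."""

    def __init__(self, spelling, context=None):
        self.sym = intern(spelling)
        self.context = context

    def get(self):
        if self.context is None:
            raise LookupError("word is not bound to a context")
        return self.context[self.sym]


# Two contexts give the same symbol two different meanings.
ctx_a = {intern("kaj"): "a name"}
ctx_b = {intern("kaj"): 42}

unbound = Word("kaj")        # just data: in the symbol table, in no context
w_a = Word("kaj", ctx_a)
w_b = Word("kaj", ctx_b)
```

All three words share one symbol table entry, while the value (if any) comes from whichever context the word is bound to.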
Pekr 21-May-2011 [8803] | I prefer Finnish Vodka, and Finnish music :-) |
Ladislav 21-May-2011 [8804] | John, you are missing some things others know and find obvious. For example, do you know the answer to the following question? What is the ratio between
    stringn: func [n] [head insert/dup copy "" #"a" n]
    word: to word! stringn 1
    t1: time-block [equal? word word] 0,05
    word: to word! stringn 1000
    t2: time-block [equal? word word] 0,05
    t2 / t1
compared to
    string: stringn 1
    t1: time-block [equal? string string] 0,05
    string: stringn 1000
    t2: time-block [equal? string string] 0,05
    t2 / t1
? |
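The point behind the question can be modelled without a timer by counting comparison steps. The sketch below assumes a word is represented by an integer symbol id and a string is compared character by character; both are assumptions for illustration, and the function names are hypothetical. It says nothing about the actual timings in REBOL.

```python
def string_equal(a, b):
    """Compare character by character; return (result, characters examined)."""
    if len(a) != len(b):
        return False, 0
    steps = 0
    for x, y in zip(a, b):
        steps += 1
        if x != y:
            return False, steps
    return True, steps


symbols = {}   # spelling -> integer id


def intern(spelling):
    return symbols.setdefault(spelling, len(symbols))


def symbol_equal(a, b):
    """Comparing two symbol ids is a single integer comparison."""
    return a == b, 1


short, long_ = "a", "a" * 1000

_, s1 = string_equal(short, "a")
_, s2 = string_equal(long_, "a" * 1000)
string_ratio = s2 / s1      # grows with the length of the string

_, w1 = symbol_equal(intern(short), intern(short))
_, w2 = symbol_equal(intern(long_), intern(long_))
word_ratio = w2 / w1        # independent of the length of the spelling
```

In this model the cost of comparing strings scales with their length, while the cost of comparing words does not, because the spelling is only consulted once, when the word is interned.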
onetom 21-May-2011 [8805x2] | time-block? u mean delta-time? |
(your naming is more intuitive though... it's also 'time in bash...) | |
Geomol 21-May-2011 [8807x2] | Thanks, Ladislav. Good examples! The thing I missed was that REBOL has this extra internal data structure to hold symbols (words); I thought just the contexts were used for that. So comparing words (that are not bound to any context) is much faster than comparing strings in REBOL. I see different possibilities, depending on implementation. If a word is replaced by the result of a hash calculation, then two different words might give the same result, right? It's unlikely, but it could happen. That's why map data structures are combined with lists, for when two different keys hash to the same result. The other possibility is that words are replaced by pointers to their entry in the map (hash table). Do you know which of the two REBOL implements? In other words, could two different words be equal in REBOL? As for strings, it's possible to do much the same with hashing. Lua does that. If you have two separate but identical strings (same string content) in Lua, they share the same memory area. Hashing is involved, and I guess comparing strings would be as fast as comparing words if REBOL did the same (unless words are replaced by the result of the hash calculation, but then two different words might be equal). |
After a little testing, it seems words are changed to a hash result, and REBOL gives an error if two different words give the same result:
    >> s: "abcdefghijklmn"
    >> forever [if equal? a: to word! random s b: to word! random s [print [a b]]]
    ** Internal Error: No more global variable space
The error comes really fast. | |
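The "map combined with lists" design Geomol mentions can be sketched as a chained hash table: the hash only selects a bucket, and the full spelling is compared within that bucket, so two different words never receive the same id even when their hashes collide. This is a generic illustration, not a description of what REBOL does; `weak_hash` is a deliberately poor, hypothetical hash chosen so that a collision is easy to produce.

```python
def weak_hash(spelling, buckets):
    """Deliberately poor hash: anagrams always land in the same bucket."""
    return sum(spelling.encode("utf-8")) % buckets


class ChainedSymbolTable:
    """Hash table with chaining: each bucket holds a list of (spelling, id)."""

    def __init__(self, buckets=8):
        self._buckets = [[] for _ in range(buckets)]
        self._count = 0

    def intern(self, spelling):
        chain = self._buckets[weak_hash(spelling, len(self._buckets))]
        for name, sym in chain:
            if name == spelling:      # the full spelling decides, not the hash
                return sym
        sym = self._count
        self._count += 1
        chain.append((spelling, sym))
        return sym


table = ChainedSymbolTable()
ab = table.intern("ab")
ba = table.intern("ba")     # same hash bucket as "ab", but a different id
ab_again = table.intern("ab")
```

With this layout a word can be represented by its id (or a pointer to its entry) rather than by the raw hash value, which is the second of the two possibilities raised above.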
Kaj 21-May-2011 [8809] | Petr, Nightwish? :-) |
Geomol 21-May-2011 [8810x3] | Above test is in R2. |
R3 keeps going and can't be stopped with <Esc>. | |
<Esc> + <Ctrl>-c seem to stop it. | |
Kaj 21-May-2011 [8813x2] | Global variable space should be the special object holding global values in R2. That's not the symbol table, either |
My guess is that it must do more useless binding than R3 | |
Geomol 21-May-2011 [8815] | You're right. to word! puts it in system/words; I need to use to block!, I guess. |
Kaj 21-May-2011 [8816] | Maybe to-lit-word works |
Geomol 21-May-2011 [8817] | In R2:
    s: "abcdefghijklmn"
    forever [if equal? a: to block! random s b: to block! random s [print [a b]]]
    rebol-console: line 2: 5658 Floating point exception ./rebol --noviewtop
A crash. Interesting! :) |