World: r3wp
[!REBOL3]
Pekr 13-Dec-2010 [6549] | Why do you guys need such a large map array? Don't you remember Bill Gates once said that 640KB is enough for everyone? :-) |
Maxim 13-Dec-2010 [6550x3] | this seems to show that pre-allocating a map does indeed reserve all the space required: >> stats == 921576 >> m: make map! 20000000 == make map! [ ] >> stats == 909353688 |
Jerry, you might want to try the above and see if the wait occurs on the 21 millionth item. | |
(20 millionth item + 1) | |
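For scale, the two `stats` readings above imply a rough per-slot cost for the preallocated map. A minimal Python sketch of the arithmetic, assuming `stats` reports bytes in use and that nothing else was allocated between the two readings (the function name is made up for illustration):

```python
def bytes_per_slot(stats_before: int, stats_after: int, slots: int) -> float:
    """Average number of bytes reserved per preallocated map slot."""
    return (stats_after - stats_before) / slots


# Figures copied from the console session above.
per_slot = bytes_per_slot(921_576, 909_353_688, 20_000_000)
print(f"{per_slot:.1f} bytes per slot")  # prints "45.4 bytes per slot"
```

Under those assumptions the 20 million slots take about 866 MiB before a single value is stored, which is already a large share of a 1GB or 2GB machine.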
Andreas 13-Dec-2010 [6553] | I'm with Brian in that most likely the only thing "not optimal" is the amount of RAM in Jerry's system. |
Maxim 13-Dec-2010 [6554x2] | yep! |
If the pre-allocation fixes the setup, then there's no bug in REBOL. | |
Andreas 13-Dec-2010 [6556] | Preallocation won't help with insufficient memory :) |
Pekr 13-Dec-2010 [6557] | moving to SSD disks might help a bit :-) |
BrianH 13-Dec-2010 [6558] | Well, it would, actually. It would still be slow to use, but not as slow. Reallocation takes even more memory. |
Andreas 13-Dec-2010 [6559] | Yes, of course. |
BrianH 13-Dec-2010 [6560] | This is assuming virtual memory. I wouldn't even be able to test on this system; it only has 1GB of RAM. My main system would work though. |
Andreas 13-Dec-2010 [6561x3] | I was hinting at the fact that this probably won't matter much if we are talking about a 512M system :) |
>> dt [m: make map! [] repeat i to-integer 2 ** 24 [poke m i i]] == 0:01:22.254896 ;; 2393MB resident | |
>> dt [m: make map! n: to-integer 2 ** 24 repeat i n [poke m i i]] == 0:00:48.329933 ;; 1026MB resident | |
BrianH 13-Dec-2010 [6564] | I suggested preallocation in a comment. You might want to chime in with that code :) |
Andreas 13-Dec-2010 [6565] | I would probably make it significantly larger than the number of expected pairs. |
BrianH 13-Dec-2010 [6566] | Round up :) |
Andreas 13-Dec-2010 [6567x3] | ;; Storing 2^24 in a map with 2^25 preallocated >> dt [m: make map! to-integer 2 ** 25 repeat i to-integer 2 ** 24 [poke m i i]] == 0:00:33.695578 ;; 1538MB resident |
~30% faster. | |
Still incredibly slow. | |
BrianH 13-Dec-2010 [6570] | Are you doing those measurements in a fresh console each time? Remember, process allocation from the OS matters too. |
Andreas 13-Dec-2010 [6571] | Yes :) |
BrianH 13-Dec-2010 [6572] | Figured :) |
Andreas 13-Dec-2010 [6573x6] | With speedstepping switched off and the REBOL process pinned to a single core. |
And ... | |
What antagonist language of the day do we want to counter-benchmark? | |
Let's go with Python. | |
~10 times faster (without pre-allocation). Well ... | |
Guess there's some room for improvement :) | |
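The Python script behind that comparison is not shown in the log. A plausible equivalent, assuming a plain `dict` filled with the integer pairs 1..n to mirror the REBOL `repeat` loop, might look like the sketch below; `fill_dict` and `time_fill` are hypothetical names, not Andreas's code:

```python
import time


def fill_dict(n: int) -> dict:
    """Insert the pairs (i, i) for i in 1..n, mirroring `repeat i n [poke m i i]`."""
    m = {}
    for i in range(1, n + 1):
        m[i] = i
    return m


def time_fill(n: int) -> float:
    """Return the wall-clock seconds taken to fill a dict with n entries."""
    start = time.perf_counter()
    fill_dict(n)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Kept small so it runs anywhere; raise to 2 ** 24 to match the
    # REBOL runs above (that size is memory-hungry).
    n = 2 ** 20
    print(f"{n} entries in {time_fill(n):.2f} s")
```

As with the REBOL timings, each measurement is best taken in a fresh interpreter so earlier allocations do not skew the result.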
BrianH 13-Dec-2010 [6579] | How fast without the repeat, but with preallocation, comparing Python and R3? Remember, Python is compiled. |
Andreas 13-Dec-2010 [6580] | Bytecode compiled, with a really slow VM. |
BrianH 13-Dec-2010 [6581] | Bytecode compiled is still compiled, and it is likely that bytecodes specific to loops are being used. I am interested in the comparison of the preallocation of the map! type versus the Python equivalent, whatever that is. |
Andreas 13-Dec-2010 [6582] | I don't know of a way to preallocate Python dicts. |
BrianH 13-Dec-2010 [6583] | Python dicts are probably not allocated in a large chunk. |
Andreas 13-Dec-2010 [6584x3] | Well, afair they use open addressing in their implementation, so I guess they will. |
But then, no idea about the impl details. | |
As an aside, path notation (`m/(i): i`) is slightly (6%) faster than POKE in this case. | |
BrianH 13-Dec-2010 [6587x2] | POKE has to look up the function twice: once from the word, the next time in the action's datatype. Path evaluation at least knows what it's doing, so it doesn't have to figure it out. |
the next time in the datatype's action list. | |
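On the open-addressing point raised a few messages up: in any such table, growing means allocating a larger slot array and reinserting every existing entry, which is the cost preallocation avoids. The toy Python sketch below uses linear probing to make that visible. It is not the R3 `map!` or the CPython `dict` implementation; the class name, the 2/3 load factor and the doubling policy are all assumptions chosen for illustration.

```python
class ProbingMap:
    """Toy open-addressing hash map with linear probing (illustration only)."""

    def __init__(self, capacity: int = 8):
        # Each slot is either None (empty) or a (key, value) tuple.
        self._slots = [None] * max(capacity, 8)
        self._count = 0
        self.rehashed = 0  # total entries copied while growing

    def _find(self, key) -> int:
        """Index of the slot holding `key`, or of the first empty slot on its probe path."""
        i = hash(key) % len(self._slots)
        while self._slots[i] is not None and self._slots[i][0] != key:
            i = (i + 1) % len(self._slots)
        return i

    def _grow(self) -> None:
        """Double the slot array and reinsert every existing entry."""
        old = self._slots
        self._slots = [None] * (2 * len(old))
        for entry in old:
            if entry is not None:
                self._slots[self._find(entry[0])] = entry
                self.rehashed += 1

    def __setitem__(self, key, value) -> None:
        # Keep the load factor at or below 2/3 so probing always finds a free slot.
        if (self._count + 1) * 3 > len(self._slots) * 2:
            self._grow()
        i = self._find(key)
        if self._slots[i] is None:
            self._count += 1
        self._slots[i] = (key, value)

    def __getitem__(self, key):
        entry = self._slots[self._find(key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]


def copies_needed(n: int, capacity: int) -> int:
    """Fill a map with n integer keys and report how many entries growth had to copy."""
    m = ProbingMap(capacity)
    for i in range(1, n + 1):
        m[i] = i
    return m.rehashed


print(copies_needed(10_000, 8))       # grown step by step: many entries copied
print(copies_needed(10_000, 20_000))  # preallocated: no growth, so 0 copies
```

During each growth step the old and new slot arrays exist at the same time, which fits the remark above that reallocation takes even more memory than preallocation.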
Jerry 13-Dec-2010 [6589] | There must be some algorithm issue in R3 map!. When I have 21,000,000 key-value pairs in a map, accessing it becomes very slow. Using " mymap/:key " to get a value takes 0.2 sec. |
Andreas 13-Dec-2010 [6590] | Did you read the notes suggesting that you might be running out of physical memory (RAM)? |
Jerry 13-Dec-2010 [6591] | I hope the lack of enough physical memory is the reason. I have 2GB RAM in my PC. I will get my MacBook Pro this evening. It will have 8GB RAM. I will test this on my Mac. |
Andreas 13-Dec-2010 [6592x2] | Well, 2GB will be a close call for 21M entries. |
Have a look at your memory/swap consumption, that'll probably help you identify if that's a problem. | |
Jerry 13-Dec-2010 [6594] | Yeah, I hope we will have 64-bit REBOL for Mac soon. Analyzing social networking data without enough physical memory is a pain. |
BrianH 13-Dec-2010 [6595] | Particularly on OSX (or Windows 7/Vista), since the OS itself uses a lot of RAM. |
Andreas 13-Dec-2010 [6596x3] | Any serious data handling with 32b is a pain :) |
Hm, but yes. There might actually be something seriously off about >>2^24 entries. | |
Initialising a map with 21M entries just took insanely long for me. Investigating. | |