World: r3wp
[!REBOL3]
Pekr 13-Dec-2010 [6557] | moving to SSD disks might help a bit :-) |
BrianH 13-Dec-2010 [6558] | Well, it would, actually. It would still be slow to use, but not as slow. Reallocation takes even more memory. |
Andreas 13-Dec-2010 [6559] | Yes, of course. |
BrianH 13-Dec-2010 [6560] | This is assuming virtual memory. I wouldn't even be able to test on this system; it only has 1GB of RAM. My main system would work though. |
Andreas 13-Dec-2010 [6561x3] | I was hinting at the fact that this probably won't matter much if we are talking about a 512M system :) |
>> dt [m: make map! [] repeat i to-integer 2 ** 24 [poke m i i]] == 0:01:22.254896 ;; 2393MB resident | |
>> dt [m: make map! n: to-integer 2 ** 24 repeat i n [poke m i i]] == 0:00:48.329933 ;; 1026MB resident | |
BrianH 13-Dec-2010 [6564] | I suggested preallocation in a comment. You might want to chime in with that code :) |
Andreas 13-Dec-2010 [6565] | I would probably make it significantly larger than the number of expected pairs. |
BrianH 13-Dec-2010 [6566] | Round up :) |
Andreas 13-Dec-2010 [6567x3] | ;; Storing 2^24 in a map with 2^25 preallocated >> dt [m: make map! to-integer 2 ** 25 repeat i to-integer 2 ** 24 [poke m i i]] == 0:00:33.695578 ;; 1538MB resident |
~30% faster. | |
Still incredibly slow. | |
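Editor's sketch, not from the thread: one plausible reason preallocation helps is that a growing hash table has to rebuild itself repeatedly. The toy table below is written in Python and uses open addressing with linear probing. It is an assumption that R3's map! behaves in a broadly similar way; `ToyMap`, `fill` and the 2/3 load limit are hypothetical names and choices made only for illustration.

```python
class ToyMap:
    """Toy open-addressing hash map with linear probing (illustration only)."""

    def __init__(self, capacity=8):
        # Round the requested capacity up to a power of two (minimum 8).
        size = 8
        while size < capacity:
            size *= 2
        self.slots = [None] * size
        self.used = 0
        self.rehashes = 0  # how many times the whole table was rebuilt

    def _find(self, slots, key):
        # Walk forward from the hashed position until we hit the key or a gap.
        mask = len(slots) - 1
        i = hash(key) & mask
        while slots[i] is not None and slots[i][0] != key:
            i = (i + 1) & mask
        return i

    def _grow(self):
        # Double the table and reinsert every existing entry.
        old = self.slots
        new = [None] * (len(old) * 2)
        for entry in old:
            if entry is not None:
                new[self._find(new, entry[0])] = entry
        self.slots = new
        self.rehashes += 1

    def put(self, key, value):
        # Keep the table at most 2/3 full (an arbitrary limit for this sketch).
        if (self.used + 1) * 3 > len(self.slots) * 2:
            self._grow()
        i = self._find(self.slots, key)
        if self.slots[i] is None:
            self.used += 1
        self.slots[i] = (key, value)

    def get(self, key):
        entry = self.slots[self._find(self.slots, key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]


def fill(n, capacity):
    """Store 1..n in a ToyMap created with the given capacity."""
    m = ToyMap(capacity)
    for i in range(1, n + 1):
        m.put(i, i)
    return m
```

In this toy, `fill(1000, 8).rehashes` counts the rebuilds needed when starting small, while `fill(1000, 2048).rehashes` stays at zero. Asking for exactly 1000 slots still triggers a rebuild because of the load limit, which is in line with the advice above to preallocate well beyond the expected number of pairs.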
BrianH 13-Dec-2010 [6570] | Are you doing those measurements in a fresh console each time? Remember, process allocation from the OS matters too. |
Andreas 13-Dec-2010 [6571] | Yes :) |
BrianH 13-Dec-2010 [6572] | Figured :) |
Andreas 13-Dec-2010 [6573x6] | With speedstepping switched off and the REBOL process pinned to a single core. |
And ... | |
What antagonist language of the day do we want to counter-benchmark? | |
Let's go with Python. | |
~10 times faster (without pre-allocation). Well ... | |
Guess there's some room for improvement :) | |
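Editor's sketch, not from the thread: the Python code behind the "~10 times faster" figure was not posted. A counterpart to the REBOL loop might look roughly like the following; `fill_dict` and `time_fill` are hypothetical names, and the timing method is an assumption.

```python
import time


def fill_dict(n):
    """Insert the integers 1..n as both key and value, like the REBOL loop."""
    d = {}
    for i in range(1, n + 1):
        d[i] = i
    return d


def time_fill(n):
    """Return (number of entries, elapsed seconds) for filling a dict."""
    start = time.perf_counter()
    d = fill_dict(n)
    elapsed = time.perf_counter() - start
    return len(d), elapsed
```

Calling `time_fill(2 ** 24)` would match the size used in the runs above. It needs a large amount of free RAM, and, as with the REBOL measurements, a fresh interpreter per run keeps the numbers comparable.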
BrianH 13-Dec-2010 [6579] | How fast without the repeat, but with preallocation, comparing Python and R3? Remember, Python is compiled. |
Andreas 13-Dec-2010 [6580] | Bytecode compiled, with a really slow VM. |
BrianH 13-Dec-2010 [6581] | Bytecode compiled is still compiled, and it is likely that bytecodes specific to loops are being used. I am interested in the comparison of the preallocation of the map! type versus the Python equivalent, whatever that is. |
Andreas 13-Dec-2010 [6582] | I don't know of a way to preallocate Python dicts. |
BrianH 13-Dec-2010 [6583] | Python dicts are probably not allocated in a large chunk. |
Andreas 13-Dec-2010 [6584x3] | Well, afair they use open addressing in their implementation, so I guess they will. |
But then, no idea about the impl details. | |
As an aside, path notation (`m/(i): i`) is slightly (6%) faster than POKE in this case. | |
BrianH 13-Dec-2010 [6587x2] | POKE has to look up the function twice: Once from the word, the next time in the action's datatype. Path evaluation at least knows what it's doing, so it doesn't have to figure it out. |
the next time in the datatype's action list. | |
Jerry 13-Dec-2010 [6589] | There must be some algorithm issue in R3 map!. When I have 21,000,000 key-value pairs in a map, accessing it becomes very slow. Using `mymap/:key` to get a value takes 0.2 sec. |
Andreas 13-Dec-2010 [6590] | Did you read the notes suggesting that you might be running out of physical memory (RAM)? |
Jerry 13-Dec-2010 [6591] | I hope the lack of enough physical memory is the reason. I have 2GB RAM in my PC. I will get my MacBook Pro this evening. It will have 8GB RAM. I will test this on my Mac. |
Andreas 13-Dec-2010 [6592x2] | Well, 2GB will be a close call for 21M entries. |
Have a look at your memory/swap consumption, that'll probably help you identify if that's a problem. | |
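Editor's sketch, not from the thread: a rough back-of-the-envelope check of the "close call" remark, using only the resident sizes reported earlier for 2^24 entries. It assumes that "MB" means MiB, that memory scales linearly with entry count, and it ignores the interpreter's own baseline, so treat the results as ballpark figures. All names are hypothetical.

```python
MIB = 2 ** 20

# Resident sizes reported earlier in this thread for 2**24 integer pairs.
ENTRIES = 2 ** 24
RESIDENT_MIB = {
    "no preallocation": 2393,
    "preallocated 2**24": 1026,
}


def bytes_per_entry(resident_mib, entries=ENTRIES):
    """Average resident bytes per stored pair (includes process overhead)."""
    return resident_mib * MIB / entries


def projected_mib(resident_mib, target_entries):
    """Scale a measured resident size linearly to another entry count."""
    return bytes_per_entry(resident_mib) * target_entries / MIB


def report(target_entries=21_000_000):
    """Projected resident size in MiB for each measured scenario."""
    return {
        label: round(projected_mib(mib, target_entries))
        for label, mib in RESIDENT_MIB.items()
    }
```

Under these assumptions, `report()` puts 21M entries at roughly 1.3 GiB when the map is preallocated and close to 3 GiB when it grows on demand, so on a 2GB machine swapping is at least plausible.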
Jerry 13-Dec-2010 [6594] | Yeah, I hope we will have 64-bit REBOL for Mac soon. Analyzing social networking data without enough physical memory is a pain. |
BrianH 13-Dec-2010 [6595] | Particularly on OSX (or Windows 7/Vista), since the OS itself uses a lot of RAM. |
Andreas 13-Dec-2010 [6596x3] | Any serious data handling with 32b is a pain :) |
Hm, but yes. There might actually be something seriously off about >>2^24 entries. | |
Initialising a map with 21M entries just took insanely long for me. Investigating. | |
Jerry 13-Dec-2010 [6599] | Andreas, would you post a ticket in CC on this? You probably can describe the issue better than me. |
Andreas 13-Dec-2010 [6600] | If I can pin down an issue, I will. |
BrianH 13-Dec-2010 [6601] | On my system it had to allocate virtual memory for the process from the OS, and swap memory in RAM to the VM so it would have room to allocate the map in the working RAM. It took as long as I would have expected it to take given that circumstance. |
Andreas 13-Dec-2010 [6602] | As soon as you start swapping, all bets are off. |
BrianH 13-Dec-2010 [6603x2] | An empty map! of 22,000,000 entries took nearly 1GB of RAM on its own, and that doesn't include memory for any strings, blocks or structures that you might add to the map after it is allocated. |
I used DP instead of DT, and it gave me all the details. | |
Andreas 13-Dec-2010 [6605x2] | Can't tell you about the nature of your virtual memory, though. |
Well, it technically could, but it doesn't :) | |