r3wp [groups: 83 posts: 189283]

World: r3wp

[Core] Discuss core issues

Graham
19-Apr-2007
[7662]
Ashley's RebDB is an in-memory db
Sunanda
19-Apr-2007
[7663]
That's *unique* words, not total words used

You can have as many aaa's as you like in different contexts; it 
adds only 1 to system/words
Terry
19-Apr-2007
[7664x3]
yeah.. i just need a simple hash table
but i like the  aaa/bbb/ccc  path syntax
a dictionary, for example.. cat ["def of cat"] dog ["def of dog"]
would run out of words real quick
Sunanda
19-Apr-2007
[7667]
But you may need to use strings or some other form of identifier 
-- REBOL words are rationed.
Terry
19-Apr-2007
[7668x2]
yeah
but then you lose the path syntax.
Sunanda
19-Apr-2007
[7670]
Use ordinals, then you can go up to the maximum integer with path syntax:
  database/1/234/4454/655/3

Slight problem there: when you delete things you have to leave empty 
slots or change all numbers.
Terry
19-Apr-2007
[7671]
given they would act as keys.. that would be more than a 'slight' 
problem
Sunanda
19-Apr-2007
[7672]
There's always a design trade-off.

I have various systems where real keys (usually strings) are mapped 
to surrogate keys (usually integers). It's a good compromise in many 
cases.
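
A minimal sketch of the surrogate-key idea, assuming a flat hash! of string keys and integer ids. The names keys and defs and the sample data are made up for illustration, not taken from any real system:

```rebol
REBOL [Title: "Surrogate key sketch (hypothetical names)"]

; real keys (strings) mapped to surrogate keys (integers)
keys: make hash! ["cat" 1 "dog" 2]

; data stored by position, so the integer doubles as an index
defs: ["def of cat" "def of dog"]

id: select keys "cat"    ; look up the surrogate key for a real key
print pick defs id       ; fetch the record by position
print defs/:id           ; same lookup using path syntax
```

The integer keeps the path syntax available, at the cost of one extra lookup per access.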
Terry
19-Apr-2007
[7673]
What is THE best way to store a large hash table in memory, like 
a dictionary, and access the data?
Henrik
19-Apr-2007
[7674]
terry, about accessing data: there is a speed issue when using indexes 
in paths, so data/2 is much slower than second data or pick data 2. 
using those instead should solve at least that aspect.
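
The three forms mentioned above return the same value. A quick sketch with a made-up block (timings are not shown, since they depend on the REBOL version and the machine):

```rebol
REBOL [Title: "Index access sketch"]

data: [10 20 30]   ; sample data, made up for illustration

print data/2       ; path with an index
print second data  ; ordinal function
print pick data 2  ; pick with an explicit index
```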
Sunanda
19-Apr-2007
[7675]
Not sure there is one best way... I have a structure that has 158,000 
keys indexing 120,000 documents. All in memory. 
The index (when loaded) is nearly 6 MB.

The code uses nested tables, hashes, surrogate keys and a couple 
of other tricks devised for the purpose.

It's worth playing with approaches. You can normally expect a tenfold 
improvement between initial ideas and later improvements.
Terry
19-Apr-2007
[7676x5]
yeah, i see that just poking around with some data now..
I take it your keys are strings?
Just thinking that even Carl's tiny web server, loading the www 
content into a hash table, would be a very fast server
Don't even need mod-rewrites.. just intercept the GET request and 
take it from there
I've been working with Apache and PHP quite a bit, and it just feels 
really bloated.
Oldes
19-Apr-2007
[7681]
Is there any reason to write, for example, [ b: make block! 100 ] 
instead of just [ b: copy [] ] ?
Gabriele
19-Apr-2007
[7682]
make block! 100 means that you can insert at least 98 values before 
a reallocation is needed
Sunanda
19-Apr-2007
[7683]
Whereas copy [] reserves some other number (you can find out, more-or-less, 
by close observation of stats)
Gabriele
19-Apr-2007
[7684x3]
(the difference is an implementation quirk)
i mean the diff between 100 and 98
to say it another way, as long as length? b is less than 98, insert 
tail b is guaranteed to be O(1)
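
A minimal sketch of that preallocation, where the size (100) and the number of inserts (90) are just example values:

```rebol
REBOL [Title: "Preallocation sketch"]

b: make block! 100          ; reserve room up front
loop 90 [insert tail b 0]   ; stays below the reserved capacity
print length? b
```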
Oldes
19-Apr-2007
[7687]
it looks like copy is a little faster, at least if I don't know 
how big the block will be, so it's probably better to use copy [] 
(if I don't know the precise size):

>> tm 1000 [b: make block! 20000 loop 5 [insert/dup tail b 0 1000]]
0:00:00.907
>> tm 1000 [b: make block! 200 loop 5 [insert/dup tail b 0 1000]]
0:00:00.453
>> tm 1000 [b: copy [] loop 5 [insert/dup tail b 0 1000]]
0:00:00.437

>> tm 1000 [b: make block! 5002 loop 5 [insert/dup tail b 0 1000]]
0:00:00.343
Geomol
19-Apr-2007
[7688]
Try doing it several times and take the average.
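
Note that tm in the timings above is not a built-in REBOL function. A minimal sketch of what such a helper might look like, with a hypothetical avg-tm wrapper for averaging over several runs (both are assumptions, not the code actually used above):

```rebol
REBOL [Title: "Timing helper sketch (hypothetical)"]

; run code count times and return the elapsed time
tm: func [count [integer!] code [block!] /local start] [
    start: now/precise
    loop count code
    difference now/precise start
]

; repeat the measurement and return the average elapsed time
avg-tm: func [runs [integer!] count [integer!] code [block!] /local total] [
    total: 0:00
    loop runs [total: total + tm count code]
    total / runs
]

print avg-tm 5 1000 [b: copy [] loop 5 [insert/dup tail b 0 1000]]
```

The resolution of now/precise varies by platform, so very short runs may need a larger count to give meaningful numbers.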
Henrik
19-Apr-2007
[7689]
perhaps also make block! is easier to garbage collect?
Graham
20-Apr-2007
[7690]
Is there a REBOL-based wiki that has versioning?
Rebolek
20-Apr-2007
[7691]
This seems at least inconsistent to me:

>> to integer! none
== 0
>> to decimal! none
** Script Error: Invalid argument: none
** Near: to decimal! none
Henrik
20-Apr-2007
[7692]
also happens with empty strings
Gabriele
20-Apr-2007
[7693x2]
rebolek: i believe that is on rambo already. if not, feel free to 
add it.
oldes: i would guess that make block! n also clears the allocated 
memory, so it may be slower than a reallocation if you are using 
something like insert/dup etc. i think that when reallocating, the 
size is always doubled, up to a limit after which it is increased 
linearly.
Rebolek
20-Apr-2007
[7695]
Gabriele, you're right, it's there: Ticket #4162
Maxim
20-Apr-2007
[7696x2]
I have a deep question, when we use clear, does it clear the pre-allocated 
space (shrinking back a series) or does it only set the series termination 
to the specified index?
both are useful, I just wonder which one is done internally?
Gregg
20-Apr-2007
[7698]
I believe it keeps the allocated space, so it's more efficient to 
reuse a series, rather than always starting a new one with copy.
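
A small sketch of that reuse pattern, assuming clear does keep the allocation as described. The names buffer and name and the sample strings are made up:

```rebol
REBOL [Title: "Series reuse sketch"]

buffer: make string! 1000        ; allocate once (size is an example value)

foreach name ["cat" "dog"] [
    clear buffer                 ; empty the string and reuse the same series
    insert tail buffer "def of "
    insert tail buffer name
    print buffer
]
```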
Maxim
20-Apr-2007
[7699]
That's what I believe... I've just always had that tingling "What 
if"
Ladislav
20-Apr-2007
[7700]
Max: you can easily check:
>> a: make string! 1100
== ""
>> string-address? a
== 17267856
>> string-address? clear a
== 17267856
Maxim
20-Apr-2007
[7701]
string-address?  what is that?  Ladislav - you amaze me sometimes 
with your memory and knowledge of ALL rebol funcs... I never even 
saw that arcane magic!
Gregg
20-Apr-2007
[7702]
I think it's based on his peek-addr stuff.
Maxim
20-Apr-2007
[7703]
oh, that is an internal ladislav mastery script
Ladislav
20-Apr-2007
[7704x2]
I mentioned the script recently on the ML: http://www.fm.tul.cz/~ladislav/rebol/peekpoke.r
(but it can be called arcane magic, I guess :-)
Maxim
20-Apr-2007
[7706]
;-)
Ladislav
20-Apr-2007
[7707x2]
>> a: "this is a string"
== "this is a string"
>> adr: string-address? a
== 17496432
>> len: length? a
== 16
>> clear a
== ""
>> as-string memory? adr len
== "^@his is a string"
;-)
Maxim
20-Apr-2007
[7709]
this is quite clean :-)  hmm, for debugging.
Ladislav
20-Apr-2007
[7710]
Anton used the same words, IIRC
xavier
22-Apr-2007
[7711]
very powerful.  i learned something more today :)