World: r3wp

[Core] Discuss core issues

Maxim
13-Dec-2009
[15218]
(oops  'strings and pairs'  >  'string and *integers*' )
Janko
13-Dec-2009
[15219x3]
Maxim, thanks a lot for your answers, very interesting. I know roughly 
how hashtables work internally, but I don't know the details. Should a 
block take roughly the same space as a hash! made from the same block 
(in REBOL), or do they differ by some factor?
hm.. does stats return the RAM used??
(it does, cool :) I was looking at the process list to see how much it 
will eat)
Maxim
13-Dec-2009
[15222]
hum... let's see:  ;-)

a: stats
b: make block! 5000010
print stats - a
== 80001039

a: stats
b: make hash! 5000010
print stats - a
== 80005071
Janko
13-Dec-2009
[15223x2]
>> a: stats b: make block! 1000 repeat i 1000 [ append b random "abcdef" 
random 100000 ] print stats - a
48671

>> a: stats b: make hash! 1000 repeat i 1000 [ append b random "abcdef" 
random 100000 ] print stats - a
81454
:)
Maxim
13-Dec-2009
[15225]
but... filled up....

b: make hash! 5000010
m: stats

loop 5000000 [append b copy random "1234567890" append b random 10000000]
print stats - m
== 188430448

here it's half the space.


aha!  depending on the string input... hash tables can actually 
be smaller...   :-)
Janko
13-Dec-2009
[15226]
stats is a cool command, with many refinements too .. I didn't 
know about it
Maxim
13-Dec-2009
[15227]
in REBOL, we're a newbie a few minutes... every day.... even after 
a decade of using it   ;-)
Janko
13-Dec-2009
[15228]
I am a newbie a little longer each day :)
Maxim
13-Dec-2009
[15229]
hehe
Janko
13-Dec-2009
[15230x2]
aha, I see that it depends .. I increased the length of string and 
block increased in size while hash stayed the same
hm.. I have a very newbie question .. is the most effective way to add 
new pairs to a hashtable to append to it as if it were a block? I can't 
figure out how to change a value .. set doesn't work that way
Maxim
13-Dec-2009
[15232x4]
yep.
append works on hash tables.  in fact they behave exactly as if 
you were using blocks, except that the internal representation 
is different from what you see through code.
a: make hash! [ "33" 33 "44" 44 "55" 55]
select a "33"
change find a "44" ["88" 88]
a
== make hash! ["33" 33 "88" 88 "55" 55]
but janko... if you test it, you will see that hash tables are far 
faster at retrieving data...  the larger the set, the bigger the difference. 
 

on millions of records indexed with strings, it could be hundreds 
or thousands of times faster  :-)
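
A minimal sketch of the lookup comparison Maxim describes, assuming REBOL 2 (where hash! exists). The names `n`, `blk`, `hsh`, `key` and `time-it` are made up for this illustration, and the record count and loop count are arbitrary. It times SELECT on the last key, which is the worst case for the linear scan of a block. Actual timings depend on the machine and the data, so none are shown.

```rebol
REBOL [Title: "block! vs hash! lookup timing sketch"]

n: 100000                                   ; number of key/value pairs (arbitrary)
blk: make block! 2 * n
repeat i n [append blk form i  append blk i]    ; "1" 1 "2" 2 ... as in the chat
hsh: make hash! blk                         ; same data, hashed

; hypothetical helper: run a block of code 200 times, return the elapsed time
time-it: func [code [block!] /local start] [
    start: now/precise
    loop 200 code
    difference now/precise start
]

key: form n                                 ; last key: the block has to scan everything
print ["block!:" time-it [select blk key]]
print ["hash! :" time-it [select hsh key]]
```

Both calls return the same value; only the time taken to find it differs.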
Graham
13-Dec-2009
[15236]
Von, I think I just mean that your password for esmtp will have to 
be in the script ( if it is needed .. )
Janko
13-Dec-2009
[15237]
Maxim: yes, I am aware that retrieving data from hashtables is really 
fast... I wasn't aware it would be just as fast even with 1M records, 
so I was quite amazed earlier when I tried it
Pavel
14-Dec-2009
[15238x2]
Transferring the memory-based hash! (map! in R3) datatype into a disk-based 
schema, automatically keeping the hash table computation and lookup 
hidden from the user, gives you a RIF. The Holy Grail of all REBOLers :) 
promised a long, long time ago, still waiting to be done. Anyway, hash 
tables are usually unsorted; when searching is necessary, some type 
of additional index is usually used (a B-tree, for example). For the 
simple information of whether a key is in the set, bitmap vectors are 
used to advantage. When the set is really big (and the bitmap vector 
doesn't fit into memory), a compressed bitmap may be used, and bitwise 
operations on those vectors are usually much quicker than on uncompressed ones. 

This is why it should be used for the bitset! datatype anyway. A number 
of byte-aligned (BBC, PackBits, RLE) or word-aligned (WAH) schemes exist. 
 They are used in very large datasets when the index also resides in a 
disk file. Once again, bitwise operations may be much quicker even in 
memory with those schemes.
For those interested, the FastBit webpage is a good source of docs.
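
A minimal sketch of the simplest idea behind the compressed bitmaps Pavel mentions: run-length encoding a membership bitmap. This is only an illustration. The function name `rle-encode` and the block-of-integers representation are assumptions made here; real schemes such as BBC or WAH pack runs into bytes or machine words and are considerably more elaborate.

```rebol
REBOL [Title: "Run-length encoding of a bitmap (sketch)"]

; Encode a block of 0/1 values as [value run-length value run-length ...]
rle-encode: func [bits [block!] /local out run-val run-len] [
    out: copy []
    if empty? bits [return out]
    run-val: first bits
    run-len: 0
    foreach bit bits [
        either bit = run-val [
            run-len: run-len + 1                    ; still in the same run
        ][
            append out reduce [run-val run-len]     ; close the finished run
            run-val: bit
            run-len: 1
        ]
    ]
    append out reduce [run-val run-len]             ; close the last run
    out
]

probe rle-encode [0 0 0 0 1 1 0 0 0]
; prints [0 4 1 2 0 3]
```

A sparse set (long runs of zeros) collapses to a handful of pairs, which is why such encodings pay off on very large key sets.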
Maxim
14-Dec-2009
[15240x2]
when map! is added to extensions, you might be able to implement an 
example for us, and Carl might consider adding your code directly 
to the host or r3lib if you agree to it.   :-)
you already seem knowledgeable about this, so you'd be the best 
one to implement it IMHO (pavel).
Pavel
15-Dec-2009
[15242]
I'd be glad to try, but the internals are quite well hidden for now. Anyway, 
have you found any hints about handles or cross-referencing from an 
extension, Maxim?
Maxim
15-Dec-2009
[15243]
you mean calling code from the host within extensions?
Pavel
15-Dec-2009
[15244]
yes, that is how I understood your announcement
Maxim
15-Dec-2009
[15245x4]
I will be rebuilding the callback example with a much better/simpler 
design. but they work very well, basically I have mapped the Reb_Do_String() 
and Reb_Print() functions so that they can be called from within 
any extension.
I am also building little helper funcs like a REBOL datatype centric 
version of sprintf  which acts a bit like a C-side rejoin for REBOL.
this way we can create rebol code directly from strings and native 
data very easily.


there is currently a size limit on executed strings, it's a simple 
question of optimisation.  this means we can't use the wiredf function 
for creating large datasets via strings (for now).

but I'm already doing stuff like:


wiredf("rogl-event-handler make wr-event [new-size: %p]", win-w, 
win-h);


calls rebol's do with %p replaced by a pair, using 2 ints.  this 
is a varargs function.
(the callback framework is currently called wired)
Pavel
16-Dec-2009
[15249]
Thanks for info Maxim.
Gabriele
18-Dec-2009
[15250x6]
i was just thinking again about the idea of IF (etc.) keeping a reference 
to the condition argument for you, that is, so that instead of writing:

    if x: select block value [do-something-with x]

you can write:

    if select block value [do-something-with it]


The reason people say it's not worth it is usually that of having 
to bind/copy the block - you don't want that in every IF call and 
probably not even in the ones where it would be useful (and, there's 
really no other name you could use for the function).
so, I thought, can we avoid the bind/copy in any way?


actually, i think we can. some people might run away in horror, 
and Brian will complain about it not being thread safe (we still 
have no threads though), but what if the native were changed to 
do something like:

    func [condition block /local it*] [
        set/any 'it* get/any 'it
        it: :condition
        also
            if :condition block
            set/any 'it get/any 'it*
    ]
i don't think there would be that much code to add in the actual 
native. the same thing could be done to other similar control functions.
I guess it could trip some users, otoh, we have many things that 
trip some users.
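
A minimal usage sketch of the function above, wrapped as a mezzanine rather than as a change to the native IF. The name `if-it` and the sample `data` block are hypothetical, and the sketch assumes a REBOL version that provides ALSO. As in Gabriele's version, `it` is a plain global word that is saved, set for the duration of the block, then restored.

```rebol
REBOL [Title: "IF with an implied IT (sketch)"]

if-it: func [condition block /local it*] [
    set/any 'it* get/any 'it        ; save whatever IT held before (may be unset)
    it: :condition                  ; expose the condition value as IT
    also
        if :condition block         ; run the block only when the condition is true
        set/any 'it get/any 'it*    ; restore the previous IT afterwards
]

data: ["a" 1 "b" 2]
if-it select data "b" [print ["found" it]]
; prints: found 2
```

Nothing is bound or copied per call; the cost is two extra word assignments around the ordinary IF.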
while thinking about that, i also thought that maybe UNLESS should 
return the "condition" value when it is "true". we use this all the 
time with ANY:

   x: any [select block value "default"]

maybe it would be more readable as:

    x: unless select block value ["default"]
just thinking out loud...
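
A minimal sketch of the UNLESS behaviour proposed above, written as a separate function so that the real UNLESS is left alone. The name `unless-or` and the sample data are hypothetical.

```rebol
REBOL [Title: "UNLESS returning its condition (sketch)"]

; Return the condition value when it is true, otherwise evaluate the block
unless-or: func [condition block] [
    either :condition [:condition] [do block]
]

data: ["a" 1]
probe unless-or select data "a" ["default"]    ; prints 1
probe unless-or select data "z" ["default"]    ; prints "default"
```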
BrianH
18-Dec-2009
[15256x2]
IT could be a function that returns the thread-local top of the stack 
of implied subject values. IF would then push a value on that stack, 
and pop the value off when it returns. Might be tricky to make error-throw-safe, 
but easy to make thread-safe :)
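
A rough sketch of the stack idea, assuming a single global stack instead of a thread-local one. The names `it-stack` and `if-it` are hypothetical. As noted above, this simple version is not error-throw-safe: if the block raises an error, the pushed value is never popped.

```rebol
REBOL [Title: "IT as a stack accessor (sketch)"]

it-stack: copy []                       ; hypothetical stack of implied subject values

it: does [last it-stack]                ; IT is a function reading the top of the stack

if-it: func [condition block] [
    append/only it-stack :condition     ; push the condition value
    also
        if :condition block
        remove back tail it-stack       ; pop (skipped if the block raises an error)
]

data: ["a" 1 "b" 2]
if-it select data "a" [
    if-it select data "b" [print ["inner:" it]]    ; prints inner: 2
    print ["outer:" it]                            ; prints outer: 1
]
```

Nested calls see their own value because each one reads the current top of the stack.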
A *lot* of code uses the trick of having IF or UNLESS return none 
when the condition is not met, so your other suggestion is unlikely.
Steeve
18-Dec-2009
[15258]
A *lot* ?
somewhat exaggerated :-)
BrianH
18-Dec-2009
[15259x2]
More every day. Every time another developer learns about this (5+ 
year old) trick they start using it. It's even used in mezzanines.
It is mostly used in combination with ANY and ALL for control flow.
Steeve
18-Dec-2009
[15261x2]
i use it too, but not so much
For complex control flow rules, i rather prefer CASE.

Most of the time, combinations of ALL and ANY can be replaced by a CASE 
structure (which is faster and more readable)
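
A small sketch of the two styles being compared, assuming REBOL 2 semantics where IF returns none when its condition is false. The variable names and thresholds are made up for illustration.

```rebol
REBOL [Title: "ANY + IF versus CASE (sketch)"]

value: 5

; The trick BrianH describes: a failed IF yields none, so ANY moves on
size-any: any [
    if value > 100 ["large"]
    if value > 10  ["medium"]
    "small"
]

; The same rules written with CASE, as Steeve prefers
size-case: case [
    value > 100 ["large"]
    value > 10  ["medium"]
    true        ["small"]
]

print [size-any size-case]
; prints: small small
```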
BrianH
18-Dec-2009
[15263x5]
I prefer CASE too, and have rewritten many mezzanines to use it :)
It doesn't always apply to the task at hand though. The IF and UNLESS 
return values have been applied to the general R3 control flow model, 
as have the changes to the ordinal return values, map! behavior, 
...
Gabriele, it occurs to me that if IT was native it could look up 
the stack to get its value. I'll try writing a (security hole) REBOL 
version of the function later today - it would require debug privileges 
to run so that it can call the STACK function.
The advantage to this approach is that it would be error-throw-safe, 
as well as thread-safe, and require no changes to IF or UNLESS :)
R3-only of course.