r3wp [groups: 83 posts: 189283]
World: r3wp

[Profiling] Rebol code optimisation and algorithm comparisons.

sqlab
30-Oct-2009
[70]
M: that looks more like 50 thousand than 50 million for repeat.
So take care with your powers of 10
Maxim
30-Oct-2009
[71x3]
as indicated in the document introduction... the repeat test is:

loop ops [repeat i 1000 []]   

so with 100,000 ops taking nearly 2 seconds, we end up with:

100,000 * 1000 / 2 = (50 million loops / second)
but I did miscalculate the remove-each MB/second create/scan/erase 
cycle... it's not 100MB... it's 10MB.
updated document.
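The throughput arithmetic can be double-checked directly; a quick Python sketch (mine, not from the thread):

```python
# Throughput math from the benchmark: 100,000 outer ops, each running
# `repeat i 1000 []` (1000 inner iterations), in roughly 2 seconds.
ops = 100_000       # outer loop count
inner = 1_000       # iterations per `repeat`
seconds = 2         # approximate measured wall time

loops_per_second = ops * inner // seconds
print(loops_per_second)  # 50000000, i.e. 50 million loops/second
```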
BrianH
30-Oct-2009
[74]
Thanks for the info, Maxim. We can do a little deduction from that 
data to guess how REBOL is implemented. The scientific method :)
Maxim
30-Oct-2009
[75]
for my 3D engine, this baseline test was necessary.  I need to 
squeeze every Hz out of REBOL... it's nice to see how some exit conditions 
are 10-15% faster in some equivalent tests...  who would have thought 
that 'PICK was faster than NOT TAIL?  :-/
BrianH
30-Oct-2009
[76]
For instance, 1000 > i would be faster than i < 1000 because ops 
redirect to actions, and actions dispatch based on the type of their 
first argument. If the first argument is a literal value, the action 
of that type can be called directly. If it is a referenced value, 
it would have to be dereferenced first, which is apparently slower.


As for PICK being faster than NOT TAIL?, that is one native compared 
to two, with interpreter overhead between the two. Low-level natives 
like PICK, NOT and TAIL? don't do much in comparison with the interpreter 
overhead. Large numbers of small operations tend to be slower than 
small numbers of large operations, if the amount of work done is 
comparable. This is why structure manipulation is faster in REBOL 
than simple math.
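BrianH's point about operation granularity holds in most interpreters; here is a Python analogy (my example, not REBOL semantics): an explicit element-by-element loop pays interpreter dispatch on every step, while a single builtin does the same work natively.

```python
import timeit

data = list(range(10_000))

def many_small_ops():
    # interpreter dispatch on every addition
    total = 0
    for x in data:
        total = total + x
    return total

def one_large_op():
    # one native call doing the same amount of work
    return sum(data)

assert many_small_ops() == one_large_op()
print("loop:", timeit.timeit(many_small_ops, number=100))
print("builtin:", timeit.timeit(one_large_op, number=100))
```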
Maxim
30-Oct-2009
[77x2]
better described than I would put it, but those were my assumptions... 
I will include this info in my next revision of the document.
I intend to devote a whole site to these tests eventually, with 
a very extensive and comprehensive set of test functions and statistics.
BrianH
30-Oct-2009
[79]
With the typos fixed I hope :)
Maxim
30-Oct-2009
[80x2]
you learn a lot by doing it.
with the tests I did, I think I can probably optimise liquid by at 
least 20% just by changing the loops and changing none of the algorithms 
or features.


I am about to create a second reference liquid node type, which will 
be completely compatible from the outside, but with fewer features 
inside.  I expect to DOUBLE its processing capacity.
BrianH
30-Oct-2009
[82x2]
Code converted from R2 to R3 tends to need reoptimizing - the balance 
of what is native vs what is mezz has changed, usually for the better. 
All of the loop functions are native now, for instance. Also, some 
natives have been converted to actions, and vice versa.
The plus side is that what you would just do, before you try to optimize, 
tends to be faster in R3 than it is in R2.
Maxim
30-Oct-2009
[84]
yeah, and with the faster function calling in R3, liquid should get 
an immediate boost in speed... it does A LOT of tiny function calls 
to manage the graph as if they were first-level types with accessors, 
instead of using a controller which loops over the graphs and limits 
flexibility.
Maxim
17-May-2010
[85x9]
.
just for Terry.....


rebol []


feach: func[
	series 
	value
	i "iterations"
	/local result st
][
	prin "feach(): "
	st: now/precise
	x: length? series
	
	while [i > 0][
		prin i
		prin "."
		result: make series 0 

		foreach [s p v] series [if s = value [append result reduce [s p v]]]
		i: i - 1
	]
	prin join " -> " difference now/precise st
	print ["  " (length? result) / 3 "matches found"]
	result 
]
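For readers who don't speak REBOL, the logic of feach is roughly this (a Python paraphrase of mine; timing and printing omitted):

```python
def feach(series, value, record_length=3):
    """Walk the flat block one record at a time and collect every
    record whose first field equals the search value."""
    result = []
    for i in range(0, len(series), record_length):
        record = series[i:i + record_length]
        if record and record[0] == value:
            result.extend(record)
    return result

flat = [6, 2, 3, 9, 5, 6, 9, 8, 7, 1, 2, 3]
print(feach(flat, 9))  # [9, 5, 6, 9, 8, 7]
```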


find-fast: func [
	series
	value
	record-length
	i "iterations"
	/local result st s
][
	prin "find-fast(): "
	st: now/precise
	while [i > 0][
		prin i
		prin "."
		result: clear [] ;length? series
		s: head series
		until [
			not all[
				s: find/tail s value
				either 0 =  ( (-2 + index? s ) // record-length) [
					insert tail result copy/part series record-length
					not tail? s: skip s record-length
				][
					not tail? s: next s
				]
			]
		]
		i: i - 1
	]
	
	prin join " -> " difference now/precise st
	print ["  " (length? result) / record-length "matches found"]
	
	head result
]
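The idea behind find-fast, in Python terms (my paraphrase): let a native search primitive find candidate positions, then keep only the hits that land on the key slot of a record.

```python
def find_fast(series, value, record_length):
    """Native search (list.index, standing in for REBOL's FIND) plus a
    modulo test to keep only record-aligned hits."""
    result = []
    pos = 0
    while True:
        try:
            pos = series.index(value, pos)
        except ValueError:
            break                          # no more hits
        if pos % record_length == 0:       # hit is the key slot of a record
            result.extend(series[pos:pos + record_length])
            pos += record_length           # jump to the next record
        else:
            pos += 1                       # mid-record hit: keep scanning
    return result

flat = [6, 2, 3, 9, 5, 6, 9, 8, 7, 1, 2, 9]
print(find_fast(flat, 9, 3))  # [9, 5, 6, 9, 8, 7]
```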


find-really-fast: func [
	series
	keys
	value
	record-length
	i "iterations"
	/local result st s
][
	prin "find-really-fast(): "
	st: now/precise
	while [i > 0][
		prin i
		prin "."
		result: clear [] ;length? series
		s: head keys
		until [
			not all [
				s: find/tail s value

				insert tail result copy/part at series (record-length * (-1 + index? s)) record-length
			]
		]
		i: i - 1
	]
	
	prin join " -> " difference now/precise st
	print ["  " (length? result) / record-length "matches found"]
	
	head result
]
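find-really-fast searches a pre-extracted block of keys instead of the full data, so every hit index maps straight to a record offset; in Python terms (my paraphrase):

```python
def find_really_fast(series, keys, value, record_length):
    """Search the key column only; hit n sits at record n of the data."""
    result = []
    pos = 0
    while True:
        try:
            pos = keys.index(value, pos)
        except ValueError:
            break
        start = pos * record_length
        result.extend(series[start:start + record_length])
        pos += 1
    return result

flat = [6, 2, 3, 9, 5, 6, 9, 8, 7, 1, 2, 3]
keys = flat[::3]  # like `extract dataset rsize` in the script
print(find_really_fast(flat, keys, 9, 3))  # [9, 5, 6, 9, 8, 7]
```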

test-size: 2000000 ; number of records
iterations: 5
rsize: 3


print "preparing sparse data"
dataset: make block! test-size 
loop test-size [
	; we exclude 9 from dataset
	loop rsize [
		dataset: insert dataset random 8
	]
]
dataset: head dataset
insert at dataset (100 * rsize + 1) 9
insert at dataset (2000 * rsize + 1) 9
insert at dataset (50000 * rsize + 1) 9
insert at dataset (10000 * rsize + 1) 9
insert at dataset (20000 * rsize + 1) 9

probe length? dataset
print "sparse tests:"
feach dataset 9 iterations
print "----------------"
find-fast dataset 9 rsize iterations
print "----------------"
keys: extract dataset rsize
find-really-fast dataset keys 9 rsize iterations
print "----------------"


print "^/dense match with key/value collisions"
dataset: make block! test-size 
loop test-size [
	; we include 9 in the dataset
	loop rsize [
		dataset: insert dataset random 9
	]
]
dataset: head dataset
probe length? dataset
feach dataset 9 iterations
print "----------------"
find-fast dataset 9 rsize iterations
print "----------------"
keys: extract dataset rsize
find-really-fast dataset keys 9 rsize iterations
print "----------------"


print "^/dense match without key/value collisions"
dataset: make block! test-size 
loop test-size [
	; we include 9 in the dataset
	dataset: insert dataset random 9
	loop (rsize - 1) [
		dataset: insert dataset 10 + random 100000 
	]
]
dataset: head dataset
probe length? dataset
feach dataset 9 iterations
print "----------------"
find-fast dataset 9 rsize iterations
print "----------------"
keys: extract dataset rsize
find-really-fast dataset keys 9 rsize iterations
print "----------------"

ask "done"
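The sparse setup above draws values 1..8 so the needle 9 never occurs by chance, then plants a handful of 9s at known spots. A compact Python approximation (mine; it overwrites in place rather than inserting, which is why the script's block grows to 6,000,005 items and this one does not):

```python
import random

test_size = 1_000   # records (the script uses 2,000,000)
rsize = 3           # items per record

# values 1..8 only: the needle 9 cannot appear by accident
dataset = [random.randint(1, 8) for _ in range(test_size * rsize)]

# plant needles at known record starts
for rec in (100, 200, 500):
    dataset[rec * rsize] = 9

print(dataset.count(9))  # 3
```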
the results speak for themselves.

preparing sparse data
6000005
sparse tests:
feach(): 5.4.3.2.1. -> 0:00:04.968   4 matches found
----------------
find-fast(): 5.4.3.2.1. -> 0:00:00.735   4 matches found
----------------
find-really-fast(): 5.4.3.2.1. -> 0:00:00.234   4 matches found
----------------

dense match with key/value collisions
6000000
feach(): 5.4.3.2.1. -> 0:00:13.328   221606 matches found
----------------
find-fast(): 5.4.3.2.1. -> 0:00:14.375   181756 matches found
----------------

find-really-fast(): 5.4.3.2.1. -> 0:00:06.453   221606 matches found
----------------

dense match without key/value collisions
6000000
feach(): 5.4.3.2.1. -> 0:00:13.734   222405 matches found
----------------
find-fast(): 5.4.3.2.1. -> 0:00:08.469   200322 matches found
----------------

find-really-fast(): 5.4.3.2.1. -> 0:00:07.516   222405 matches found
----------------
done
if you are willing to keep a version of the keys accessible at run-time, 
then the keyed search algorithm is MUCH faster, but it incurs memory 
overhead, and the creation of the keys, on big blocks, is not negligible 
(time not included in the tests, but it will generally make the keyed 
algorithm slower than find-fast)
also note that I am REUSING the result block at each call, so run 
times on first calls are actually slower than on successive calls.
noticed a little discrepancy in results... strange... probably caused 
by a last-minute change...
ah, found it....
the series was being skipped one too many times (since I'm using /tail 
on the find, as suggested by Andreas)

find-fast: func [
	series
	value
	record-length
	i "iterations"
	/local result st s
][
	prin "find-fast(): "
	st: now/precise
	while [i > 0][
		prin i
		prin "."
		result: clear [] ;length? series
		s: head series
		until [
			not all[
				s: find/tail s value
				either 0 =  ( (-2 + index? s ) // record-length) [
					insert tail result copy/part series record-length
					not tail? s: skip s record-length
				][
					not tail? s: next s
				]
			]
		]
		i: i - 1
	]
	
	prin join " -> " difference now/precise st
	print ["  " (length? result) / record-length "matches found"]
	
	head result
]
results with this function fixed are similar...

preparing sparse data
6000005
sparse tests:
feach(): 5.4.3.2.1. -> 0:00:04.89   4 matches found
----------------
find-fast(): 5.4.3.2.1. -> 0:00:00.719   4 matches found
----------------
find-really-fast(): 5.4.3.2.1. -> 0:00:00.25   4 matches found
----------------

dense match with key/value collisions
6000000
feach(): 5.4.3.2.1. -> 0:00:13.328   221606 matches found
----------------
find-fast(): 5.4.3.2.1. -> 0:00:14.843   221606 matches found
----------------

find-really-fast(): 5.4.3.2.1. -> 0:00:06.5   221606 matches found
----------------

dense match without key/value collisions
6000000
feach(): 5.4.3.2.1. -> 0:00:13.843   222405 matches found
----------------
find-fast(): 5.4.3.2.1. -> 0:00:09.672   222405 matches found
----------------

find-really-fast(): 5.4.3.2.1. -> 0:00:06.375   222405 matches found
----------------
done
Terry
18-May-2010
[94x2]
Thanks Maxim
just trying to grok it :)  so 2,000,000 keys against 6,000,000 values?
Maxim
18-May-2010
[96x2]
setup as records of 3 items,  1 key, 2 values.
3 items  (1 key, 2 values).
Terry
18-May-2010
[98x3]
2 values?
not following (as usual)
a typical hash, for example, is [key1 value1 key2 value2]... one 
key, one value
Maxim
18-May-2010
[101]
in the code... 


rsize: 3   is the record size... like the /skip value in most series 
handlers  

my two funcs will adapt, since you provide them the record size


but... ehehe, I just realized that FIND has a /skip refinement... 
doh!!! the above can be made MUCH faster still, especially by removing 
the need for the keys (which take a bit of time to generate on large 
lists).
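FIND/skip tests only positions that fall on the same slot of each fixed-size record, so no alignment check and no key block are needed. A Python sketch of that access pattern (my illustration; REBOL's actual FIND/skip is a single native call):

```python
def find_skip(series, value, record_length, field=1):
    """Probe only the chosen field of each record (field is 1-based)."""
    result = []
    for start in range(field - 1, len(series), record_length):
        if series[start] == value:
            rec = start - (field - 1)          # back up to the record start
            result.extend(series[rec:rec + record_length])
    return result

flat = ["bob", "fname", "Bob", "bob", "age", "42", "Maxim", "age", "unknown"]
print(find_skip(flat, "age", 3, field=2))
# ['bob', 'age', '42', 'Maxim', 'age', 'unknown']
```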
Terry
18-May-2010
[102x2]
my ultimate goal is storing triples... [a1 a2 a3 b1 b2 b3 ...] preferably 
strings, if that's fast enough

where I can search against any part of the triple, and return the 
whole thing (triple)
I'll only be generating the initial series rarely, but appending/changing/removing 
often
Maxim
18-May-2010
[104]
ok, doable, with almost the same speed as the find-really-fast func, 
but without the need for its keys argument.
Terry
18-May-2010
[105]
here's a better example  

dataset: [ "subject" "predicate" "value" "bob" "fname" "Bob" "bob" "age" "42" "Maxim" "age" "unknown" ... ]
so the triple is subject, predicate, value  

with the ability to quickly say: return every triple where "predicate" 
(the second part of the triple) = "age"


I'm doing this now with MySQL as a triple store, but series should 
be much faster
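Terry's query ("every triple whose predicate is age") over a flat block, as a plain Python comprehension (my illustration, not from the thread):

```python
triples = ["bob", "fname", "Bob",
           "bob", "age", "42",
           "Maxim", "age", "unknown"]

matches = [triples[i:i + 3]                 # the whole triple
           for i in range(0, len(triples), 3)
           if triples[i + 1] == "age"]      # second field: the predicate
print(matches)  # [['bob', 'age', '42'], ['Maxim', 'age', 'unknown']]
```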
Maxim
18-May-2010
[106]
I've got it done.... just testing it right now.
Terry
18-May-2010
[107]
if strings are quick enough and not much overhead, it's possible 
to use key/value, but with the key made up of subject:predicate, i.e. 
"maxim:age" "unknown", so that select/part key ":age" would return 
everything including "unknown"
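Terry's composite-key idea, sketched in Python (my illustration; REBOL's select/part compares only part of the key, so the suffix test here is just one way to approximate "any key ending in :age"):

```python
store = ["bob:fname", "Bob",
         "bob:age", "42",
         "maxim:age", "unknown"]

# collect key/value pairs whose subject:predicate key ends in ":age"
hits = [store[i:i + 2]
        for i in range(0, len(store), 2)
        if store[i].endswith(":age")]
print(hits)  # [['bob:age', '42'], ['maxim:age', 'unknown']]
```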
Maxim
18-May-2010
[108]
using find/skip and an index for which item to search for in the record... 
we get by FAR the fastest search... especially if you count the fact 
that generating the keys for find-really-fast often takes as long as 
the search itself in dense searches.

ultimate-find below gives the results of this new function


preparing sparse data
6000005
sparse tests:
feach(): 5.4.3.2.1. -> 0:00:04.89   4 matches found
----------------
find-fast(): 5.4.3.2.1. -> 0:00:00.719   4 matches found
----------------
find-really-fast(): 5.4.3.2.1. -> 0:00:00.234   4 matches found
----------------
ultimate find(): 5.4.3.2.1. -> 0:00:00.594   4 matches found
----------------

dense match with key/value collisions
6000000
feach(): 5.4.3.2.1. -> 0:00:13.343   221606 matches found
----------------
find-fast(): 5.4.3.2.1. -> 0:00:14.813   221606 matches found
----------------

find-really-fast(): 5.4.3.2.1. -> 0:00:07.718   221606 matches found
----------------
ultimate find(): 5.4.3.2.1. -> 0:00:06.735   221606 matches found
----------------

dense match without key/value collisions
6000000
feach(): 5.4.3.2.1. -> 0:00:13.969   222405 matches found
----------------
find-fast(): 5.4.3.2.1. -> 0:00:09.812   222405 matches found
----------------

find-really-fast(): 5.4.3.2.1. -> 0:00:07.672   222405 matches found
----------------
ultimate find(): 5.4.3.2.1. -> 0:00:06.531   222405 matches found
----------------
done
Terry
18-May-2010
[109]
next will be ultra-ultimate XD
Maxim
18-May-2010
[110x2]
ultimate-find: func [
	series
	value

	index "field you want to search on, should be (1 <= index <= record-length)"
	record-length
	i "iterations"
	/local s st result
][
	prin "ultimate find(): "
	st: now/precise
	while [i > 0][
		prin i
		prin "."
		result: clear [] ;length? series
		s: at series index
		until [
			not all [
				s: find/skip s value record-length

				insert tail result copy/part skip s (-1 * index + 1) record-length
				s: skip s record-length
			]
		]
		i: i - 1
	]
	
	prin join " -> " difference now/precise st
	print ["  " (length? result) / record-length "matches found"]
	
	head result
]
searching for strings will be slower... probably much slower... just 
try it out with some data  :-D
Terry
18-May-2010
[112]
what does it mean "field you want to search on"?
Maxim
18-May-2010
[113]
which item of your record you want to match against... basically what 
you meant by searching on subject, predicate, or value
Terry
18-May-2010
[114]
yeah, so if I say 1, then that's all subjects?
Maxim
18-May-2010
[115]
yep it will match the value against subjects only.
Terry
18-May-2010
[116x4]
no go then
>> dataset

== [6 27744 92191 1 61175 9905 9 62225 72852 7 31935 71556 4 59248

>> ultimate-find dataset 1 1 zz 1
ultimate find(): 1. -> 0:00:00.031   0 matches found
== []
should have picked up that fourth index (1)
this worked.. 
>> ultimate-find dataset 6 1 zz 1
ultimate find(): 1. -> 0:00:00.219   1 matches found