
World: r3wp

[Core] Discuss core issues

Graham
17-May-2010
[16683x2]
Pekr, it's still alpha!
It took 4 years to reach alpha
Pekr
17-May-2010
[16685x3]
e.g. for me, RebGUI is a dead end. I talked to Bobik, and he is back 
to VID for simple stuff. There were many changes lately, some things 
got broken, and it does not seem to be supported anymore. 
As for the GUI, I believe that in 2-3 months you will be able to say 
otherwise, as Robert definitely wants to move his tools to R3 ...
I don't care about the stickers. Someone said it is alpha, so tell 
your brain that it is a beta and you are OK, easy as that :-) 
I use products according to their functionality, not their official 
alpha/beta/theta status ...
I have used Mozilla since 0.96 and it worked for me :-)
Graham
17-May-2010
[16688]
sorry, it's still alpha quality
Gabriele
17-May-2010
[16689]
Henrik: ahttp works basically like async:// - there are no docs other 
than the Detective sources.
Terry
17-May-2010
[16690x3]
Cheyenne is R2. I need something that works with it.
>> a

== [23 43 55 28 345 99 320 48 22 23 95 884 1000000 999999 999998 
999997 999996 999995 999994 999993 999992 999991 999990 999989 999...
>>

(a is a block with 1,000,012 integers) 


ind: func [series value x][
    st: now/time/precise
    while [x > 0][
        result: make series 0
        series: head series
        ; PARSE collects the index of every element equal to VALUE
        parse series [any [series: 1 1 value (append result index? series) | skip]]
        x: x - 1
    ]
    et: now/time/precise
    fin: et - st
    print fin
    result
]

feach: func [series value x][
    result: make series 0
    st: now/time/precise
    while [x > 0][
        ; walk the block as [s p v] records and keep those whose first item matches
        foreach [s p v] series [if s = value [append result reduce [s p v]]]
        x: x - 1
    ]
    et: now/time/precise
    fin: et - st
    print fin
    result
]

>> ind a 23 10
0:00:01.249
== [1 10 999990]
>>

>> feach a 23 10
0:00:01.01

== [23 43 55 23 95 884 23 43 55 23 95 884 23 43 55 23 95 884 23 43 
55 23 95 884 23 43 55 23 95 884 23 43 55 23 95 884 23 43 55 23 9...
>>

10 iterations each.. 


foreach is the winner speed-wise.. as a bonus, if I use foreach, 
I don't need the index?
Still this is too slow.. it's fine for 1M items, but at 10M it grinds 
hard.

.. there must be a faster way to find integers in a block (or hash) 
using SELECT, FIND or INDEX?
Maxim
17-May-2010
[16693x7]
yes... you do a FIND within a WHILE loop.   that was the fastest 
loop I did for matching exact SAME? strings.  I'd bet it's the best 
here too.
or did you test that already?
the problem is that with hash! find will return both keys and data 
though.
also, until is a bit faster than while IIRC.
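
A minimal sketch of that FIND-inside-a-loop idea (the name find-all is mine, 
not from the thread; it collects match indices from a flat block like Terry's a):

find-all: func [series value /local result][
    result: copy []
    ; FIND returns NONE when there is no further match, which ends the loop
    while [series: find series value][
        append result index? series
        series: next series
    ]
    result
]

; find-all a 23 should give the same [1 10 999990] as Terry's IND example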
might want to refer to this big loop analysis I did last year...


http://www.pointillistic.com/open-REBOL/moa/scream/rebol-iterator-comparison.txt
you'd be surprised that the fastest search method is remove-each 
(if you account for inserting/appending as part of the loop).   but 
you'll need to copy the initial block each time, which might take 
its own time.
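
A hedged sketch of that remove-each approach, using Terry's block a and 
the search value 23 (the copy up front is the extra cost mentioned above):

matches: copy a
remove-each item matches [item <> 23]    ; keep only the matching integers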
in your tests, this causes your loops to slow down a lot:

result: make series 0

you should use:

result: make series length? series


because when appending, you will be re-allocating the result over 
and over... and the GC is the loop killer in every single big-dataset 
test I've done.  not because it's bad, per se, but because you are 
forcing it to operate.
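
A small sketch of how to measure that difference yourself, in the same 
now/time/precise style as the earlier tests (the foreach body is just a 
stand-in workload, not code from the thread):

st: now/time/precise
result: make block! 0                  ; grows and re-allocates as APPEND runs
foreach item a [append result item]
print now/time/precise - st

st: now/time/precise
result: make block! length? a          ; space reserved up front
foreach item a [append result item]
print now/time/precise - st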
Terry
17-May-2010
[16700]
Thanks Maxim, I'll check it out.
Maxim
17-May-2010
[16701x2]
I'm building a little example which I think will outperform your 
examples... I'm curious.
I'm just about to test it.
Terry
17-May-2010
[16703]
moving the result: make series 0 out of the while loop gave a 10 ms 
improvement over 10 iterations.
Maxim
17-May-2010
[16704]
that is probably down to the rarity of results.  if your data has a 
lot of occurrences, then that number will increase, since your result 
block will grow much more.
Gregg
17-May-2010
[16705]
You may need to move beyond brute force, Terry, and set up a data 
structure to optimize the searching.
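
One hedged sketch of such a structure, assuming the flat [s p v] records 
Terry's feach implies (make-index and its layout are illustrative, not 
Gregg's actual proposal):

make-index: func [series /local idx][
    idx: make hash! []
    ; record the key of each 3-item record together with its position
    forskip series 3 [append idx reduce [first series index? series]]
    idx
]

; idx: make-index a
; second find/skip idx 23 2    ; position of the first record whose key is 23
; FIND/skip limits hits to key slots, avoiding the keys-vs-data problem noted earlier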
Terry
17-May-2010
[16706x6]
There must be a way.

An index is a symbol that represents some value. What I need is to 
add some metadata to that index, and make that searchable.
the goal is a blazing key/value store that's as fast as pulling by 
index
:)
I thought I had it with find/any .. but it doesn't work on blocks
I haven't tried it, but my guess is that storing or converting values 
to strings so find/any can be used on them will be slower than foreach, 
as in foreach [key value] n [...
and I don't think you can search keys in a hash! or map! without using 
foreach?
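
For exact keys, a hedged sketch of what works without foreach (sample data 
is illustrative; this assumes R2-era hash!, since Cheyenne is R2, and that 
FIND/skip behaves on hash! as it does on block!):

h: make hash! [23 "first" 54 "second" 23 "third"]
select h 23          ; => "first" -- key lookup, no explicit loop
find/skip h 23 2     ; hits only at key positions, sidestepping the keys-vs-data issue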
I suppose the while loop with the counter is adding extra burden in 
my examples as well, so real-world use would be faster
Maxim
17-May-2010
[16712]
nope... since that is 10 ops within hundreds of millions.
Terry
17-May-2010
[16713x2]
The other thing I considered was using pair! values as pairs of symbols, 
but I can't search those either without foreach
ie: [23x54 "value 1" 984x2093 "value 2"]
Maxim, the foreach result against strings in your doc is an interesting 
one.
Maxim
17-May-2010
[16715]
right now, for sparse data, I've got an algorithm that traverses 
5 times faster than your foreach example.  but if it's really dense, 
then it will be slower.
Terry
17-May-2010
[16716]
I'm thinking 10,000,000 values min, and preferably to max memory
Maxim
17-May-2010
[16717x3]
10 x 10 million items, with a single value within the block...  

0:00:00.234  using a 1.5GHz laptop.
vs: 
0:00:01.594 for your foreach example.
plus record-size is variable, and is supplied as a parameter to the 
function.
Terry
17-May-2010
[16720]
not bad.. with index? I was getting around 2.5 million/sec against 
100,000
Maxim
17-May-2010
[16721]
find-fast: func [
    series
    value
    record-length
    i "iterations"
    /local result st s
][
    st: now/precise
    while [i > 0][
        result: make series 0  ; could pre-allocate with: make series length? series
        s: head series
        until [
            not if s: find s value [
                ; only accept hits that fall on a record boundary
                either 0 = mod (-1 + index? s) record-length [
                    ; copy the matched record into the result
                    insert tail result copy/part s record-length
                    not tail? s: next s
                ][
                    s: next s
                ]
            ]
        ]
        i: i - 1
    ]
    print difference now/precise st
    print length? result
    result
]
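
A hedged usage sketch against Terry's block a from above, treating it as 
the same [s p v] records as feach (ten passes, matching the earlier tests):

find-fast a 23 3 10    ; prints the elapsed time and result length, then returns the matched records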
Terry
17-May-2010
[16722x2]
nice name
..
we can call the winner find-fastest
Maxim
17-May-2010
[16724x2]
this can probably be optimised further...  but note that the algorithm 
scales with the density of your block
density = the ratio of matches to the size of the block
Terry
17-May-2010
[16726x4]
can it be modified for searching keys?
and especially pattern matching of some sort?
I guess find/part would work if the data were strings, no?
In your doc, regarding foreach, you mentioned.. 

blocks get faster as you grab more at a time, while strings slow 
down, go figure!
 
but the stats look the opposite?
Maxim
17-May-2010
[16730x2]
if you look at the speeds,  cycling through two items at a time should 
give roughly twice the ops/second.


when you look at the results... blocks end up faster than that, 
and strings end up slower.
btw, in my find-fast above... when the search matches roughly 1 in 
10 items, it ends up being twice as slow as the foreach... 

so speed will depend on the dataset, as always.
Andreas
17-May-2010
[16732]
Maxim, your find-fast returns the values, not the indices (not sure 
if that was intended)