r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[Core] Discuss core issues

Graham
14-May-2009
[13742]
he just needs the offsets
Steeve
14-May-2009
[13743x4]
a reference or an offset is the same thing to my mind
whereas a reference fullfills more use cases
but can't be molded
and loaded back
amacleod
14-May-2009
[13747x2]
Thanks Graham, I had something almost like that func but not quite...
This is an example of the code I used but it only hi-lites first 
occurance: 

if find face/text butt [
						hs: caret-to-offset face find face/text butt
						he: caret-to-offset face find/tail face/text butt
						fsize: face/size/x
						draw_contents: copy face/effect/draw 
						get_xys ;(Find the shape of the drawn hi-lite)
						append face/effect/draw compose/deep [(shape)]
					]
Graham
14-May-2009
[13749]
If you know the length of text "butt" you don't need to find the 
end .. just calculate it, as it should be faster.
Anton
14-May-2009
[13750]
amacleod, I wrote some efficient string search functions. The file 
has simple, first implementations alongside some efficient implementations 
which use PARSE.

http://anton.wildit.net.au/rebol/library/demo-string-search-functions.r
amacleod
14-May-2009
[13751x2]
I'll check it out.  thanks, Anton.
I got it working...

Sort of inserted my hi-lite mapping code into Graham's function...
Thanks again, Graham!
amacleod
15-May-2009
[13753]
It allowed me to step through the face one find after another. Exactly 
what I wanted to do.
Maxim
17-May-2009
[13754x2]
I'm having a problem with call.


when I use the /output/error and supply  strings , I get a strange 
error in the stdout holding string:

Unable to read from standard input: The handle is invalid.


the really strange one is that if I use /console, the expeted out 
is effectively printed out to the rebol console.... so the cmd does 
return data and rebol is able to handle it, but not within the /output/ 
refinement!


also, another call, using another command, works as expected... so 
I don't think I have a error in my code.

not that I have tried using /shell and it didn't help

has anyone seen /output and /console react this way?
not=note
Gabriele
17-May-2009
[13756]
maybe you also need to provide /input ? (it's hard to guess what 
the problem is)
Graham
17-May-2009
[13757]
oddly this group was private but everyone was in it .. made it public.
Maxim
17-May-2009
[13758x3]
I will try with/input.  providing a few "^/"  but It definitely doesn't 
need input on the command-line.
but sometimes with progamming... logic doesn't prevail, hehehe
gabriele, the /input  works !!!  damn why didn't I think of that.. 
I lost several hours trying to find other command-line ssh tools 
which would work... none really do... the putty tools really are 
the best ones out there.
TomBon
18-May-2009
[13761]
what is the fastest way to count the occurence of each numbers
within a block containing 5000+ numbers?  

e.g [1 3 6 2 4 9 33 6 67 2 12 23 34 12 2 4 56 2 ...]

calculating the max and min with a simple loop and counter is 
very time consuming for thousends of these blocks. 
inserting into mysql and doing a 'group by' also. 
any ideas for a faster solution?
Henrik
18-May-2009
[13762x2]
what if you sort the block first?
then FIND on the first occurrence of 1, of 2, of 3, of 4, etc. and 
store the indexes.
TomBon
18-May-2009
[13764x3]
yes henrik, nice idea in this moment I was thinking similar but with 
copy/part but storing the index is much faster.
the numbers for the  FIND can be supplied from a copy of the block 
with UNIQE
yep, will give it a try, thx henrik...
Sunanda
18-May-2009
[13767]
I usually use Joel's tally function -- it saves me the time needed 
to write a faster function:
http://www.rebol.org/ml-display-message.r?m=rmlKDDS
Izkata
18-May-2009
[13768x2]
I think the fastest would be a variation of radix sort, if you know 
the max/min..

Setup for example:
>> Data: array/initial 5000 does [random 200]

== [36 119 148 112 87 59 176 21 95 138 183 28 1 119 14 74 39 78 141 
99 56 71 47 174 92 81 157 182 122 123 98 73 35 170 150 11 183 8...

>> MaxVal: fold Data 0 :max ;Just use a simple loop, I defined fold 
long ago for simplicity
== 200

What I mean:
>> Vals: array/initial MaxVal 0

== [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0...
>> foreach Val Data [Vals/:Val: Vals/:Val + 1]
== 27

>> repeat I MaxVal [if 0 <> Vals/:I [print rejoin [I {, } Vals/:I 
{ times}]]]
1, 24 times
2, 20 times
..........
199, 20 times
200, 16 times
Tally looks nice, though, and not as memory-intensive
Dockimbel
18-May-2009
[13770x2]
Here's one, maybe not fully optimized, but should be fast enough 
:

count-occurences: func [data [block!] /local keys count idx][
	keys: make hash! unique sort data
	count: array/initial length? keys 0
	parse data [
		some [
			set v skip (
				idx: index? find keys v
				poke count idx 1 + count/:idx
			)
		]
	]
	reduce [keys count]
]
>> data: array/initial 5000 does [random 200]

== [36 119 148 112 87 59 176 21 95 138 183 28 1 119 14 74 39 78 141 
99 56 71 47 174 92 81 157 182 122
123 98 73 35 170 150 11 183 8...
>> t: now/time/precise
== 22:01:04.156
>> res: count-occurences data

== [make hash! [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 
21 22 23 24 25 26 27 28 29 30 31 32
 33 34 35 36 37 38 39 40 41 4...
>> now/time/precise - t
0:00:00.031
>> res/1

== make hash! [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 
21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42...
>> res/2

== [24 20 19 28 30 18 45 28 19 24 25 23 36 13 17 29 28 26 17 28 23 
24 28 22 33 31 25 27 25 28 29 31 14
 24 20 21 30 20 29 26 19 26 2...
Steeve
18-May-2009
[13772x2]
Hmm... i think tally func cuold be speed up with the Tombon's trick 
(using UNIQUE)
like, this:

fast-tally: func [b [block!] /local c t u i v] [
	c: sort copy b
	u: unique c
	t: make block! length? u
	i: 1
	v: u/1
	foreach val next u [
		c: find c val
		t: insert t as-pair v negate i - i: index? c
		v: val
	]
	c: u: none
	head t
]

probe fast-tally [1 4 5 6 3 4  6 2 7 4 3 4]
== [1x1 2x1 3x2 4x4 5x1 6x2]
Maxim
18-May-2009
[13774]
steve... why not use block parsing ?  ;-D
Steeve
18-May-2009
[13775]
i'll kill you ;-)
Maxim
18-May-2009
[13776]
LOL
Steeve
18-May-2009
[13777]
hum, there is a bug, the last value (7) is not precessed, but u got 
the idea...
Maxim
18-May-2009
[13778]
steeve you've got to stop giving examples that don't work!   ;-P
Steeve
18-May-2009
[13779]
but they work... in my mind
Izkata
18-May-2009
[13780]
arraycount: func [Data /local MaxVal Vals][
    MaxVal: Data/1
    foreach Val Data [if Val > MaxVal [MaxVal: Val]]
    Vals: array/initial MaxVal 0
    foreach Val Data [Vals/:Val: Vals/:Val + 1]
    Vals
]

So, so far:
>> Data: array/initial 5000 does [random 200]

>> time [tally copy Data] 50
== 0:00:00.561255
>> time [fast-tally copy Data] 50
== 0:00:00.34082
>> time [count-occurences copy Data] 50 
== 0:00:00.610159
>> time [arraycount copy Data] 50      
== 0:00:00.250419
Steeve
18-May-2009
[13781]
Don't know if it's faster, but it's a bug free and consumes less 
memory.

fast-tally: func [b [block!] /local c u i] [
	c: sort copy b
	u: unique c
	i: 1
	forall u [
		c: any [find c u/2 tail c]
		u/1: as-pair u/1 negate i - i: index? c
	]
	c: none
	head u
]
Izkata
18-May-2009
[13782]
Looks mostly the same in speed:
>> time [fast-tally1 copy Data] 50
== 0:00:00.379225
>> time [fast-tally2 copy Data] 50
== 0:00:00.392215
Steeve
18-May-2009
[13783x3]
hum, i forgot, forall is slow within R2...
Should be a little better...

fast-tally: func [b [block!] /local c u i] [
	c: sort copy b
	u: unique c
	i: 1
	until [
		c: any [find c u/2 tail c]
		u/1: as-pair u/1 negate i - i: index? c
		tail? u: next u
	]
	c: none
	head u
]
izkata, in R2 maximum-of is a native function.
with that, your solution should be defintivly the fastest
Izkata
18-May-2009
[13786x2]
Sorting seems to be the big bottleneck:


arraycount with sorted data (using maximum-of, same results as sorted):
>> time [arraycount copy SDat] 50
== 0:00:00.172956
fast-tally with unsorted data:
>> time [fast-tally copy Data] 50
== 0:00:00.357852
fast-tally with the sort step removed, with pre-sorted data:
>> time [fast-tally copy SDat] 50 
== 0:00:00.103735
So which is best depends on what Tom is using it for, I guess
TomBon
19-May-2009
[13788]
this is great, very fast code. thanks a lot for this.
the code is used for generating dynamic parsing rules 
for financial market data. soon I will release a free
software based on this data.
Steeve
19-May-2009
[13789x2]
Seems that Tally is faster with R3.
2 reasons:
SORT is faster
MAXIMUM-OF is not anymore native 

(btw some of us claimed that it was a bad idea to have maximum-of 
 not anymore native in R3, But will claimed in the desert, so now 
we can see the drawback)
(Tally faster than radix, i meant)
BrianH
19-May-2009
[13791]
MAXIMUM-OF was rarely used, and is now mostly native rather than 
fully native. The majority of its processing is the loop, which is 
now native. MAXIMUM-OF and MINIMUM-OF are so rarely used that they 
might not make the mezzanine cut; they might be moved to R3/Plus.