r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[Core] Discuss core issues

BrianH
11-Dec-2010
[770]
That works too:

>> sort/compare [20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 
1] func [x y] [case [x < y [3] x = y [0] 'else [-4]]]
== [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
Gabriele
11-Dec-2010
[771x2]
Are we talking R2's sort or R3's sort here? I know for a fact that 
R2's sort is stable. Obviously you can't get a stable sort if you 
use a comparator function and you return a logic! value.
Also, the current "best" algorithm appears to be http://en.wikipedia.org/wiki/Timsort
Steeve
11-Dec-2010
[773]
We must focus on algos which are doing the fewer comarisons and are 
fast with data almost sorted (it's our common case).

in that sense, Timsort would be a good choice because it's unsing 
a combination of merge sort + insertion sort
merge sort = fewer comparisons.

insertion sort = the faster one on data already sorted and small 
subsets of data
Maxim
12-Dec-2010
[774]
yes looks pretty decent... a nice wrap up here:
http://corte.si/posts/code/timsort/index.html
Ladislav
13-Dec-2010
[775]
http://www.rebol.org/ml-display-message.r?m=rmlVPRF
Steeve
14-Dec-2010
[776x3]
Funny, I've coded "bottom-up heapsort"  last night (improved version 
of heapsort).
I give the "clean" version.
heapify: func [s start len comp /local child sav][
	;-- search terminal leaf.
	child: start
	while [2 * child < len][
		child: 2 * child
		unless (comp s/(++ child) s/:child) [-- child]
	]
	if 2 * child = len [child: len]

	;-- bottom-up, search insertion point
	while [comp s/:child s/:start][child: shift child -1]
		
	;-- bottom-up swap
	sav: s/:start
	while [child > start][
			s/:child: also sav sav: s/:child
			child: shift child -1
	]
	s/:child: sav
]

heapsort: func [serie comp /local len][
	len: length? serie
	;-- build heap
	for i shift len -1 1 -1 [heapify serie i len :comp]
	;-- sort
	for i len 1 -1 [
		swap serie at serie i
		heapify serie 1 i - 1 :comp
	]
	serie
]
>> heapsort serie func [a b][a < b]
Ladislav
14-Dec-2010
[779x2]
Is it stable, Steeve?
>> heapsort [5 1 2 4] :lesser?
== [1 2 4 5]

, while

>> heapsort [5 1 2 4] :lesser-or-equal?
** Script error: cannot compare none! with integer!
** Where: comp while heapify for heapsort
** Near: comp s/:child s/:start

, is that intended?
GrahamC
14-Dec-2010
[781]
Mindblock

a: "testing"

foreach v [ a ] [  .... ]

in .. .how to test if v is an empty? string?
Ladislav
14-Dec-2010
[782]
don't you mean:

foreach v reduce [a] [if all [string? :v empty? :v] [...]...]
GrahamC
14-Dec-2010
[783]
I didn't want to reduce first as then I can't report which one is 
empty?
Ladislav
14-Dec-2010
[784]
I do not understand
GrahamC
14-Dec-2010
[785x2]
say I have a: "testing" b: ""

how would I say .. variable b is empty?
If I reduce the block of words first, then I no longer know which 
word is empty
Ladislav
14-Dec-2010
[787]
if you have foreach v [a] [...] , then v is a word, not a string, 
so, in case you really mean it, you need something like:

if all [string? get v empty? get v]
GrahamC
14-Dec-2010
[788x2]
sounds right
thanks
Steeve
14-Dec-2010
[790x4]
Ladislav, I corrected the issue with :less-or-equal?
And made some optimizations (I hope so).
On large serie
heapify: func [s start len comp /local step sav inc][
	inc: 0
	
	;-- search terminal leaf.
	step: start
	while [2 * step < len][
		++ inc
		step: 2 * step
		unless (comp s/(++ step) s/:step) [-- step]
	]
	if 2 * step = len [++ inc step: len]

	;-- bottom-up, search insertion point
	loop inc [
		unless (comp s/:step s/:start) [break] 
		step: shift step -1
		-- inc
	]
	
	;-- bottom-up swap
	switch/default inc [
		1 [swap at s start at s step]	;-- single swap
		0 []							;-- no swap
	][
		sav: s/:start					;-- chain swap
		loop inc [
				s/:step: also sav sav: s/:step
				step: shift step -1
		]
		s/:step: sav
	]
]

heapsort: func [serie comp /local len][
	len: length? serie
	
	;-- build heap
	for i shift len -1 1 -1 [heapify serie i len :comp]
	
	;-- sort
	for i len 1 -1 [
		swap serie at serie i
		heapify serie 1 i - 1 :comp
	]
	serie
]
s: make block! len: 1000
loop len [append s random len]
s2: copy s

n: 0
heapsort s func [a b][++ n a < b]
print ["bottom-up heapsort, number of comparisons =" n]

n: 0
sort/compare s2 func [a b][++ n a < b]
print ["Rebol sort (R3 + Vista), number of comparisons =" n]
bottom-up heapsort, number of comparisons = 10301
Rebol sort (R3 + Vista), number of comparisons = 12726
Ladislav
14-Dec-2010
[794x4]
Steeve, my measurements suggest, that your previous version was a 
bit faster (is it caused by the fact, that you have to provide also 
for :lesser-or-equal?
My result:

random/seed 0
s: make block! len: 1000
loop len [append s random len]

n: 0
heapsort copy s func [a b][++ n a < b]
print ["bottom-up heapsort, number of comparisons =" n]

n: 0
sort/compare copy s func [a b][++ n a < b]

print ["Rebol sort (R3 + Windows 7 Home Premium), number of comparisons 
=" n]

n: 0
msort copy s func [a b][++ n a < b]
print ["Merge sort, number of comparisons =" n]
bottom-up heapsort, number of comparisons = 10406

Rebol sort (R3 + Windows 7 Home Premium), number of comparisons = 
12726
Merge sort, number of comparisons = 8715
BTW, the "information-theoretical limit" of comparisons is: 8530
Steeve
14-Dec-2010
[798x2]
I should have a look on your merge implementation.

It's said that "merging" merge with insertion sort give better results
but as it is now, you got pretty nice  results
Ladislav
14-Dec-2010
[800x2]
merging merge with insertion sort give better results

 - actually, it depends; "merging merge with insertion sort" gives 
 worse results from the information-theoretical POV (IMO), since you 
 "prefer certain permutations" (= give them higher probability)
Thus, you can sort certain permutations (the ones already sorted) 
much faster, than is the "information theoretical limit", but at 
the cost of exceeding it noticeably sorting other permutations.
Steeve
15-Dec-2010
[802x2]
Searching for an optimal (small and fast) implementation of the following 
pattern.
* Swap two subsets inside a serie.

input 
	block:  [4  5  6   1 2] (5 values)
	Starting index of the 2nd subset inside the block: 4

Output:
	[ 1 2    4  5  6] 

Easy to do in plain Rebol right ?

But here's the trouble, It must be memory safe. You're not allowed 
to do any memory allocation.

You're can only swap values inside the block. And the number of swaps 
should be optimal.
(no sort, no parse, no copy/make/insert/append/change)
Don't want something recursive aswell
Sunanda
15-Dec-2010
[804]
Just for starters.....This does it with 12 XORs (three per swap).

But the tricky bit may be pre-computing the from-list and to-list 
mapping

;; function to do the swap

swap-items: func [
   data [block!]
   from-list [block!]
   to-list [block!]
   /local
    ind1
    ind2
 ][
 for n 1 length? from-list 1 [
     ind1: from-list/:n
     ind2: to-list/:n
 
     data/:ind1: xor data/:ind1 data/:ind2
     data/:ind2: xor data/:ind1 data/:ind2
     data/:ind1: xor data/:ind1 data/:ind2
     ]
     return data
    ]

;; Sample run    
     block: [4 5 6 1 2]
     probe swap-items block [1 2  1 1] [3 4 5 2]
    [1 2 4 5 6]
Steeve
15-Dec-2010
[805x3]
I should have said: the values can be of any type,.integers or anything 
else.

You don't need to find a tricky way to swap values. The purpose is 
not to find how to swap values.

The purpose is to find an algorithm with a minimal amount of single 
swaps .
>> swap-sub [a b 1 d  z 3   X 3 Y ]  7 
== [ X 3 Y   a b 1 d z 3]
in R3, you can swap values like this:
swap [a] [b]
in R2
a: also b a: b

Or use a tmp variable, as you want.
Any kind of these methods count as 1 swap.
Ladislav
15-Dec-2010
[808]
what does that 7 in swap-sub [a b 1 d  z 3   X 3 Y ]  7 mean?
Steeve
15-Dec-2010
[809x3]
7 is the index if the second subset in the serie
where X stand
I try to swap [[a b 1 d  z 3]   [X 3 Y ]]
minus the sub blocks
Ladislav
15-Dec-2010
[812]
OK, thanks
Andreas
15-Dec-2010
[813]
Not optimal, but a start:


bubble-to-front: funct [series index] [for i index 2 -1 [swap b: 
at series i back b] series]

swap-sub: funct [series position] [loop (n: length? series) - position 
+ 1 [bubble-to-front series n] series]
Sunanda
16-Dec-2010
[814]
I've written some very clunky code that I'd be ashamed to post as 
a solution.

But I can offer you an algorithm that acheives the effect in N-1 
swaps at most where 
N is the sum of the lengths of the two sequences.
It's the more-or-less same algorithm used by Andreas.

Here's how it works. Given these two sequences:
       a b c    1 2 3 4 5 6 7

Step1: cyclically rotate the longer sequence M times, where M is 
the difference in length of the sequences. So in this case, we rotate 
3 (7 - 4) times:
       a b c    4 5 6 7    1 2 3


Step2: swap the elements of the shorter sequence with the tail of 
the longer one:
       1 2 3    4 5 6 7    a b c
And it's done.


The cycling in place is the tricky part. It can be done, but my code 
is just too ugly to share :(

Andreas's bubble-to-front is an elegant approach to doing the cycling, 
but is not optimed to reduce the number of steps.

It's a managable sub-problem that is a challenge to solve, so I am 
sure someone can do better than me :)
Ladislav
16-Dec-2010
[815x3]
; helper function:
swap-first: func [
	{swap the first elements of A and B}
	a [series!]
	b [series!]
	/local t
][
	set/any 't first a
	change/only a first b
	change/only b get/any 't
]
; implementation:
swap-sub: func [
	{swap the subseries using the SWAP-FIRST function}
	a [series!]
	b [integer!]
	/local la lb pa pb
][
	pa: a
	la: b - 1
	pb: skip a la
	lb: (length? a) - la
	while [all [la > 0 lb > 0]][
		either la <= lb [
			loop la [
				swap-first pa pb
				pa: next pa
				pb: next pb
			]
			pb: skip pa la
			lb: lb - la
		][
			pa: skip pa la - lb
			loop lb [
				swap-first pa pb
				pa: next pa
				pb: next pb
			]
			pa: skip pa negate la
			la: la - lb
			pb: skip pa la
		]
	]
	a
]
but, I do not have a proof at hand, that it is optimal
Sunanda
16-Dec-2010
[818]
I had a similar volume of code, but not nearly as neat, Ladislav.


The problem somehow feels that it ought to have a one-liner solution; 
but the constaints on what can be used in the code make that hard 
to find :)
Rebolek
16-Dec-2010
[819]
looks more like rebcode :)