
World: r3wp

[!REBOL3-OLD1]

Geomol
2-Jul-2009
[15917x2]
>> (dec-as-int 0.1) - (dec-as-int 0.0)
== 4591870180066957722
>> (dec-as-int 0.2) - (dec-as-int 0.1)
== 4503599627370496

Same result. Now I'm confused! :-)
If this is correct, then my suggestion to do
random 1.0
many times will actually show the problem.
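
dec-as-int is not shown in this excerpt; a minimal R3 sketch, assuming to-binary of a decimal! yields its raw IEEE 754 bytes (the conversions used later in this discussion suggest it does):

dec-as-int: func [x [decimal!]] [
	; reinterpret the 8 bytes of the 64-bit float as a 64-bit integer
	to-integer to-binary x
]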
Ladislav
2-Jul-2009
[15919x2]
yes, you get a different result only when trying (dec-as-int 0.0) - (dec-as-int -0.0) etc.
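
ulps, used below, is likewise not shown; presumably it counts the representable decimals between two values via the same bit reinterpretation:

ulps: func [a [decimal!] b [decimal!]] [
	; distance in units in the last place = difference of bit patterns
	(dec-as-int b) - dec-as-int a
]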
Geomol
2-Jul-2009
[15921x5]
The highest decimal is
>> to-decimal #{7fef ffff ffff ffff}
== 1.79769313486232e+308


There is the same number of ulps between 0.0 and 2.0 as between 1.0 
and the highest decimal:

>> ulps 0.0 2.0
== 4611686018427387904
>> ulps 1.0 to-decimal #{7fef ffff ffff ffff}
== 4611686018427387903


I think the spacing between numbers differs all the way from zero to the highest decimal, but I'm not sure.
Funnily enough, 1.5 is the middle value when talking ulps:

>> ulps 0.0 1.5
== 4609434218613702656
>> ulps 1.5 hi
== 4609434218613702655
where hi is:
hi: to-decimal #{7fef ffff ffff ffff}
This means there are as many different possible decimals between 0.0 and 1.5 as between 1.5 and the highest possible 64-bit decimal, when talking about floating point in a computer using the 64-bit IEEE 754 standard.
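
The midpoint property can be read off the bit patterns themselves (a sketch, assuming the dec-as-int above):

dec-as-int 1.5   ; 4609434218613702656 (#{3ff8 0000 0000 0000})
dec-as-int hi    ; 9218868437227405311 (#{7fef ffff ffff ffff})
; hi's pattern is (2 * 4609434218613702656) - 1, so counting ulps,
; 1.5 sits exactly halfway between 0.0 and the highest decimal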
Ladislav
2-Jul-2009
[15926x2]
how about this: do you think there is a problem with it too?
two**62: to integer! 2 ** 62
two**9: to integer! 2 ** 9
two**53: to integer! 2 ** 53
r-uni: func [x [decimal!] /local d] [
	d: random two**62
	; transform to 0..2**53-1
	d: d // two**53
	; transform to 0.0 .. x (0.0 inclusive, x exclusive)
	d / two**53 * x
]
a modification:

two**62: to integer! 2 ** 62
two**53: to integer! 2 ** 53
two**53+1: two**53 + 1
r-uni: func [x [decimal!] /local d] [
	d: random two**62
	; transform to 0..2**53
	d: d // two**53+1
	; transform to 0.0 .. x (0.0 inclusive, x inclusive)
	d / two**53 * x
]
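
A quick way to exercise the modified r-uni (a sketch: the sample mean of a uniform deviate on [0.0, x] should approach x / 2):

random/seed now
total: 0.0
loop 100000 [total: total + r-uni 1.0]
print total / 100000   ; expect a value close to 0.5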
BrianH
2-Jul-2009
[15928]
When you two are done, could you look at CureCode ticket #1027 and 
R3's alpha 66 behavior and see if it is correct? I can't judge.
Ladislav
2-Jul-2009
[15929x2]
LOL, that is what we are just discussing
anyway, this is always a problem, when we try to generate numbers 
in an "exotic" interval, not in the usual [0.0 .. 1.0]
Geomol
2-Jul-2009
[15931]
I'll just grab some food, then I look at your r-uni.
Ladislav
2-Jul-2009
[15932]
hmm, it seems that the former may still produce x when rounding?
Gregg
2-Jul-2009
[15933]
In any case, this chat should be the basis for the docs that go with 
it. People aren't aware of how much thought goes into these things.
Geomol
2-Jul-2009
[15934x2]
>> two**62 / two**53+1
== 512.0

So two**53+1 * 512 should equal two**62, but it doesn't:

>> two**62
== 4611686018427387904
>> two**53+1 * 512
== 4611686018427388416

Why is that?
(I wanted to check if d // two**53+1 gave an even distribution, but there seems to be something wrong with calculations on large numbers.)
Ladislav
2-Jul-2009
[15936]
"Why is that?" - rounding
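
In more detail (a sketch, not from the original discussion): / converts its integer operands to decimal!, and 2**53 + 1 needs 54 significant bits, one more than the 53-bit mantissa holds, so it rounds to 2**53 and the quotient comes out exactly 512.0. The integer multiplication, by contrast, is exact:

to-integer to-decimal two**53+1   ; 9007199254740992: the + 1 is rounded away
two**53+1 * 512                   ; 4611686018427388416 = 2**62 + 512, exact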
Geomol
2-Jul-2009
[15937]
This should give a math overflow, I think:

>> to integer! 2 ** 62
== 4611686018427387904

Like this does in R2:

>> to integer! 2 ** 31
** Math Error: Math or number overflow
Ladislav
2-Jul-2009
[15938]
no, R3 uses 64-bit integers
Geomol
2-Jul-2009
[15939x3]
no wait, it's 2 ** 63 that should give an overflow.
yeah
But!? :-)

>> to integer! 2 ** 62 - 1
== 4611686018427387904
>> to integer! 2 ** 62
== 4611686018427387904

Isn't that a bug?
Ladislav
2-Jul-2009
[15942]
no, it is the precision limit of IEEE754
Geomol
2-Jul-2009
[15943]
I see.
Ladislav
2-Jul-2009
[15944]
(I mean, it exceeds the precision limit: 2^62 - 1 needs 62 significant bits, but the decimal! mantissa holds only 53, so the subtraction rounds back to 2^62)
Geomol
2-Jul-2009
[15945x2]
Yes, I understand.
After you do:

d: d // two**53+1


Are we sure the new d is evenly distributed? That every value d can now take is equally possible?
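
Strictly they cannot all be equally possible: 2**62 is not a multiple of two**53+1, so the modulo map is slightly uneven (a sketch of the arithmetic):

two**62 // two**53+1   ; 9007199254740481, a nonzero remainder, so some
                       ; residues occur 512 times among the 2**62 draws
                       ; and the others only 511 times (a bias near 0.2%)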
Ladislav
2-Jul-2009
[15947x2]
hmm, to make sure, I should use rejection
as follows:

two-to-62: to integer! 2 ** 62   ; the 2**62 bound defined above

rnd: func [
	{random range with rejection, not using last bits}
	value [integer!]
	/local r s
] [
	; s: the largest multiple of value not exceeding 2**62
	s: two-to-62 - (two-to-62 // value)
	until [
		; draw 0-based and reject draws at s or above, so the kept
		; values fill exactly (s / value) equal-sized buckets
		r: (random two-to-62) - 1
		r < s
	]
	s: s / value
	r - (r // s) / s + 1
]
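
A usage sketch: rnd maps the 62-bit generator onto 1 .. value with rejection instead of a biased modulo, e.g. a fair die:

random/seed now
print rnd 6   ; uniform on 1..6; draws beyond the last full bucket are rejected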
Geomol
2-Jul-2009
[15949]
If all possible values of d are equally possible, and d can now have values from 0 to two**53, then

d / two**53

will return values from 0.0 to 1.0 both inclusive, and 

d / two**53 * x

will return values from 0.0 to x both inclusive. Right?
Ladislav
2-Jul-2009
[15950x2]
yes
...multiplication may still trigger rounding, though
Geomol
2-Jul-2009
[15952]
It gives the same result as your first RANDOM:

d: to-decimal to-binary 1023   ; a decimal whose bit pattern is the integer 1023
blk: []
insert/dup blk 0 1024
random/seed now

loop 1000000 [
	; read the deviate's bit pattern back as an integer 0..1023
	i: to-integer to-binary r-uni d
	blk/(i + 1): blk/(i + 1) + 1
]

>> print [blk/1 blk/512 blk/1024]
462 1036 481
Ladislav
2-Jul-2009
[15953]
that is caused by the multiplication rounding
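
Roughly what round-to-nearest predicts (a sketch, not from the original discussion): each of the 1022 interior bit patterns collects a full interval of source values, while the two endpoint patterns collect only half an interval each:

; interior bucket: about 1000000 / 1023, i.e. ~978 hits
; end buckets: about half that, ~489 hits - compare 462 and 481 above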
Geomol
2-Jul-2009
[15954x2]
yes
I like it better than the version where max is changed to 0.0
Ladislav
2-Jul-2009
[15956]
another variant would be to generate the uniform deviates directly in the target interval, not letting multiplication "mess things up"
Geomol
2-Jul-2009
[15957]
Is r-uni better than the version you gave earlier?

tt62: to integer! 2 ** 62
r: func [x [decimal!]] [(random tt62) - 1 / tt62 * x]
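
(Reading r: REBOL infix operators simply evaluate left to right, so the body is ((random tt62) - 1) / tt62 * x, a deviate nominally in [0.0 .. x) since (random tt62) - 1 ranges over 0 .. 2**62 - 1.)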
Ladislav
2-Jul-2009
[15958x4]
r-uni is integer, guaranteeing uniformity by using rejection
sorry, ignore my post
r-uni surely differs from r when X is equal to 1.0
(since in that case no additional rounding occurs)
Geomol
2-Jul-2009
[15962x2]
yeah, r-uni is better, I think.
Nice talk. Time for me to move to other things...
Ladislav
2-Jul-2009
[15964]
bye
Geomol
3-Jul-2009
[15965x2]
"for random 1.0 you cannot find any irregularities, there aren't any"

I think there are. Decimals with a certain exponent are equally spaced, but there are many different exponents involved going from 0.0 to 1.0. The smallest normalized decimal is:

>> to-decimal #{0010 0000 0000 0000}
== 2.2250738585072e-308

The number with the next exponent is:

>> to-decimal #{0020 0000 0000 0000}
== 4.4501477170144e-308

I can take the difference:


>> (to-decimal #{0020 0000 0000 0000}) - to-decimal #{0010 0000 0000 0000}
== 2.2250738585072e-308

and see the difference double with every new exponent:


>> (to-decimal #{0030 0000 0000 0000}) - to-decimal #{0020 0000 0000 0000}
== 4.4501477170144e-308

>> (to-decimal #{0040 0000 0000 0000}) - to-decimal #{0030 0000 0000 0000}
== 8.90029543402881e-308

>> (to-decimal #{0050 0000 0000 0000}) - to-decimal #{0040 0000 0000 0000}
== 1.78005908680576e-307

So doing
random 1.0

many times with the current random function will favor 0.0 a great deal. The consequence is that 0.0 will come out many more times than the first possible numbers just above 0.0, and the mean value will be a lot lower than 0.5.
The space between possible decimals around 1.0 is:


>> (to-decimal #{3ff0 0000 0000 0000}) - to-decimal #{3fef ffff ffff ffff}
== 1.11022302462516e-16

The space between possible decimals around 0.0 is:

>> to-decimal #{0000 0000 0000 0001}
== 4.94065645841247e-324


That's a huge difference. So it'll give a strange picture if the max output of random (1.0 in this case) is converted to 0.0.
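
The spacings measured above can be read off directly with the same bit trick; a sketch of a general helper, assuming the to-binary round trip used throughout:

ulp: func [
	{distance from x to the next representable decimal above it (x >= 0.0)}
	x [decimal!]
][
	(to-decimal to-binary (to-integer to-binary x) + 1) - x
]

ulp 0.0   ; 4.94065645841247e-324, the spacing just above zero
ulp 1.0   ; 2.22044604925031e-16, twice the spacing just below 1.0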