World: r3wp
[!REBOL3-OLD1]
Ladislav 2-Jul-2009 [15958x4] | r-uni is integer, guaranteeing uniformity by using rejection |
sorry, ignore my post | |
r-uni surely differs from r when X is equal to 1.0 | |
(since in that case no additional rounding occurs) | |
Geomol 2-Jul-2009 [15962x2] | yeah, r-uni is better, I think. |
Nice talk. Time for me to move to other things... | |
Ladislav 2-Jul-2009 [15964] | bye |
Geomol 3-Jul-2009 [15965x5] | "for random 1.0 you cannot find any irregularities, there aren't any" - I think there are. Decimals with a certain exponent are equally spaced, but there are many different exponents involved going from 0.0 to 1.0. The first normalized decimal is:
>> to-decimal #{0010 0000 0000 0000}
== 2.2250738585072e-308
The number with the next exponent is:
>> to-decimal #{0020 0000 0000 0000}
== 4.4501477170144e-308
I can take the difference:
>> (to-decimal #{0020 0000 0000 0000}) - to-decimal #{0010 0000 0000 0000}
== 2.2250738585072e-308
and see the difference double with every new exponent:
>> (to-decimal #{0030 0000 0000 0000}) - to-decimal #{0020 0000 0000 0000}
== 4.4501477170144e-308
>> (to-decimal #{0040 0000 0000 0000}) - to-decimal #{0030 0000 0000 0000}
== 8.90029543402881e-308
>> (to-decimal #{0050 0000 0000 0000}) - to-decimal #{0040 0000 0000 0000}
== 1.78005908680576e-307
So doing random 1.0 many times with the current random function will favor 0.0 a great deal. The consequence is that 0.0 will come out many more times than the first possible numbers just above 0.0, and the mean value will be a lot lower than 0.5. |
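The doubling of the gap with each exponent step is a property of IEEE 754 itself and can be checked outside REBOL; a small Python sketch (Python used only for illustration) decodes the same bit patterns as the to-decimal calls above:

```python
import struct

def dec(hexbits):
    """Decode a big-endian IEEE 754 double from its hex bit pattern,
    mirroring REBOL's to-decimal #{...} conversion."""
    return struct.unpack('>d', bytes.fromhex(hexbits))[0]

# Smallest normalized double and the next few exponents
a = dec('0010000000000000')  # 2**-1022, the first normalized decimal
b = dec('0020000000000000')  # 2**-1021
c = dec('0030000000000000')  # 2**-1020
d = dec('0040000000000000')  # 2**-1019

# The gap between representable numbers doubles with each exponent step
print(b - a)  # 2.2250738585072014e-308
print(c - b)  # twice the previous gap
print(d - c)  # twice again
```

The hex strings are the padded forms of the `#{0010 ...}` literals in the chat; `struct.unpack('>d', ...)` plays the role of `to-decimal` on an 8-byte binary.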
The space between possible decimals around 1.0 is:
>> (to-decimal #{3ff0 0000 0000 0000}) - to-decimal #{3fef ffff ffff ffff}
== 1.11022302462516e-16
The space between possible decimals around 0.0 is:
>> to-decimal #{0000 0000 0000 0001}
== 4.94065645841247e-324
That's a huge difference. So it'll give a strange picture, if the max output of random (1.0 in this case) is converted to 0.0. | |
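The same two gaps can be verified in Python (3.9+ for math.ulp; again just an illustration of the IEEE 754 facts, not of REBOL's implementation):

```python
import math
import struct

def dec(hexbits):
    # Decode a big-endian IEEE 754 double from its hex bit pattern
    return struct.unpack('>d', bytes.fromhex(hexbits))[0]

one = dec('3ff0000000000000')        # 1.0
below_one = dec('3fefffffffffffff')  # largest double below 1.0

gap_at_one = one - below_one  # 2**-53, about 1.11e-16
gap_at_zero = math.ulp(0.0)   # smallest subnormal, about 4.94e-324

print(gap_at_one / gap_at_zero)  # ~2.2e307: "a huge difference" indeed
```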
It's easier to illustrate it with an image: http://www.fys.ku.dk/~niclasen/rebol/random-dist.png The x-axis is the possible IEEE 754 numbers going from 0.0 to 1.0. The y-axis is how many 'hits' every possible number gets when doing RANDOM 1.0. Every gray box holds the same amount of possible numbers, namely 2 ** 52. I use the color to illustrate the density of numbers, so the numbers lie closer together at 0.0 than at 1.0. The distribution is of course flat, if the x-axis were in steps of e.g. 0.001 or something: there is the same amount of hits between 0.001 and 0.002 as between 0.998 and 0.999. It's just that there are many more possible numbers around 0.001 than around 0.999 (because of how the IEEE 754 standard works). | |
It should be clear that it's a bad idea to move the outcome giving 1.0 to 0.0, as is done now with the current RANDOM function in R3. |
I added another comment to ticket #1027 in curecode. | |
Ladislav 3-Jul-2009 [15970x2] | but you did not take into account that the spacing of 4.94065645841247e-324 is not used |
(by the implementation) | |
Geomol 3-Jul-2009 [15972x4] | Oh! Yes, I didn't have that in mind. So the smallest result larger than zero from RANDOM 1.0 is:
>> tt62: to integer! 2 ** 62
>> 1 / tt62
== 2.16840434497101e-19
It's still smaller than 1.11022302462516e-16. Can RANDOM 1.0 produce a result equal to 1.0 - 1.11022302462516e-16 ? |
Hm, this is not trivial! :-) | |
Yes, it can. If random tt62 results in tt62 - 257:
>> 1.0 - (tt62 - 257 / tt62)
== 1.11022302462516e-16
So the problem is there, just not as big as I first thought. | |
The first random function was:
tt62: to integer! 2 ** 62
r: func [x [decimal!]] [(random tt62) - 1 / tt62 * x]
and if (random tt62) - 1 results in tt62 - 257, the space between the possible result numbers is larger than if (random tt62) - 1 results in 1. Hope I make sense. | |
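The scaling scheme under discussion can be sketched in Python for comparison; the names tt62 and r mirror the REBOL snippet, and random.randrange stands in for REBOL's RANDOM on an integer (an assumption for illustration only):

```python
import random

tt62 = 2 ** 62

def r(x):
    """Mimic the REBOL r function: draw an integer in [1, tt62],
    subtract 1, divide by tt62, and scale by x."""
    return (random.randrange(1, tt62 + 1) - 1) / tt62 * x

# The generator's own grid step is 1 / 2**62 everywhere:
print(1 / tt62)  # 2.168404344971009e-19

# But near 1.0 the quotient rounds onto the much coarser double grid;
# the integer tt62 - 257 lands exactly one spacing (2**-53) below 1.0:
print(1.0 - (tt62 - 257) / tt62)  # 1.1102230246251565e-16
```

So many distinct integer outcomes near tt62 collapse onto the same double, while near 0 every integer maps to its own double, which is the asymmetry Geomol is describing.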
Ladislav 3-Jul-2009 [15976] | yes. If the difference is detectable by a test, then we should change the implementation |
Paul 3-Jul-2009 [15977x3] | I found some interesting observations in working with random data recently. I was mostly working with the RandomMilliondigit.bin file data that is used to test compression algorithms. It exhibits a characteristic in that repetitious data ascends in almost a 1-2 ratio. |
Additionally, the distribution of bits is tight, meaning that the distributions of 00, 01, 10, 11 bit sequences, for example, are very close to the same quantity. | |
For example, in a set of data you might find how many occurrences of three-'0'-bit sequences you have in the data. Once you know the number of occurrences, then for random data the four-'0'-bit sequences should be about half that, and so on. | |
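Paul's halving observation can be reproduced on generated bits; a Python sketch, using Python's own PRNG in place of the RandomMilliondigit.bin file (an assumption for illustration), counts maximal runs of '0' bits by length:

```python
import random
import re
from collections import Counter

random.seed(1)  # fixed seed so the counts are reproducible
n = 1_000_000
bits = format(random.getrandbits(n), '0%db' % n)

# Tally maximal runs of consecutive '0' bits by their length
runs = Counter(len(m.group()) for m in re.finditer('0+', bits))

# For uniform random bits, runs of length k+1 should occur about
# half as often as runs of length k:
for k in range(1, 6):
    print(k, runs[k], runs[k + 1] / runs[k])  # ratio close to 0.5
```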
Henrik 4-Jul-2009 [15980] | Cool, time! has been reimplemented. |
Ladislav 4-Jul-2009 [15981] | hi, please check http://www.rebol.net/wiki/Comparisons, I am e.g. unsure about the relation between timezone and STRICT-EQUAL? |
PeterWood 4-Jul-2009 [15982] | I would have thought that strict-equal? would imply the same timezone. |
Ladislav 4-Jul-2009 [15983x3] | yes, it is possible to set it that way |
(differs from the current behaviour) | |
are there other votes for the change? | |
BrianH 4-Jul-2009 [15986x2] | I would have thought EQUIVALENT? would imply the same time zone, or at least some time zone math. Certainly STRICT-EQUAL?. |
differs from the current behaviour - The time! type is being rewritten right now because its current behavior is bad. | |
Ladislav 4-Jul-2009 [15988x2] | Equivalent? - it would use time zone math; even EQUAL? now uses time zone math:
>> equal? 7/7/2009/10:00 7/7/2009/10:00-02:00
== false
>> equal? 7/7/2009/10:00 7/7/2009/8:00-02:00
== true |
(i.e. UTC is compared) | |
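The UTC-instant comparison EQUAL? performs has a close analogue in Python's timezone-aware datetimes (an illustration only; note REBOL's zoneless dates are not the same as Python naive datetimes, so explicit UTC is assumed here for the zoneless side):

```python
from datetime import datetime, timedelta, timezone

utc = timezone.utc
minus2 = timezone(timedelta(hours=-2))  # the -02:00 zone from the example

# 10:00 UTC vs 10:00 in UTC-2 (= 12:00 UTC): different instants
print(datetime(2009, 7, 7, 10, tzinfo=utc)
      == datetime(2009, 7, 7, 10, tzinfo=minus2))  # False

# 10:00 UTC vs 8:00 in UTC-2 (= 10:00 UTC): the same instant
print(datetime(2009, 7, 7, 10, tzinfo=utc)
      == datetime(2009, 7, 7, 8, tzinfo=minus2))   # True
```

Python, like the EQUAL? shown above, converts both aware values to UTC before comparing.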
BrianH 4-Jul-2009 [15990] | As long as no zone is equal? to zone of 0:00 I am happy. |
Ladislav 4-Jul-2009 [15991x2] | no zone is actually 0:00 zone |
>> 7/7/2009/10:00/0:0 == 7-Jul-2009/10:00 | |
BrianH 4-Jul-2009 [15993] | t/zone = none, but basically yeah |
Ladislav 4-Jul-2009 [15994] | aha, did not notice, that there are actually two zones: zone = 0:00 and zone = none |
BrianH 4-Jul-2009 [15995x2] | It's a recent bug fix. Wait, isn't STRICT-EQUAL? supposed to be EQUIVALENT? plus a datatype check? Then if EQUIVALENT? already does zone math I vote that STRICT-EQUAL? do zone math and SAME? do precise equivalence. |
Wait again, == also does case checking. Does EQUIVALENT? do case checking? | |
Ladislav 4-Jul-2009 [15997x2] | Equivalent?: ignores datatype differences, alias distinctions, character case differences |
so, more differences between Strict-equal? and Equivalent? | |
BrianH 4-Jul-2009 [15999] | Then this seems like a good add, unless it is reserved for =? |
Ladislav 4-Jul-2009 [16000x2] | you mean to add the distinction of timezone to Strict-equal? - yes looks natural |
t/zone = none - how did you get that? | |
BrianH 4-Jul-2009 [16002] | Yeah. (Bad hand, bad keyboard day) |
Ladislav 4-Jul-2009 [16003x2] | OK, *adding the timezone distinction to Strict-equal?* |
BTW, I should probably move the tests out of the main article, otherwise it is quite unmanageable; what do you think? | |
BrianH 4-Jul-2009 [16005x2] | Oh, the t/zone is really d/zone - it's a date! thing, recently fixed. |
Please move them to their own page. | |
Ladislav 4-Jul-2009 [16007] | yes, but still, it is not none, but 0:00 |