World: r3wp
[!REBOL3]
BrianH 19-Apr-2010 [2132x8] | Pekr, you do realize that TO-INTEGER #{8000} is a conversion of an incomplete binary, an operation, right? And that 0x8000 is syntax for an integer value? REBOL doesn't have hex syntax for integers, or any default interpretation of binary values as being of a different datatype. Just like Python doesn't have syntax for binary values (unless I'm mistaken about that last bit). |
#{8000} could just as easily be an incomplete decimal or money, or a malformed character, as it could an incomplete integer. | |
#{8000} is a concrete value. It just isn't an integer. | |
#{8000000000000000} is the binary equivalent to an integer, though not without explicit conversion. | |
Any padding is done during the conversion, but that is only for your convenience. It is not an inherent quality of the source value. | |
All of the TO whatever binary! conversions also allow the binary to be longer than the target value, ignoring the rest of the data. This comes from the assumption that the binary is a stream that you are converting and the rest of the stream is other values that you will be converting later. If the value is too short then it is assumed that you did a COPY/part on the stream for alignment and padding purposes, so it will be nice to you, but direct operations on binaries are assumed to have comparable lengths. And there are no implicit conversions to or from binaries, as a rule. The behavior is very consistent. | |
>> to-binary -0.0 == #{8000000000000000} >> to-binary -9223372036854775808 == #{8000000000000000} So, what does #{8000000000000000} mean? It means #{8000000000000000}, nothing more without explicit conversion. | |
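The point that the same eight bytes carry no type of their own can be sketched in Python. This is only an illustration, assuming big-endian IEEE 754 doubles and two's-complement 64-bit integers (which matches the REBOL output quoted above); the variable names are made up for the example.

```python
import struct

raw = bytes.fromhex("8000000000000000")

# Read as a big-endian IEEE 754 double: negative zero.
as_float = struct.unpack(">d", raw)[0]

# Read as a big-endian signed 64-bit integer: the smallest representable value.
as_int = struct.unpack(">q", raw)[0]

# Both values convert back to exactly the same bytes.
same_bytes = struct.pack(">d", as_float) == struct.pack(">q", as_int) == raw

print(as_float, as_int, same_bytes)
```

Which reading applies is decided by the format string passed to `struct.unpack`, not by anything in `raw` itself.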
Out of curiosity, does Python have a literal syntax for an array of bytes? That would be the equivalent of binary!. | |
Andreas 19-Apr-2010 [2140x2] | Yes, it has. |
Python3, that is: b'....'. | |
BrianH 19-Apr-2010 [2142] | Thanks, now at least we have something to compare :) |
Andreas 19-Apr-2010 [2143] | And in Python 2, a (non-unicode-) string is nothing but an array of bytes. |
BrianH 19-Apr-2010 [2144] | Not that dissimilar to R2, that. |
Andreas 19-Apr-2010 [2145] | Quite similar in fact, except for the default encoding. |
Anton 19-Apr-2010 [2146] | Pekr, I agree with BrianH (as I almost always do). It seems the confusion is that C's or Python's integer representation syntax and Rebol's binary datatype look similar, because they both use hexadecimal. But 0x8000 in C or Python is really an integer and has to fit inside an integer type with some specific size. Rebol's binary type is a series type and can be as long or short as you want (well, measured in 8-bit chunks, octets), and it doesn't make any assumptions as to what the meaning of those octets is. In Rebol I miss being able to represent integers the way C does; not having that makes translation a bit more difficult. |
Andreas 19-Apr-2010 [2147] | And just to add my two cents: I find the current R3 behavior to be easily understandable and totally sane. |
Anton 19-Apr-2010 [2148] | I haven't really gotten into R3 yet. I only played with binary/integer conversions in R2. |
Andreas 19-Apr-2010 [2149x2] | Things are much better in R3 :) |
Imagine TO-BINARY integer! just working like it's supposed to :) Heaven! | |
Anton 19-Apr-2010 [2151] | Yes, that does look better :) |
Andreas 19-Apr-2010 [2152x4] | So just to add to what Brian already explained nicely above: #{8000} is syntax for a sequence of bytes. The equivalent in Python would be b'\x80\x00'. 32768 is syntax for an integer, same in Python. Additionally, Python has alternative syntax for the same integer: 0x8000, 0o100000, 0b1000000000000000; for those literals, REBOL has no corresponding syntax. |
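The Python side of this comparison can be checked with a short sketch. It only shows that a byte sequence and an integer are distinct types there too, and that going from one to the other is an explicit conversion; the variable names are illustrative.

```python
# A byte sequence and an integer are different types and never compare equal.
octets = b"\x80\x00"
number = 32768

# Hexadecimal, octal and binary literals are just other spellings of the integer.
same_integer = 0x8000 == 0o100000 == 0b1000000000000000 == number

# Turning the bytes into an integer is an explicit call with a stated byte order.
converted = int.from_bytes(octets, byteorder="big")

print(type(octets).__name__, type(number).__name__, same_integer, converted)
```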
And I don't think those literals are missed, in REBOL. | |
As you can already glean from the above, we have a nicer literal way to specify sequences of bytes. This goes even further: we can write #{F0} as 2#{11110000}, for example. | |
And the only important thing for correctly using binaries is not to mistake a binary for some other type just because it could validly represent this other type. #{0001} is neither 1, nor true, nor $1, nor #1, nor ... | |
Pekr 20-Apr-2010 [2156x3] | You still talk about the syntactic sugar, but that is imo irrelevant. If we have such a cool and open datatype, which can represent "anything", then allowing integer to binary and vice-versa conversion should be prohibited! How is it that we decided that to-integer #{8000} is actually 32768? That is an absolutely concrete value, it does not allow any other interpretation, no? So how is it that, when ORing with such a value, that expectation does not stand anymore? |
I still think that OR/AND applied from the right side would not hurt anyone. It would just work correctly imo. Max's explanation, that binary is just a stream, does not imo stand as a valid argument here, because when you already decide to apply AND/OR, you decide at a certain time, with a certain known binary value, no matter where in the stream you are ... | |
So once again ... the following result is imo insane, and I wish luck to anyone trying to explain it in the docs :-) >> l: to-binary 1022 == #{00000000000003FE} >> l or #{8000} == #{80000000000003FE} | |
Anton 20-Apr-2010 [2159x3] | You just ORed together two binaries. How the hell is rebol supposed to know that you consider them to represent 64-bit integers? The above expression does not specify anywhere that these are to be treated as 64-bit integers. |
to-integer #{8000} -> 32768 "... it does not allow any other interpretation, no?" --- Yes, it could easily allow many other interpretations, for example, the first byte of the binary could become the least significant 8 bits of the integer, instead of the most significant 8 bits as it is now. | |
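Anton's alternative reading is easy to demonstrate in Python, where the byte order has to be stated explicitly. A small sketch, with illustrative variable names:

```python
octets = bytes.fromhex("8000")

# First byte is the most significant one: the reading used for 32768 in this thread.
big = int.from_bytes(octets, byteorder="big")

# First byte is the least significant one: an equally valid reading of the same bytes.
little = int.from_bytes(octets, byteorder="little")

print(big, little)
```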
Using language like "insane" to describe the above is hyperbole. I don't respect that; it's an overreaction. Why not say instead, "the following result is still confusing to me". (I'm having the same problem with my parents and with myself, we are using too much extreme inaccurate language, and it's causing emotional stress between us.) | |
Steeve 20-Apr-2010 [2162] | Wow, what a fuss :) What prevents you from adjusting the size of the binary to fit the chosen size? >> (skip to-binary 1022 6) or #{8000} == #{8324} |
Ladislav 20-Apr-2010 [2163] | Pekr, you are right that binary OR uses different padding (right) than conversion to integer (left). As far as I am concerned, left padding for conversion to integer looks more convenient than right padding, "producing" a useful result more often. Regarding the OR operation: I guess that it *could* use left padding too, but in that case I am not sure whether left padding would produce a useful result more often than right padding (although you provided one case where left padding *would* be more useful). Nevertheless, I do not think that the padding of these two operations has to be the same. |
Steeve 20-Apr-2010 [2164] | #{83FE}, I meant |
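The two padding directions under discussion can be compared side by side. The Python sketch below is not REBOL's implementation; `or_right_padded` and `or_left_padded` are hypothetical helpers that only mimic the two behaviours described in the thread.

```python
def or_right_padded(a: bytes, b: bytes) -> bytes:
    """OR two byte strings, zero-padding the shorter one on the right."""
    width = max(len(a), len(b))
    a, b = a.ljust(width, b"\x00"), b.ljust(width, b"\x00")
    return bytes(x | y for x, y in zip(a, b))


def or_left_padded(a: bytes, b: bytes) -> bytes:
    """OR two byte strings, zero-padding the shorter one on the left."""
    width = max(len(a), len(b))
    a, b = a.rjust(width, b"\x00"), b.rjust(width, b"\x00")
    return bytes(x | y for x, y in zip(a, b))


value = (1022).to_bytes(8, byteorder="big")  # 00000000000003FE
mask = bytes.fromhex("8000")

right = or_right_padded(value, mask)  # the result Pekr found surprising
left = or_left_padded(value, mask)    # the result Pekr expected

print(right.hex().upper(), left.hex().upper())
```

Steeve's approach corresponds to trimming `value` to two bytes first, so that no padding is needed at all.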
Henrik 20-Apr-2010 [2165x5] | From what I can tell above, we just need a way to represent numbers properly in any base. |
And Pekr is using binaries, because they happen to sort of fit into binary operations in some cases, which makes them look incomplete. | |
How about a base! datatype, which would be a number! ? #(10000000)2 == 127 Quick and probably bad example. An issue would be how to convert between different bases. | |
whoops, 128, it should read. | |
pardon my basic binary skills. :-) | |
Anton 20-Apr-2010 [2170] | I would prefer less syntax, if possible, maybe something like 2_10000000 (Remember we have 2#{10000000}, but it's a binary! of course.) |
Pekr 20-Apr-2010 [2171] | Anton - insane is just that ... insane :-) It is just a word. I did not say Carl, BrianH or anyone else is insane for having some arguments. REBOL is not a religion, as Brian says ... it is a tool. And I want the tool to work the correct way, if possible. So - stop being stressed about someone claiming something is insane :-) The question to all of the above is - what is the correct behaviour? The question even is - what is correct about correctness? I know from the past that Carl really cares about simple things being simple. I can't remember the case, but I do remember I pointed out that something would confuse ppl, and I was right - we could see the same kind of questions from novices again, and again, and again. If you claim that to-integer #{8000} allows many interpretations, how is it that we have chosen this concrete one (32768)? Because it is what we would expect. You might think that I don't understand what BrianH or Max or you talk about. Whereas only Ladislav addressed my actual question - whether it would hurt to have a reverse-padded OR operation. |
BrianH 20-Apr-2010 [2172] | The behavior was what I expected, so clearly expectations vary :) |
Pekr 20-Apr-2010 [2173x2] | Steeve: I understand your example, no problem with it, but try to adapt it to my (not yet existing :-) possible R3 cell phone implementation, which will use 32-bit integers (not sure if it would ever happen). Then, if you want your code to be cross-platform, your code gets complicated, no? >>(skip to-binary 1022 6) or #{8000} ==#{8324} |
Andreas - thanks for reminding me we have the following form: >> l: 2#{11111110} == #{FE} >> print l #{FE} I just wanted to ask if it would be possible for the interpreter to "preserve" the originally written format for output purposes? | |
BrianH 20-Apr-2010 [2175x2] | It might help to have an entry in system/catalog or something that says the length of integers in bytes. Or you could just make your own local constant, something like this: >> int-size: length? to-binary 1 == 8 |
Then you can use the constant to make your code portable. | |
Pekr 20-Apr-2010 [2177x2] | BrianH: I know :-) It is just that for the simple purpose of OR, you have to do all those conversions and tests for the integer size. Then the original "shortcut" format of #{8000} is better avoided in the code. |
Is there a possibility we would have a 'pad function in REBOL, native, which in the case of a binary would auto-pad it to the "full format"? :-) | |
BrianH 20-Apr-2010 [2179x2] | When I am adapting code from languages with C-like integer syntax I resolve the constants to regular integers ahead of time and then put the original syntax in comments. Works great, no conversion overhead at runtime. You should try it. |
A PAD function would be useful. | |
Pekr 20-Apr-2010 [2181] | I think we tried some 'pad efforts in the past, but that function gets easily complicated, as far as our expectations might go ... |
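What a simple padding helper combined with an integer-size constant might look like can be sketched in Python. Everything here is an assumption for illustration: `INT_SIZE`, `pad` and `or_as_int` are hypothetical names, not existing REBOL or Python functions, and the helper deliberately ignores the harder cases (signed values, truncation) that make a general PAD complicated.

```python
INT_SIZE = 8  # assumption: 64-bit integers; a 32-bit platform would use 4


def pad(data: bytes, size: int = INT_SIZE) -> bytes:
    """Left-pad data with zero bytes up to size; longer input is returned unchanged."""
    return data.rjust(size, b"\x00")


def or_as_int(value: int, mask: bytes, size: int = INT_SIZE) -> int:
    """OR an integer with a short binary mask, reading the mask as a big-endian integer."""
    return value | int.from_bytes(pad(mask, size), byteorder="big")


result_64 = or_as_int(1022, bytes.fromhex("8000"))          # 64-bit assumption
result_32 = or_as_int(1022, bytes.fromhex("8000"), size=4)  # 32-bit assumption

print(hex(result_64), hex(result_32))
```

Because the mask is padded on the left before conversion, the result does not depend on the integer width, which is the portability concern raised above.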