r3wp [groups: 83 posts: 189283]

World: r3wp

[!REBOL3]

BrianH
19-Apr-2010
[2132x8]
Pekr, you do realize that TO-INTEGER #{8000} is a conversion of an 
incomplete binary, an operation, right? And that 0x8000 is syntax 
for an integer value? REBOL doesn't have hex syntax for integers, 
or any default interpretation of binary values as being of a different 
datatype. Just like Python doesn't have syntax for binary values 
(unless I'm mistaken about that last bit).
#{8000} could just as easily be an incomplete decimal or money, or 
a malformed character, as it could an incomplete integer.
#{8000} is a concrete value. It just isn't an integer.
#{8000000000000000} is the binary equivalent to an integer, though 
not without explicit conversion.
Any padding is done during the conversion, but that is only for your 
convenience. It is not an inherent quality of the source value.
All of the TO whatever binary! conversions also allow the binary 
to be longer than the target value, ignoring the rest of the data. 
This comes from the assumption that the binary is a stream that you 
are converting and the rest of the stream is other values that you 
will be converting later. If the value is too short then it is assumed 
that you did a COPY/part on the stream for alignment and padding 
purposes, so it will be nice to you, but direct operations on binaries 
are assumed to have comparable lengths. And there are no implicit 
conversions to or from binaries, as a rule. The behavior is very 
consistent.
>> to-binary -0.0
== #{8000000000000000}
>> to-binary -9223372036854775808
== #{8000000000000000}


So, what does #{8000000000000000} mean? It means #{8000000000000000}, 
nothing more without explicit conversion.
Out of curiosity, does Python have a literal syntax for an array 
of bytes? That would be the equivalent of binary!.
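
BrianH's point, that the same eight bytes come out of both -0.0 and the smallest 64-bit integer, can be cross-checked outside REBOL. The sketch below uses Python's standard `struct` module and assumes big-endian byte order, which matches the REBOL output shown above; the variable names are illustrative only.

```python
import struct

# Big-endian encodings of two unrelated values.
neg_zero = struct.pack(">d", -0.0)    # IEEE 754 double
min_int = struct.pack(">q", -2**63)   # signed 64-bit integer

same_bytes = neg_zero == min_int
print(neg_zero.hex(), min_int.hex(), same_bytes)

# Decoding means choosing a target type explicitly.
as_float = struct.unpack(">d", neg_zero)[0]
as_int = struct.unpack(">q", neg_zero)[0]
```

The bytes alone do not say which of the two values was meant; only the explicit unpack format does.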
Andreas
19-Apr-2010
[2140x2]
Yes, it has.
Python3, that is: b'....'.
BrianH
19-Apr-2010
[2142]
Thanks, now at least we have something to compare :)
Andreas
19-Apr-2010
[2143]
And in Python 2, a (non-unicode-) string is nothing but an array 
of bytes.
BrianH
19-Apr-2010
[2144]
Not that dissimilar to R2, that.
Andreas
19-Apr-2010
[2145]
Quite similar in fact, except for the default encoding.
Anton
19-Apr-2010
[2146]
Pekr, I agree with BrianH (as I almost always do).

It seems the confusion is that C or Python's integer representation 
syntax and Rebol's binary datatype look similar, because they both 
use hexadecimal. But 0x8000 in C or Python is really an integer and 
has to fit inside an integer type with some specific size. Rebol's 
binary type is a series type and can be as long or short as you want 
(well, measured in 8-bit chunks, octets), and it doesn't make any 
assumptions as to what the meaning of those octets is.

In Rebol I miss being able to represent integers the way C does, 
it makes translation a bit more difficult.
Andreas
19-Apr-2010
[2147]
And just to add my two cents: I find the current R3 behavior to be 
easily understandable and totally sane.
Anton
19-Apr-2010
[2148]
I haven't really gotten into R3 yet. I only played with binary/integer 
conversions in R2.
Andreas
19-Apr-2010
[2149x2]
Things are much better in R3 :)
Imagine TO-BINARY integer! just working like it's supposed to :) 
Heaven!
Anton
19-Apr-2010
[2151]
Yes, that does look better :)
Andreas
19-Apr-2010
[2152x4]
So just to add what Brian already explained nicely above: #{8000} 
is syntax for a sequence of bytes. The equivalent in Python would 
be b'\x80\x00'. 32768 is syntax for an integer, same in Python. Additionally, 
Python has alternative syntax for the same integer: 0x8000, 0o100000, 
0b1000000000000000; for those literals, REBOL has no corresponding 
syntax.
And I don't think those literals are missed, in REBOL.
As you can already gather from the above, we have a nicer literal 
way to specify sequences of bytes. This goes even further: we can 
write #{F0} as 2#{11110000}, for example.
The only important thing for using binaries correctly is not 
to mistake a binary for some other type just because it could validly 
represent this other type. #{0001} is neither 1, nor true, nor $1, 
nor #1, nor ...
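
For comparison, the Python side of Andreas's description might look like the following minimal sketch. It only uses built-in literals and `int.from_bytes`; the variable names are made up for illustration.

```python
raw = b"\x80\x00"   # two bytes, the counterpart of #{8000}
n = 0x8000          # an integer literal

# All of these spell the same integer.
all_same = (n == 32768 == 0o100000 == 0b1000000000000000)

# A bytes object is not an integer, even if it could encode one.
raw_is_int = isinstance(raw, int)

# Explicit conversion, choosing big-endian byte order.
converted = int.from_bytes(raw, "big")
```

As in REBOL, the byte sequence only becomes 32768 once a conversion is requested explicitly.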
Pekr
20-Apr-2010
[2156x3]
You still talk about the syntactic sugar, but that is imo irrelevant. 
If we have such a cool and open datatype, which can represent "anything", 
then integer to binary and vice-versa conversion should 
be prohibited! How is it that we decided that to-integer #{8000} is 
actually 32768? That is an absolutely concrete value, it does not allow 
any other interpretation, no? So how is it that, when ORing with such 
a value, that expectation does not stand anymore?
I still think that OR/AND applied from the right side would not 
hurt anyone. It would just work correctly imo. Max's explanation 
that a binary is just a stream is imo not a valid argument 
here, because when you decide to apply AND/OR, you decide 
at a certain time, with a certain known binary value, no matter where in 
the stream you are ...
So once again ... the following result is imo insane, and I wish 
luck to anyone trying to explain it in the docs :-)

>> l: to-binary 1022
== #{00000000000003FE}

>> l or #{8000}
== #{80000000000003FE}
Anton
20-Apr-2010
[2159x3]
You just ORed together two binaries. How the hell is rebol supposed 
to know that you consider them to represent 64-bit integers? The 
above expression does not specify anywhere that these are to be treated 
as 64-bit integers.
to-integer #{8000} -> 32768  "... it does not allow any other interpretation, 
no?"   --- Yes, it could easily allow many other interpretations, 
for example, the first byte of the binary could become the least 
significant 8 bits of the integer, instead of the most significant 
8 bits as it is now.
Using language like "insane" to describe the above is hyperbole. 
I don't respect that; it's an overreaction. Why not say instead, 
"the following result is still confusing to me". (I'm having the 
same problem with my parents and with myself, we are using too much 
extreme inaccurate language, and it's causing emotional stress between 
us.)
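
Anton's alternative reading (first byte as the least significant) is simply little-endian byte order. A small Python sketch, using only the built-in `int.from_bytes`, shows how the same two bytes give different integers depending on that choice:

```python
raw = bytes.fromhex("8000")

big = int.from_bytes(raw, "big")        # first byte is most significant
little = int.from_bytes(raw, "little")  # first byte is least significant

print(big, little)
```

So 32768 is one reasonable convention for these bytes, not the only possible one.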
Steeve
20-Apr-2010
[2162]
Wow, what a fuss :)

What prevents you from adjusting the size of the binary to fit the 
chosen size?

>>(skip to-binary 1022 6) or #{8000}
==#{8324}
Ladislav
20-Apr-2010
[2163]
Pekr, you are right that binary OR uses different padding 
(right) than conversion to integer (left).

As far as I am concerned, left padding for conversion to integer 
looks more convenient than right padding, "producing" a useful result 
more often.

Regarding the OR operation: I guess that it *could* use left padding 
too, but in that case I am not sure whether left padding would 
produce a useful result more often than right padding (although 
you provided one case where left padding *would* be more useful). 
Nevertheless, I do not think that the padding of these two operations 
has to be the same.
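
The difference between the two padding choices can be sketched in Python. The helper `or_padded` below is hypothetical and is not how REBOL implements OR; it only reproduces the right-padded result from Pekr's example and shows what a left-padded variant would give.

```python
def or_padded(a: bytes, b: bytes, side: str) -> bytes:
    """OR two byte strings after zero-padding the shorter one.

    side="right" appends zeros (matches the result in Pekr's example);
    side="left" prepends zeros (integer-style alignment).
    """
    width = max(len(a), len(b))
    if side == "right":
        a, b = a.ljust(width, b"\x00"), b.ljust(width, b"\x00")
    elif side == "left":
        a, b = a.rjust(width, b"\x00"), b.rjust(width, b"\x00")
    else:
        raise ValueError("side must be 'left' or 'right'")
    return bytes(x | y for x, y in zip(a, b))


value = (1022).to_bytes(8, "big")   # counterpart of to-binary 1022
mask = bytes.fromhex("8000")

right = or_padded(value, mask, "right")
left = or_padded(value, mask, "left")
print(right.hex(), left.hex())
```

Right padding treats both operands as streams aligned at their first byte; left padding treats them as numbers aligned at their last byte.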
Steeve
20-Apr-2010
[2164]
#{83FE}, I meant
Henrik
20-Apr-2010
[2165x5]
From what I can tell above, we just need a way to represent numbers 
properly in any base.
And Pekr is using binaries, because they happen to sort of fit into 
binary operations in some cases, which makes them look incomplete.
How about a base! datatype, which would be a number! ?

#(10000000)2 == 127


Quick and probably bad example. An issue would be how to convert 
between different bases.
whoops, 128, it should read.
pardon my basic binary skills. :-)
Anton
20-Apr-2010
[2170]
I would prefer less syntax, if possible, maybe something like 2_10000000
(Remember we have 2#{10000000}, but it's a binary! of course.)
Pekr
20-Apr-2010
[2171]
Anton - insane is just that ... insane :-) It is just a word. I did 
not say Carl, BrianH or anyone else is insane for having some arguments. 
REBOL is not religion, as Brian says ... it is a tool. And I want 
the tool to work the correct way, if possible. So - stop being stressed 
about someone claiming something is insane :-) 


The question to all above is - what is the correct behaviour? The 
question even is - what is correct about correctness? I know from 
the past that Carl really cares about simple things being simple. 
I can't remember the case, but I do remember I pointed out something 
that would confuse ppl, and I was right - we could see the same kind of 
questions from novices again, and again, and again.


If you claim that to-integer #{8000} allows many interpretations, 
how is it that we have chosen the concrete one (32768)? Because it 
is what we would expect. You might think that I don't understand 
what BrianH or Max or you talk about. Whereas only Ladislav answered 
the question I was actually asking - whether it would hurt to have 
a reverse padded OR operation.
BrianH
20-Apr-2010
[2172]
The behavior was what I expected, so clearly expectations vary :)
Pekr
20-Apr-2010
[2173x2]
Steeve: I understand your example, no problem about it, but try to 
adapt it to my (non-existent yet :-) possible R3 cell phone implementation, 
which will use 32 bit integers (not sure if it would ever happen). 
Then, if you want your code to be cross-platform, your code gets 
complicated, no?

>>(skip to-binary 1022 6) or #{8000}
==#{8324}
Andreas - thanks for reminding me we have following form: 
>> l: 2#{11111110}
== #{FE}

>> print l
#{FE}


I just wanted to ask if it would be possible for the interpreter to 
"preserve" the originally written format for output purposes?
BrianH
20-Apr-2010
[2175x2]
It might help to have an entry in system/catalog or something that 
says the length of integers in bytes. Or you could just make your 
own local constant like this:
>> int-size: length? to-binary 1
== 8
Then you can use the constant to make your code portable.
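
The portability idea (do not hard-code the integer width, slice relative to the end instead) can be sketched in Python as follows. The helper `or_low_bytes` and its `int_size` parameter are hypothetical stand-ins for the REBOL `int-size` constant above; Python integers have no fixed width, so the width is passed in explicitly.

```python
def or_low_bytes(n: int, mask: bytes, int_size: int) -> bytes:
    """OR the low len(mask) bytes of n's big-endian form with mask.

    Assumes 0 < len(mask) <= int_size.
    """
    full = n.to_bytes(int_size, "big")
    tail = full[-len(mask):]          # slice from the end, not from a fixed offset
    return bytes(x | y for x, y in zip(tail, mask))


# Same answer whether integers are 4 or 8 bytes wide.
results = [or_low_bytes(1022, bytes.fromhex("8000"), size) for size in (4, 8)]
print([r.hex() for r in results])
```

Slicing from the end gives the same two bytes for both widths, which is what a fixed `skip ... 6` would not do on a 32-bit build.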
Pekr
20-Apr-2010
[2177x2]
BrianH: I know :-) It is just that for the simple purpose of OR, 
you have to do all those conversions and tests for the integer size. 
Then the original "shortcut" format of #{8000} is better avoided in the 
code.
Is there a possibility we would have a native 'pad function in REBOL, 
which in the case of binary would auto-pad it to the "full format"? 
:-)
BrianH
20-Apr-2010
[2179x2]
When I am adapting code from languages with C-like integer syntax 
I resolve the constants to regular integers ahead of time and then 
put the original syntax in comments. Works great, no conversion overhead 
at runtime. You should try it.
A PAD function would be useful.
Pekr
20-Apr-2010
[2181]
I think we tried some 'pad efforts in the past, but that function 
gets easily complicated, as far as our expectations might go ...