r3wp [groups: 83 posts: 189283]

World: r3wp

[Core] Discuss core issues

Gregg
23-Mar-2009
[13011x2]
The real problems get you when you start operating on the data. Languages 
that use + for string concatenation (which was a big issue in VB, 
even after the & op was added) can produce completely wrong results. 
E.g., what should these produce?

1 + "1"
1 + 1
1 + "1"


I think Carl has removed that from R3 at this point, but I know it 
was there in a couple test releases.
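
For comparison, a minimal sketch of how the first expression behaves, assuming REBOL 2 semantics where + is arithmetic only and concatenation is spelled out with JOIN. R3 test releases may have behaved differently, as noted above.

```rebol
REBOL []

; Assumption: REBOL 2, where + is not defined for string! arguments.
; TRY traps the error so the script keeps running.
print error? try [1 + "1"]    ; an error is raised instead of guessing a result

; Concatenation is explicit: JOIN forms the integer, then appends.
print mold join 1 "1"         ; "11"

; Arithmetic on text needs an explicit conversion.
print 1 + to integer! "1"     ; 2
```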
All this said, I agree that it's something that should be made as 
simple as possible, but it also needs to be robust.
Geomol
23-Mar-2009
[13013]
You touch on a problem I have with remembering the difference 
between FORM and TO-STRING. I can't remember what these produce:
form [1 2 3]
to-string [1 2 3]

I have to try it. You have good points. What do you see as the arguments 
for having both = and == in REBOL?
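
A quick console check of the two expressions above, assuming REBOL 2 behavior:

```rebol
REBOL []

; FORM keeps a space between the values of a block.
print mold form [1 2 3]        ; "1 2 3"

; TO-STRING joins the formed values with no separator.
print mold to-string [1 2 3]   ; "123"
```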
[unknown: 5]
24-Mar-2009
[13014]
At the low level, does the REBOL length? function sequentially calculate 
string data when it is called to determine length, or does it instead 
access a memory location to retrieve the current length of a string?
BrianH
24-Mar-2009
[13015]
Based on its speed characteristics, I would say that the length is 
tracked in the string and just accessed by LENGTH?.
Steeve
24-Mar-2009
[13016]
length is read not calculated
[unknown: 5]
24-Mar-2009
[13017]
Ok, that is what I thought Brian.
Sunanda
24-Mar-2009
[13018]
The actual implementation is not known. But given REBOL strings are 
not null terminated, it seems most likely a length is held.
A glance behind the curtain here:

http://www.rebol.org/cgi-bin/cgiwrap/rebol/ml-display-message.r?m=rmlTGVC
[unknown: 5]
24-Mar-2009
[13019]
I didn't see anything addressing it at that link but I assume you 
mean somewhere in that thread.
Steeve
24-Mar-2009
[13020]
uh ? IIRC strings are null terminated in R2
[unknown: 5]
24-Mar-2009
[13021]
Good conversation though.
BrianH
24-Mar-2009
[13022]
Steeve, you can put ^(00) in strings in R2 and R3. They get converted 
to null-terminated for calling C routines, but they aren't internally.
[unknown: 5]
24-Mar-2009
[13023]
But sticking to R2 and putting aside conversion to c routines, are 
you saying that R2 doesn't terminate strings with null?
Steeve
24-Mar-2009
[13024]
hum... internally, strings are null terminated, i don't know the 
use, but it's a fact
BrianH
24-Mar-2009
[13025x2]
Yes.
Steeve, have you disassembled REBOL?
Steeve
24-Mar-2009
[13027x2]
not fully
some parts
[unknown: 5]
24-Mar-2009
[13029x2]
Well, the use for terminating the strings would be for determining 
length.  Otherwise you would have to determine length by updating 
the length value on each change of the string.
Which is what I hope is the case, actually.
Steeve
24-Mar-2009
[13031x2]
Paul, there are the two of them: length is stored, but strings are 
null terminated
iirc
BrianH
24-Mar-2009
[13033]
Steeve, you can put nulls in the middle of strings and their length 
doesn't change. Thus, not null terminated.
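
The point about embedded nulls can be checked with a short sketch, using the ^(00) escape mentioned earlier in the thread:

```rebol
REBOL []

s: "ab^(00)cd"           ; a null character sits in the middle of the string
print length? s          ; 5: the null counts as an ordinary character
print mold skip s 3      ; "cd": data after the null is still reachable
append s "e"
print length? s          ; 6
```

If the null ended the string, LENGTH? would report 2 here and the trailing characters would be lost.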
Steeve
24-Mar-2009
[13034]
i know it's not used, but really i saw nulls at the end of strings
BrianH
24-Mar-2009
[13035]
They may have nulls tacked on the end after the tail of the string 
for C compatibility, but REBOL doesn't use them.
Steeve
24-Mar-2009
[13036]
i agree
[unknown: 5]
24-Mar-2009
[13037]
Is there another form of termination that might not be nulls that 
REBOL uses?
BrianH
24-Mar-2009
[13038x2]
Otherwise length? would be O(n), and my testing has found it to be 
O(1) for every series type except list!
REBOL uses something like Pascal strings with nulls tacked on the 
end for C compatibility. The length is tracked on write.
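
A rough way to repeat that kind of test is sketched below. It is not the test actually used in this thread: the helper name time-length and the sizes are made up for illustration. If the length is stored, both timings should come out about the same even though one string is far longer.

```rebol
REBOL []

; Hypothetical helper: time N calls of LENGTH? on a series.
time-length: func [series [series!] n [integer!] /local start] [
    start: now/precise
    loop n [length? series]
    difference now/precise start
]

short: "abc"
long: head insert/dup copy "" "x" 1000000   ; a string of one million characters

print ["short:" time-length short 1000000]
print ["long: " time-length long 1000000]
```

Timer resolution varies by platform, so the numbers are only meaningful as a rough comparison.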
[unknown: 5]
24-Mar-2009
[13040x2]
Well, if REBOL allocates storage for a string then there might be 
a maxlength value stored along with a length value.
Such that length? is returning current length.
Steeve
24-Mar-2009
[13042x3]
A string uses several slots of 16 bytes each.
By default, an empty string uses one slot (16 bytes).

If you define a string of 16 bytes length. then, the string uses 
2 slots because the null char at the end of the string can fit in 
the first slot.
*(can't fit)
They are continuous slots, of course
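
The slot arithmetic described here can be sketched as follows. This only models the description in this thread (16-byte slots, one trailing null byte); slots-for is a hypothetical helper, not a REBOL internal.

```rebol
REBOL []

; Hypothetical helper: number of 16-byte slots needed for a string
; of LEN bytes plus one trailing null byte, rounded up.
slots-for: func [len [integer!]] [
    to integer! ((len + 1 + 15) / 16)
]

foreach len [0 15 16 31 32] [
    print [len "bytes ->" slots-for len "slot(s)"]
]
```

Under this model a 15-byte string still fits in one slot, while a 16-byte string spills into a second slot because of the null.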
BrianH
24-Mar-2009
[13045]
Are you referring to R2 or R3? R3's strings are different.
Steeve
24-Mar-2009
[13046]
R2
BrianH
24-Mar-2009
[13047]
Right.
[unknown: 5]
24-Mar-2009
[13048]
I'm curious why you're saying it is 16 bytes.
BrianH
24-Mar-2009
[13049]
Strings are preallocated in multiples of 16 bytes, probably to simplify 
the GC.
[unknown: 5]
24-Mar-2009
[13050]
Oh this is a REBOL thing.
Steeve
24-Mar-2009
[13051]
all values in Rebol are chunked into slots of 16 bytes length
[unknown: 5]
24-Mar-2009
[13052x2]
Got ya.
Seems a waste.
Steeve
24-Mar-2009
[13054]
all series datatypes use continuous slots
BrianH
24-Mar-2009
[13055]
Memory management. Even the stack is a block.
[unknown: 5]
24-Mar-2009
[13056]
I'm just thinking of the memory storage.  It seems a waste from that 
perspective.
Steeve
24-Mar-2009
[13057x3]
we no that, it's a choice
*we know
speed vs memory
BrianH
24-Mar-2009
[13060]
All memory management systems have a little waste. You have to balance 
the memory overhead versus the CPU overhead.