
World: r4wp

[#Red] Red language group

DocKimbel
7-Dec-2012
[4610]
Closed-source: I guess we are now all well-aware how disastrous it 
can be. Open-source is a superior way of building good and long-lasting 
software, even investors can understand that. :-)
Steeve
7-Dec-2012
[4611]
Be careful Doc, you may bring back not only Chinese investors but 
also some pretty girls
Pekr
7-Dec-2012
[4612]
:-D
DocKimbel
7-Dec-2012
[4613]
Steeve: pretty girls are not restricted to China. :-)
Steeve
7-Dec-2012
[4614x2]
Still, one knows guys who went to distant Asia single and 
came back married.
It's a well-known trap ;-)
Kaj
7-Dec-2012
[4616x2]
Yeah, almost all my cousins
I like the # suffix proposal. It stands out better than the h suffix 
and looks more in line with REBOL
Steeve
7-Dec-2012
[4618x2]
Arghhh! My first time compiling something to Red:
-= Red Compiler =-
Compiling red/tests/sorting.red ...
*** Red Compiler Internal Error: Script Error : copy expected range argument of type: number series port pair
*** Where: process
*** Near:  [stack/push to type copy/part s]
I ran %run-all at first and it was alright
DocKimbel
7-Dec-2012
[4620x3]
Looks like an issue with the changes I made for ticket https://github.com/dockimbel/Red/issues/319
You can try to compile it with the -v 9 option to better locate the 
line that triggers the error.
Could you post your script there if short enough (else just post 
it to me privately or on the bugtracker)?
Steeve
7-Dec-2012
[4623]
Actually I know there are errors in the script since I did not try 
to translate it from R3 to Red. But I was expecting something else 
for a starter
DocKimbel
7-Dec-2012
[4624x2]
It would be nice if we could fix that compilation bug before the 
new release.
Hmm, sorry it is not related to #319, it looks more like a lexer 
issue. You are probably passing a datatype that is not yet implemented 
in Red.
Steeve
7-Dec-2012
[4626x2]
I cut my script and got the same error with just the following:

Red []
;*** Bottom-up-heapsort ***
heapify: func [s start len comp /local step sav inc][
	inc: 0
	sav: s/:start
	;-- search terminal leaf.
	step: start
	while [len > step: 2 * step][
		++ inc
		unless comp s/(++ step) s/:step [-- step]
	]
	either step = len [++ inc][step: shift step -1]
	;-- bottom-up, search insertion point
	loop inc [
		unless comp s/:step sav [break]
		step: shift step -1
		-- inc
	]
	;-- bottom-up swap
	loop inc [						;-- chain swap
			s/:step: also sav sav: s/:step
			step: shift step -1
	]
	s/:step: sav
]
I have not tried to translate anything... T_T
DocKimbel
7-Dec-2012
[4628x3]
The lexer is choking on get-word used in path...let me see that...
Actually, it's blocking on s/(++ step), such syntax should be supported 
by the lexer, so there's a bug there.
Steeve: I have fixed the lexer bug, so it should at least load correctly 
now. But paren! values in paths are not yet compiled, so you'll get a 
"feature not implemented" error at compilation.


Also, passing a function as an argument is not yet correctly handled. 
I'm also unsure whether s/:step: will be compiled correctly, as we 
don't yet have many tests for path accesses.
Kaj
7-Dec-2012
[4631]
All examples compile without warnings now
DocKimbel
7-Dec-2012
[4632x2]
Thanks!
(for testing)
Gregg
7-Dec-2012
[4634]
For hex notation in REBOL, I've used (albeit dynamically) a simple 
HEX function with issues. 

  hex #20000001


I'm OK with the suffix approach, but if a prefix approach works I 
like that the prefix clues you in to what you're reading, rather 
than reading the number and then seeing the suffix. The question 
is what sigil to use, if lexical space becomes very tight, as in 
REBOL. Do you have any plans for &?

  &HFFFF000F
  &O77770007  ; though I don't think we need octal
  &B11110001
Maxim
7-Dec-2012
[4635]
the following are currently invalid REBOL notations (the first three 
load in R2 but get scrambled)

I prefer the first three, since they are pretty obvious without any 
knowledge of the language.

16#FFFF000F
8#7124554764
2#0110110101

H#FFFF000F
O#7124554764
B#0110110101
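
The two proposals above map onto a radix and a digit string in the same way. The following is a minimal Python sketch of that reading, not Red or REBOL code; `parse_literal` and `LETTER_RADIX` are hypothetical names, and none of these notations had been adopted at this point in the thread.

```python
# Hypothetical helper, for illustration only: it reads the two notations
# proposed above ("16#FFFF000F" and "H#FFFF000F") as plain integers.
LETTER_RADIX = {"H": 16, "O": 8, "B": 2}  # letter prefixes from the second proposal


def parse_literal(text):
    """Parse 'radix#digits' (e.g. 16#FF) or 'letter#digits' (e.g. H#FF)."""
    prefix, sep, digits = text.partition("#")
    if not sep or not digits:
        raise ValueError(f"not a radix literal: {text!r}")
    if prefix.upper() in LETTER_RADIX:
        radix = LETTER_RADIX[prefix.upper()]
    else:
        radix = int(prefix)  # numeric prefixes such as 16, 8, 2
    return int(digits, radix)


print(parse_literal("16#FFFF000F") == parse_literal("H#FFFF000F"))
```

Under this reading both families carry the same information, so the choice between them is about readability rather than expressive power.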
Gregg
7-Dec-2012
[4636]
I like having the numbers in binary! values, but not as much for 
this. My brain says "this is a binary in base 16 notation", but for 
hex or binary literals, I want to think of the words 'hex and 'binary, 
rather than "this is a base-16 number, which means it's in hex format". 
I think I looked for alternate notations a long time ago. Have to 
see if I can find my notes.
DocKimbel
7-Dec-2012
[4637]
I have found an issue with word! value casing in Red. The Red/System 
code generated for:
	print 'a = 'A
is:
          stack/mark-native ~print
          stack/mark-native ~strict-equal?
          word/push ~a
          word/push ~A
          natives/strict-equal?*
          stack/unwind
          natives/print*
          stack/unwind


The problem is that Red/System is case-insensitive, so ~a and ~A 
are the same variable. So, no way to make it work like that. I see 
two options for solving it:

1) Make Red/System case-sensitive.

2) Deep encode each Red generated symbol to distinguish lower and 
uppercases.


Solution 2) works, but it makes symbol decoration very costly (each 
character of a symbol is prefixed with one sigil if it is lowercase 
and another if it is uppercase). The example above becomes:

          stack/mark-native ~_p_r_i_n_t
          stack/mark-native ~_s_t_r_i_c_t_-_e_q_u_a_l_?
          word/push ~_a
          word/push ~-A
          natives/strict-equal?*
          stack/unwind
          natives/print*
          stack/unwind


So, it is not nice, it doubles every Red symbol size that is handled 
by Red/System and slows down Red compilation by 25%.

So, my questions are:
a) Does anyone see another cheaper solution to this problem?

b) In case of option 1), do you have anything against making Red/System 
identifiers case-sensitive?
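
The decoration described in option 2) can be sketched in a few lines of Python (this is not the Red compiler's code, and `deep_encode` is a hypothetical name). The rule for non-letters is an assumption inferred from the example above, where the `-` and `?` of strict-equal? also receive the `_` sigil.

```python
def deep_encode(symbol):
    """Prefix each uppercase character with '-' and every other character
    with '_', so the result stays distinct after case folding."""
    return "~" + "".join(("-" if ch.isupper() else "_") + ch for ch in symbol)


for name in ("print", "strict-equal?", "a", "A"):
    print(name, "->", deep_encode(name))
```

Apart from the leading `~`, every character costs two, which is the doubling of symbol size mentioned above.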
Kaj
7-Dec-2012
[4638]
Hm, I like that Red/System is case-insensitive like REBOL, so I would 
consider it a sacrifice to have to let go of that
DocKimbel
7-Dec-2012
[4639x3]
Hmm, actually, another option should be possible: generating a unique 
new symbol for identical words that differ only in casing. I will test 
it tomorrow. Anyway, if you have ideas/remarks about this, let me 
know.
Anyway, I don't think we use different casing for identifiers in 
Red/System. Even in REBOL, I don't remember ever using same words 
with different casing in the same app.
I would like to fix this issue and make word comparison operators 
work for the new release, so I'll postpone the release until tomorrow.
Gregg
7-Dec-2012
[4642x3]
Do you know how REBOL handles it? I prefer case-insensitive in general, 
but doubling the size of identifiers seems bad, even if hidden from 
us for the most part.
Case-sensitivity could trip up a lot of REBOLers. I know this is 
Red/System, but still. You may also find that people treat it as 
a feature and start giving things names that differ only in case, 
as happens in C.
What are the biggest downsides to having Red/System remain case-insensitive? 
That is, what does case sensitivity buy us?
Kaj
7-Dec-2012
[4645x4]
In REBOL, 'a and 'A are aliases of the same symbol. Red/System converts 
them to their integer identifier, right? I'd say you need different 
identifiers for aliases somehow to implement the REBOL semantics 
of distinguishing equal? and strict-equal?
That is, identifiers need two levels: the first level for identifying 
the symbol, and the second level for distinguishing aliases
The most space efficient encoding I can come up with would be something 
like ~a-1 for 'a and ~A-2 for 'A. That would be cheap to evaluate 
for strict-equal? but expensive for equal?
A faster encoding would be to reserve a part of the integer identifier 
for the alias number, for example one byte. That would reduce the 
number of different symbols to 2^24 and the maximum number of aliases 
for one symbol to 256. That would only allow a word up to 8 characters 
to have all its aliases, but it would be cheap to evaluate for both 
strict-equal? and equal?
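
The byte-reservation idea can be sketched as follows, in Python rather than Red/System. It assumes a 32-bit identifier split into a 24-bit symbol index and an 8-bit alias number; `pack`, `equal` and `strict_equal` are hypothetical names, not Red functions.

```python
ALIAS_BITS = 8                      # one byte reserved for the alias number
ALIAS_MASK = (1 << ALIAS_BITS) - 1  # 255
SYMBOL_LIMIT = 1 << 24              # 2^24 distinct symbols remain


def pack(symbol_index, alias_number):
    """Combine a symbol index and an alias number into one identifier."""
    if not 0 <= symbol_index < SYMBOL_LIMIT:
        raise ValueError("symbol index out of range")
    if not 0 <= alias_number <= ALIAS_MASK:
        raise ValueError("alias number out of range")
    return (symbol_index << ALIAS_BITS) | alias_number


def equal(a, b):
    """Case-insensitive comparison: ignore the alias byte."""
    return a >> ALIAS_BITS == b >> ALIAS_BITS


def strict_equal(a, b):
    """Case-sensitive comparison: the whole identifier must match."""
    return a == b


lower_a = pack(42, 0)  # hypothetical identifier for 'a
upper_a = pack(42, 1)  # hypothetical identifier for 'A
print(equal(lower_a, upper_a), strict_equal(lower_a, upper_a))
```

Both comparisons reduce to a shift or a direct integer compare. The 8-character limit follows from 2^8 = 256 possible casings of an 8-letter word.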
DocKimbel
8-Dec-2012
[4649x5]
In REBOL, 'a and 'A are aliases of the same symbol. Red/System converts 
them to their integer identifier, right?


Symbols have two representations in the Red compiler: one at runtime 
(like in REBOL), the other at compile time, in the form of Red/System 
variables. In a very early version of the compiler, I was using integers 
(indexes in the symbol table) instead of variables, but quickly realized 
that it was obfuscating the generated Red/System code a lot, making 
it difficult to debug. Also, the integer approach had an additional 
runtime cost, as it required an array access in order to retrieve 
the symbol value.


Currently, the Red/System ~<name> variables directly point to a word! 
value version, instead of a symbol! for simplicity and efficiency.
I have implemented a compile-time aliasing system for identical words 
with different casing. It works fine so far and is cheap compared 
to other options (it requires a conversion table (symbol->alias) 
to be maintained during the compilation).
Aliases are already implemented in the symbol! type. Basically a 
word! relies on a symbol ID, which is an entry in the symbol table. 
Each entry in this table is a symbol! value that references the 
internal Red string! value and a possible alias ID (which is just 
another symbol ID).


Now, I just need to add alias handling in the equal? and strict-equal? 
natives when applied on words to make it work correctly.
What are the biggest downsides to having Red/System remain case-insensitive? 
That is, what does case sensitivity buy us?


Good question. I think it doesn't buy us anything, nor does it take 
any useful feature away from us. Actually, I think that as long as you 
are consistent in the way you name your identifiers (variables, functions, 
contexts,...), you are case-neutral. So, having Red/System case-sensitive 
wouldn't change anything for me and I guess it would be the same 
for others.


Anyway, I prefer to keep it case-insensitive for now, for the sake 
of consistency with Red, unless I really need to change it.
Ok, now equality comparison operators work on all word datatypes.
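
The symbol table layout described above (each entry holding a string plus an optional alias ID that is itself a symbol ID) can be sketched like this in Python. The class and method names are hypothetical, and the way Red actually resolves aliases inside equal? and strict-equal? is not shown in this thread, so the comparison logic is an assumption.

```python
class SymbolTable:
    """Toy symbol table: entry = (spelling, alias_id or None)."""

    def __init__(self):
        self.entries = []      # index in this list is the symbol ID
        self.by_spelling = {}  # exact spelling -> symbol ID
        self.by_folded = {}    # lowercased spelling -> first symbol ID seen

    def intern(self, spelling):
        """Return the symbol ID for a spelling, creating an entry if needed."""
        if spelling in self.by_spelling:
            return self.by_spelling[spelling]
        # A later casing of a known word aliases the first one registered.
        alias_id = self.by_folded.get(spelling.lower())
        symbol_id = len(self.entries)
        self.entries.append((spelling, alias_id))
        self.by_spelling[spelling] = symbol_id
        self.by_folded.setdefault(spelling.lower(), symbol_id)
        return symbol_id

    def canonical(self, symbol_id):
        """Follow the alias ID, if any, to the first-registered spelling."""
        alias_id = self.entries[symbol_id][1]
        return symbol_id if alias_id is None else alias_id

    def equal(self, a, b):
        return self.canonical(a) == self.canonical(b)

    def strict_equal(self, a, b):
        return a == b


table = SymbolTable()
a, upper_a = table.intern("a"), table.intern("A")
print(table.equal(a, upper_a), table.strict_equal(a, upper_a))
```

In this sketch equal? costs one extra table lookup per operand, while strict-equal? stays a plain ID comparison.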
Gregg
8-Dec-2012
[4654]
Thanks Doc. This is good information to put in a doc somewhere, even 
if just as a reminder to formally doc it later.
BrianH
8-Dec-2012
[4655x2]
Why would = translate to strict-equal? - shouldn't that be == instead?
This is one area where copying R3 as it is now would be a bad idea 
though. See http://issue.cc/r3/1834 for details.
DocKimbel
8-Dec-2012
[4657x2]
Brian: wrt '=, it's a typo, it should be ==.
I haven't implemented EQUIV? yet, I'll look at it when we have 
complete IEEE-754 support (we are missing INFs and NaN handling 
in Red/System).
Marco
8-Dec-2012
[4659]
About hex notation etc. (I like case insensitivity for numbers):
0&a1B
0%10110
or
0b10110
0ha1B