
World: r4wp

[#Red] Red language group

Pekr
26-Sep-2012
[2199]
Well, anyway - how is R2 able to read UTF-8?
DocKimbel
26-Sep-2012
[2200]
It reads it as a stream of bytes. As UTF-8 doesn't use null bytes 
in its encoding (except for codepoint 0), it can be fully loaded 
as string! or binary! in R2 (but you'll see garbage for non-ASCII 
characters).
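
A minimal Python sketch of the point above, not R2 code. It assumes Latin-1 as a stand-in for whatever 8-bit code page a byte-oriented reader would apply, and the sample text is only an illustration. It checks that the UTF-8 bytes contain no null byte and that each non-ASCII character turns into two garbage characters when every byte is shown on its own.

```python
# Illustrative sample text; any string with non-ASCII characters would do.
text = "Dobr\u00fd den sv\u011bte"

# UTF-8 encoding: ASCII stays one byte, each accented letter here takes two.
raw = text.encode("utf-8")

# Only codepoint 0 encodes to a zero byte, so this text has none.
has_null_byte = 0 in raw

# Assumption: Latin-1 stands in for the 8-bit view of a byte-oriented reader.
# Every byte becomes one character, so multi-byte sequences turn into garbage.
garbled = raw.decode("latin-1")

print(len(text), "characters,", len(raw), "bytes")
print("contains a null byte:", has_null_byte)
print(ascii(garbled))
```

The ASCII part of the string survives unchanged, which is why such a file still loads and mostly reads correctly.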
PeterWood
26-Sep-2012
[2201]
If anybody can provide the UTF-8 chars (hex values) for Hello World 
in Czech, I'll run a test.
DocKimbel
26-Sep-2012
[2202x3]
Peter: should be "Dobr^(C3)^(BD) den sv^(C4)^(9B)t"
I've just tested it on Windows console (using Consolas font), it 
works fine.
The above string doesn't work as-is in Red though; you should pass 
the escaped codepoints instead of the UTF-8 encoding.
PeterWood
26-Sep-2012
[2205]
I noticed :-)
DocKimbel
26-Sep-2012
[2206]
I haven't implemented full char! support yet, so I can't write a 
Red script to print me the right codepoint values...(char! will be 
implemented later today though).
Jerry
26-Sep-2012
[2207]
Hello in many languages: http://www.omniglot.com/language/phrases/hello.htm
DocKimbel
26-Sep-2012
[2208x2]
Good source, even Klingon is there! :-)
Hello in Klingon:

nuqneH (What do you want?)
- used when confronted by another
PeterWood
26-Sep-2012
[2210]
Code points are 00FD & 011B
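
The relation between the UTF-8 byte pairs posted earlier (C3 BD and C4 9B) and these code points can be checked with a short Python sketch. The helper name `utf8_hex_to_codepoint` is hypothetical and exists only for this illustration.

```python
def utf8_hex_to_codepoint(hex_bytes: str) -> int:
    """Decode one UTF-8 encoded character, given as hex, to its code point."""
    char = bytes.fromhex(hex_bytes).decode("utf-8")
    if len(char) != 1:
        raise ValueError("expected exactly one encoded character")
    return ord(char)


for hex_bytes in ("C3 BD", "C4 9B"):
    codepoint = utf8_hex_to_codepoint(hex_bytes)
    print(f"{hex_bytes} -> U+{codepoint:04X}")
```

Under these assumptions the two byte pairs decode to U+00FD and U+011B, which are the values to use in the escaped form.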
DocKimbel
26-Sep-2012
[2211x2]
I really need to learn more, my Klingon is currently limited to 
"Qapla'" only (means Goodbye).
Thanks Peter. It works for me with "Dobr^(FD) den sv^(011B)t"
Pekr
26-Sep-2012
[2213]
Above works ... but when I write it directly in Notepad (and the 
file claims it is UTF-8), it does not work ... strange then ...
Henrik
26-Sep-2012
[2214]
Not sure if Notepad is the best for UTF-8 work...
DocKimbel
26-Sep-2012
[2215x3]
Pekr: it might be a BOM issue with the Red loader, I don't remember 
testing it...
Pekr: try to set the "encoding" field to UTF-8 in the saving panel 
(Save as...).
Here, I have no issue using Notepad to write Red Unicode scripts.
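
For context on the BOM guess: Notepad of that era prefixes files saved as UTF-8 with the three bytes EF BB BF. The Python sketch below only illustrates what such a prefix does to a loader that does not expect it. Whether the Red loader was actually affected is not established in this thread, and the script text is a made-up sample.

```python
import codecs

# Made-up sample script, saved the way Notepad would: BOM first, then UTF-8.
script = 'print "Dobr\u00fd den sv\u011bte"'
saved = codecs.BOM_UTF8 + script.encode("utf-8")

# A plain UTF-8 decode keeps the BOM as an invisible first character U+FEFF.
naive = saved.decode("utf-8")

# The "utf-8-sig" codec drops a leading BOM if one is present.
tolerant = saved.decode("utf-8-sig")

print(saved[:3].hex().upper())   # the three BOM bytes
print(naive == script)           # False: extra U+FEFF at the start
print(tolerant == script)        # True
```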
Pekr
26-Sep-2012
[2218]
it is set to UTF-8 already ....
DocKimbel
26-Sep-2012
[2219]
You can still download Notepad++ (or any other text editor with decent 
Unicode support). I have to drop my good old TextPad as it doesn't 
have good Unicode support.
Pekr
26-Sep-2012
[2220x5]
I will do some more testing later ...
There are definitely some bugs somewhere ...
Where can I find the code page for our chars, so that I can give you 
escaped values to try? Some chars of the Czech extended alphabet are 
OK, some are not ...
The above mentioned C4 9B is causing the following output at the end 
of the phrase - strange ....

http://www.xidys.com/pekr/red/red-unicode-bug.jpg
I first used no diacritic, just "e", then changed it to "ě" ("e" with 
a hook above it), and it added those strange chars to the end of the 
string ...
DocKimbel
26-Sep-2012
[2225x2]
Pekr: you should look at our more recent posts, the first string 
I posted had the wrong codes. The right string is:

Dobr^(FD) den sv^(011B)t
You can find the codepoints you need here: http://en.wikipedia.org/wiki/List_of_Unicode_characters
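
As an alternative to looking each character up by hand, the escaped form can be generated. The Python sketch below is a hypothetical helper (`to_caret_escapes` is not part of Red or REBOL) that rewrites every non-ASCII character as `^(hex)`. The padding to two or four hex digits is an assumption copied from the strings in this thread, and a literal `^` in the input is not handled.

```python
def to_caret_escapes(text: str) -> str:
    """Replace each non-ASCII character with a ^(hex codepoint) escape."""
    parts = []
    for ch in text:
        codepoint = ord(ch)
        if codepoint < 128:
            # Plain ASCII is passed through unchanged.
            parts.append(ch)
        else:
            # Assumption: 2 hex digits up to U+00FF, otherwise at least 4,
            # mirroring "^(FD)" and "^(011B)" as written in this thread.
            width = 2 if codepoint <= 0xFF else 4
            parts.append(f"^({codepoint:0{width}X})")
    return "".join(parts)


print(to_caret_escapes("Dobr\u00fd den sv\u011bt"))
```

For the string above, this reproduces the escaped form already posted: Dobr^(FD) den sv^(011B)t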
Ladislav
26-Sep-2012
[2227]
Doc, a minor nitpick: it is a vocative and thus the correct spelling 
should be

Dobr^(FD) den sv^(011B)te
Pekr
26-Sep-2012
[2228x3]
OK, but why do I need those codes in the first place?
In R3, if the script is in the UTF-8 format, I can imo directly type 
it in Notepad ...
... but - it has been a long time since I tried it, so not sure ...
DocKimbel
26-Sep-2012
[2231x2]
Pekr: I'm giving you the codes because it seems something is going 
wrong with your editor. In my Notepad or any other decent text editor 
here, I can type them (or copy/paste them from the web) without any 
issue.
Ladislav: nitpicking accepted in such a case. :-)
Pekr
26-Sep-2012
[2233x2]
Doc - I can type without any issue, it is just that it does not display 
correctly in my console :-)
So, give me an email, I will send the exact script to you for you 
to try ....
DocKimbel
26-Sep-2012
[2235x2]
[nr-:-red-lang-:-org]
You should have zipped the hello.red script to preserve its original 
encoding.
Pekr
26-Sep-2012
[2237]
done
DocKimbel
26-Sep-2012
[2238]
I've reproduced your issue
Pekr
26-Sep-2012
[2239]
cool :-) So what's the issue about?
DocKimbel
26-Sep-2012
[2240x4]
Don't know, looking into it...the ! character seems correctly encoded.
Might be a corruption bug.
...or rather a null character termination bug.
Pekr: I've fixed several bugs thanks to your HelloWorld, thank you. 
:-) Let me know if it's fine with the new commit.
Pekr
26-Sep-2012
[2244]
Partially fixed, but my suspicion is that there might be other related 
problems. To explain - I replaced "Dobry den" (good day) with "Vitej" 
(Welcome), with the comma above the "i" char. Or try with just an 
ordinary "i". Simply put - insert the following line after the Czech 
HelloWorld:

print "Vitej svete" 

Here it prints only "Vitej sv"
PeterWood
26-Sep-2012
[2245]
It works for me on OSX and Windows 7
Pekr
26-Sep-2012
[2246]
uh, in the above line, in the word "svete", the first "e" has a hook 
above it, like in the original HelloWorld you used for Czech ...
Ladislav
26-Sep-2012
[2247]
By "hook" he means the "caron" diacritic, also known as "hacek", 
"wedge", "inverted circumflex" or "inverted hat".
Pekr
26-Sep-2012
[2248]
Sent a new file to Doc. Doc - btw - the last sentence (which prints 
correctly) means something like "Way too much yellow horse groaned 
devilish odes" :-) That sentence is known in our language as a test 
of all possible special chars in the Czech language that still has 
some meaning ...