r3wp [groups: 83 posts: 189283]

World: r3wp

[!REBOL3-OLD1]

Henrik
11-Sep-2009
[17458x2]
console is going to be down for a bit while I do a rewrite of the 
console code.
thanks for testing, everyone
Pekr
11-Sep-2009
[17460x3]
I am trying to do simple CGI tests, using Cheyenne, and in reference 
to following blog: http://www.rebol.net/r3blogs/0182.html

For: http://localhost/show.cgi?test- I do get:

Content-type GE te 127.0 


Why always only 2 bytes? Is that actually two bytes? I would say 
- two "elements"

The code is:

#!c:\!rebol\altme\worlds\r3-gui\files\rebdev\view.exe -q

REBOL [
    Title:      "show"
    File:       %show.cgi
]

print "Content-type: text/html^/"
print get-env "REQUEST_METHOD"
print get-env "QUERY_STRING"
print get-env "REMOTE_ADDR"
print newline

Is that R3 problem, or Cheyenne problem?
First thing I wonder about is - when you use function 'usage, it 
reports --cgi is available. Is it internally utilised at all?
Hmm, using no command line option, I do get:

Checking for rebol.r file in /c/!reb Evaluating: Content-type GE 
te 127.0 


The output is strangely stripped too ... as if R3 print output were 
being stripped ...
Dockimbel
11-Sep-2009
[17463]
You should start your kernel with --cgi in order to activate CGI 
related processing. I don't know if it is supported by R3 alphas 
yet.
Pekr
11-Sep-2009
[17464]
it does just the same as -q option, as per above blog article. If 
I provide no command option, I get strange output :-)


Checking for rebol.r file in /c/!reb Evaluating: Content-type make 
object! [ request-method: "GET" que
Maxim
11-Sep-2009
[17465]
I think --cgi option won't be needed in R3 since the I/O seems to 
be much closer to the shell and get-env actually is available, removing 
the need for the cgi handling by the script on launch.
Pekr
11-Sep-2009
[17466x2]
... Maybe BrianH will know, what is going on ....
Max - correct, but why is the output stripped? And using no command 
line option - what about the debugging info above? Why is R3 telling 
me it looks for a rebol.r file? :-)
Maxim
11-Sep-2009
[17468x5]
you should be able to redirect only the std err, which shouldn't be dumped 
to the console... this is why std error exists.  works in R2 IIRC, don't 
know if the std err is accessible directly in R3 though.
probably because it spits out unicode which has 0 bytes in them.
might want to try converting the strings to binary before printing.
optionally encoding them in ascii first... http headers are ascii.
0 bytes meaning, bytes with the value "0". which act as null terminators 
in C land.
Dockimbel
11-Sep-2009
[17473]
Issue reproduced here, it seems related to unicode strings output 
by your script.
Pekr
11-Sep-2009
[17474]
hmm, strange. What can I do about it? IE displays chars correctly, 
the output in FF is weird, and I can't correct it by changing charset 
to any other setting ...
Dockimbel
11-Sep-2009
[17475]
11/9-16:01:10.375-[uniserve] Output =>
{C^@o^@n^@t^@e^@n^@t^@-^@t^@y^@p^@e^@
G^@E
n^@o^@
1^@2^@7^@.^@0


}
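The byte dump above is consistent with one possible explanation: the text is held as two-byte little-endian characters, but only as many bytes as there are characters get written, so roughly half of each string survives. This is a guess drawn from the log, not a confirmed description of R3 internals. A minimal Python sketch of that hypothesis (Python is used only for illustration; `truncated_write` and `caret` are hypothetical helper names):

```python
def truncated_write(text: str) -> bytes:
    """Encode as two-byte little-endian units, then keep only
    len(text) bytes, i.e. the character count misused as a byte count."""
    wide = text.encode("utf-16-le")
    return wide[:len(text)]


def caret(data: bytes) -> str:
    """Render zero bytes as ^@, the notation used in the log above."""
    return data.decode("latin-1").replace("\x00", "^@")


# Values taken from the CGI script and its environment in this thread.
for value in ("Content-type: text/html\n", "GET", "127.0.0.1"):
    print(caret(truncated_write(value)))
```

Under this assumption the three values come out as the "Content-type", "GE" and "127.0" fragments seen in the browser, with a zero byte after most characters.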
Pekr
11-Sep-2009
[17476]
C?o?n?t?e?n?t?-?t?y?p?e? m?a?k?e? ?o?b?j?e?c?t?!? ?[? ? ? ? ? ?r?e?q?u?e?s?t?-?m?e?t?h?o?d?:? 
?"?G?E?T?"? ? ? ? ? ?q?
Maxim
11-Sep-2009
[17477]
the header MUST be printed out in ASCII.
Pekr
11-Sep-2009
[17478x2]
Do we have any string encodings in R3 already?
Max - following blog does not imply that. Why should I do it on my 
localhost? It properly knows the codepage, etc.? http://www.rebol.net/r3blogs/0182.html
Maxim
11-Sep-2009
[17480x3]
it being so old, it's possible the default encoding was still askin 
at that point.
askin = ascii
AFAIK unicode -> ascii is possible in R3 but I don't know how... not 
having done it myself.  IIRC it's on the R3 wiki or docs pages somewhere.... 
googling it should give you some clues.
Pekr
11-Sep-2009
[17483x2]
REBOL 3.0 accepts UTF-8 encoded scripts, and because UTF-8 is a superset 
of ASCII, that standard is also accepted.

If you are not familiar with the UTF-8 Unicode standard, it is an 8 bit encoding that accepts 
ASCII directly (no special encoding is needed), but allows the full 
Unicode character set by encoding them with characters that have 
values 128 or greater.
It should accept Ascii directly ....
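The quoted documentation's claim about UTF-8 can be checked outside REBOL. A small Python sketch (illustration only, it says nothing about how R3 prints): pure ASCII text has identical bytes in both encodings, while a non-ASCII character such as the Czech "č" is encoded using only byte values of 128 or greater.

```python
header = "Content-type: text/html"

# ASCII text produces the same bytes whether encoded as ASCII or UTF-8.
ascii_bytes = header.encode("ascii")
utf8_bytes = header.encode("utf-8")
same_bytes = ascii_bytes == utf8_bytes

# A non-ASCII character becomes a multi-byte sequence in UTF-8,
# and every byte of that sequence has a value of 128 or more.
czech = "č".encode("utf-8")
all_high = all(byte >= 128 for byte in czech)

print(same_bytes, czech.hex(), all_high)
```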
Maxim
11-Sep-2009
[17485x4]
that's on input.
print spits out unicode.
AFAIK
string! printing, to be more precise.  UTF and ASCII are  converted 
to two byte strings IIRC.  which is why you must re-encode them before 
spitting them via print.
Pekr
11-Sep-2009
[17489]
see the system/catalog/codecs for a list of loaded codecs

 - hmm, docs need an update. Dunno why the section was moved to system/codecs 
 ... will ask on R3 chat ...
PeterWood
11-Sep-2009
[17490]
Max - I believe that Carl has written some tricky string code and 
strings can be either single or double byte depending on their content.
Maxim
11-Sep-2009
[17491]
possible, but I've always seen them output as double byte...  this 
topic has come around a few times in the last months
PeterWood
11-Sep-2009
[17492]
Running R3 from the Mac terminal the output from the print function 
is definitely utf-8 encoded.
Pekr
11-Sep-2009
[17493]
I tried to look-up some codecs, but there are none for text encodings 
as of yet:

SYSTEM/CODECS is an object of value:
   bmp             object!   [entry title name type suffixes]
   gif             object!   [entry title name type suffixes]
   png             object!   [entry title name type suffixes]
   jpeg            object!   [entry title name type suffixes]
PeterWood
11-Sep-2009
[17494]
I think that to binary! will encode a Rebol string! to utf-8 :

>> to binary! "^(20ac)"  ;; Unicode code point for Euro sign     
== #{E282AC} ;; utf-8 character sequence for Euro sign
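The Euro sign result above matches what any UTF-8 encoder produces for code point U+20AC. A cross-check in Python, for illustration only:

```python
euro = "\u20ac"  # Unicode code point U+20AC, the Euro sign

# UTF-8 encodes this code point as three bytes.
encoded = euro.encode("utf-8")
hex_form = encoded.hex().upper()

print(hex_form)
```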
Maxim
11-Sep-2009
[17495x3]
maybe peter's excellent encoding script on rebol.org could be used 
as a basis for converting between ascii -> utf8  when using R3 binary 
 as an input.  while R3 has them built-in
while = until
sort of like:

print to-ascii to-binary "some text"
Pekr
11-Sep-2009
[17498]
I don't want to encode anything for simple CGI purposes, gee ;-)
Maxim
11-Sep-2009
[17499x2]
but R3 is now fully encoded, which is REALLY nice.  you don't have 
a choice.  Resistance is futile  ;-)
and the fact that binary gives us the real byte array without any 
automatic conversion is also VERY nice, for building tcp handlers... 
it would have made my life much simpler in the past in fact.
Pekr
11-Sep-2009
[17501x2]
But this is some low level issue I should not care about. It displays 
the Czech codepage correctly. Also the script is said to be UTF-8 by 
default, which is a superset of ASCII. IIRC it was said that as long as 
we do not use special chars, it will work transparently. If it 
works on input, it should work also on output, no?
OK, so we have http headers, which are supposed to be in ASCII, and 
then html content, which can be encoded. Whose responsibility is 
it to provide correct encoding? The coder's, or the http server's? Hmm, 
maybe the coder's, as I am issuing http content headers in my scripts?
PeterWood
11-Sep-2009
[17503]
Pekr: Just try a quick test with: 
 print to binary! "Content-type: text/html^/"
 print to binary! get-env "REQUEST_METHOD"
 print to binary! get-env "QUERY_STRING"
 print to binary! get-env "REMOTE_ADDR"

to see if it is an encoding problem.
Pekr
11-Sep-2009
[17504x2]
I think I tried, but it printed binaries ...
#{436F6E74656E742D74797065 #{474 #{ #{3132372E3 #{0
Maxim
11-Sep-2009
[17506]
but the loading actually does a re-encoding.  utf-8 is compact, but 
it's slow because you cannot skip unless you traverse the string char 
by char.  which is why they are internally converted to 8 or 16 bit 
unicode chars... it seems strings become 16 bits a bit too often 
(maybe a change in later releases, where they are always converted 
to 16 bits for some reason).
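The trade-off described here, compact UTF-8 versus fixed-width characters that can be indexed directly, can be sketched in Python. This illustrates the general technique only and makes no claim about how R3 stores strings; `utf8_char_offset` and `fixed_width_offset` are hypothetical helper names.

```python
def utf8_char_offset(data: bytes, index: int) -> int:
    """Find the byte offset of character `index` in UTF-8 data.
    Characters vary in width, so the bytes must be walked from the start."""
    count = 0
    for offset, byte in enumerate(data):
        # Continuation bytes look like 10xxxxxx; anything else starts a character.
        if byte & 0xC0 != 0x80:
            if count == index:
                return offset
            count += 1
    raise IndexError(index)


def fixed_width_offset(index: int, width: int = 2) -> int:
    """With fixed-width characters the offset is simple arithmetic."""
    return index * width


text = "€ab"
print(utf8_char_offset(text.encode("utf-8"), 1), fixed_width_offset(1))
```

The first function has to scan every preceding byte, while the second is a single multiplication, which is the cost difference the message refers to.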
PeterWood
11-Sep-2009
[17507]
The content of the binaries is fine but their format is a problem. 
Sorry, I forgot about that when I suggested to try them.
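The point about content versus format can be checked by decoding the hex digits that were printed. A Python sketch for illustration (`mold_binary` is a hypothetical helper that imitates the `#{...}` notation seen in the output above):

```python
def mold_binary(data: bytes) -> str:
    """Imitate the #{...} source notation that was printed instead of raw bytes."""
    return "#{" + data.hex().upper() + "}"


# Hex digits copied from the truncated output in this thread.
first = bytes.fromhex("436F6E74656E742D74797065")
last = bytes.fromhex("3132372E")

print(first, last, mold_binary(b"GET"))
```

The digits decode to the start of the header and of the remote address, so the bytes themselves are the expected ASCII. What reaches the browser, however, is the textual `#{...}` form of the binary and not the raw bytes.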