World: r3wp

[!REBOL3-OLD1]

PeterWood
4-Aug-2009
[16485x4]
I think that distinguishing between upper and lower case chars is 
very difficult with Unicode.
Carl seems to have done a great job with Latin characters:

>> uppercase to string! #{C3A0}
== "À"

>> lowercase uppercase to string! #{C3A0}
== "à"

>> uppercase to string! #{C48D}           
== "\u010c"
>> lowercase uppercase to string! #{C48D} 
== "\u010d"
Though I don't know what the above will look like in AltME under Windows 
or Mac.
Pekr
4-Aug-2009
[16489]
looks good - R with comma upon it ...
PeterWood
4-Aug-2009
[16490x2]
There seem to be some problems with other alphabets, though:

>> uppercase to string! #{E382A1}         
== "\u30a1"


\u30A1 is a small katakana letter A. The Unicode code point for a capital 
katakana A is \u30A2.
Pekr - it is actually an a with a grave accent over it in UTF-8
Gabriele
5-Aug-2009
[16492x2]
hmm, should uppercase and lowercase really work with katakana and 
hiragana? the "small" versions have a completely different meaning 
and usage than our "lowercase" has.
it seems to me that uppercase and lowercase should not modify kana... 
but I haven't read what the Unicode standard mandates here.
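One way to cross-check the examples above is against the case mappings in the Unicode database that ships with Python. This is only a sketch in Python, not REBOL, and `describe_case` is a hypothetical helper name; it decodes the same UTF-8 byte sequences PeterWood used and reports each character's name, general category and case mappings. In that data the kana carry no case mapping, so UPPERCASE leaving \u30a1 unchanged is consistent with it.

```python
import unicodedata

def describe_case(utf8_hex):
    """Decode one UTF-8 encoded character (given as hex) and report its case data."""
    ch = bytes.fromhex(utf8_hex).decode("utf-8")
    return {
        "char": ch,
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # e.g. Ll = lowercase letter, Lo = other letter
        "upper": ch.upper(),
        "lower": ch.lower(),
    }

# The three byte sequences from the console examples above.
samples = ["C3A0", "C48D", "E382A1"]
results = [describe_case(h) for h in samples]
for info in results:
    print(info)
```

The first two characters are lowercase Latin letters with uppercase partners; the third is reported as an "other letter" whose upper and lower forms are the character itself.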
PeterWood
5-Aug-2009
[16494]
No doubt you are right. I haven't read the Unicode standard and know 
nothing about "non-Latin" alphabets.
Pekr
5-Aug-2009
[16495]
BrianH: 'and, 'or, 'xor are allowed logical operations upon typesets. 
Do you think it would be useful to also allow 'intersect and 'union, 
to allow creation of combinations?
BrianH
5-Aug-2009
[16496]
Already supported, Pekr :)
Pekr
5-Aug-2009
[16497]
ok then. Carl started to write down the A77 changelog, so it means we 
might be close to the first release .... or do you think he will 
merge some other tickets?
BrianH
5-Aug-2009
[16498x2]
We'll see. Part of the reason for the release is to sync his work 
with the rest of the community. So, soon after a77 I can merge my 
fixes of those 15 CureCode tickets I wrote about modules yesterday, 
as well as other mezzanine fixes.
I think it's mostly a sign that we are reaching the end of this tunnel 
on plugins :)
Pekr
5-Aug-2009
[16500]
no, we are starting to see the light at the end of the tunnel :-)
BrianH
5-Aug-2009
[16501x3]
We saw the light last week :)
Hey, I just this moment figured out hot to do JIT-compiled native 
functions for R3 using Carl's new plugin model...
hot -> how
Pekr
5-Aug-2009
[16504]
:-) cool, isn't it?
BrianH
5-Aug-2009
[16505]
If I can get that method to work, that would allow me to work on 
the compiled REBOL dialect that I have been waiting on user-defined 
function types for. This is good news! It moves forward my schedule 
by months :)
Pekr
5-Aug-2009
[16506]
so you don't need u-types anymore?
BrianH
5-Aug-2009
[16507x3]
By the way, the method would not work with R2's library wrapping 
model - it requires the command! model.
I would need u-types for integration with .NET and other systems, 
but not for my compiled functions idea, as long as the compiled functions 
use the same frame-based marshalling interface that the plugin model 
uses.
Since the plugins only export 3 functions from their library, and 
dispatch calls to commands from a single function, I could add new 
commands at runtime as long as that function has some way to make 
sense of their indexes. Then I could make a plugin that wraps a JIT 
compilation library like libjit or libtcc.
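The single-dispatch idea BrianH describes can be sketched roughly as follows. This is a Python illustration only, not the R3 plugin API: the `Plugin` class, `add_command` and `dispatch` are hypothetical names, and the "frame" is just a list of arguments standing in for the frame-based marshalling he mentions.

```python
class Plugin:
    """Hypothetical stand-in for a plugin with one dispatch entry point."""

    def __init__(self):
        self._commands = []  # position in this list is the command index

    def add_command(self, func):
        """Register a command at runtime and return the index assigned to it."""
        self._commands.append(func)
        return len(self._commands) - 1

    def dispatch(self, index, frame):
        """Single entry point: look up the command by index and pass it the frame."""
        if not 0 <= index < len(self._commands):
            raise IndexError(f"unknown command index {index}")
        return self._commands[index](frame)


plugin = Plugin()
add_idx = plugin.add_command(lambda frame: frame[0] + frame[1])
# A command added later, e.g. after a JIT step has produced a new function.
mul_idx = plugin.add_command(lambda frame: frame[0] * frame[1])

print(plugin.dispatch(add_idx, [2, 3]))
print(plugin.dispatch(mul_idx, [2, 3]))
```

Because callers only ever hold an index, new commands can be appended without changing the entry point, which is the property the JIT idea relies on.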
Reichart
5-Aug-2009
[16510]
http://tlt.its.psu.edu/suggestions/international/bylanguage/japanese.html


To elaborate on what Gabriele said, in most languages, there is 
a large and small version of letters for use usually in sentence 
case, and also for abbreviations, etc.  Over time these began to 
be written differently, so the large and small actually look different.


But in Japanese, small letters have a completely separate meaning, 
sometimes used to elongate a sound, or form a subtle guttural stop.

Here is a sample, it is VERY subtle.


http://christopherfield.com/translation/images/hashiriame/story_a.gif


In this image look for all the symbols that look like a backwards 
letter "C" (or letter "U" that fell to the left).

Sentence 1 - 3rd from the right.
Sentence 6 - 3rd from the left.

Notice they are very subtly different sizes.

That is an example.

Bottom line, as stated, don't mess with caps with Japanese.

(it was hard to find a GOOD example of this in the same image).
BrianH
5-Aug-2009
[16511]
This could be an advantage - there are many languages that support 
capitalization as a concept, but many that don't. The ones that don't 
have more characters than the ones that do (I'm guessing). This means 
that we could use smaller tables/code to do the capitalization in 
LOWERCASE and UPPERCASE - valuable space savings for a tiny language 
like REBOL.
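BrianH's guess about table size can be roughly checked by counting how many code points have any case mapping at all. The sketch below is Python, not REBOL, and `has_case_mapping` is a hypothetical helper; the exact count depends on the Unicode version bundled with the Python build, and the total includes unassigned code points, so treat the ratio as an order-of-magnitude indication only.

```python
def has_case_mapping(ch):
    """True if upper- or lowercasing changes the character."""
    return ch.upper() != ch or ch.lower() != ch

total = 0x110000  # size of the Unicode code space, assigned or not
cased = sum(1 for cp in range(total) if has_case_mapping(chr(cp)))

print(cased, total, f"{cased / total:.2%}")
```

Only a small fraction of the code space is affected, which fits the idea that a compact table can cover LOWERCASE and UPPERCASE.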
Louis
6-Aug-2009
[16512x2]
Is the a replacement for read/lines?
the =  there
Louis
7-Aug-2009
[16514]
I will reword my question.  In R3 how can I read in one line of unicode 
text at a time to process it?
Henrik
7-Aug-2009
[16515]
I think that is not yet implemented. Just ask Pekr. He has talked 
a lot about it. :-)
Pekr
7-Aug-2009
[16516x5]
ah, damned, read/lines. What a crap :-)
Well, my only objection was, that I did not agree to read/text, but 
was suggesting more general read/as data 'decoder ....
But BrianH is patient enough to explain to me that 'text operations 
are pretty common, and that they might deserve special treatment.
What I don't like about REBOL is all those read-text, send-service, 
open-service and other tonnes of mezzanines. But I think that actually 
I might reconsider my pov, and maybe I would prefer read-text or 
read-csv, which could incorporate tonnes of possible refinements, 
instead of giving 'read a special /text refinement .... 'read is too 
low level in R3 ....
IIRC there was also a problem with my proposed approach: currently 
decoders can't stream (and it really sucks), so we could get 
double memory consumption - first reading text, then decoding it. 
That is imo why BrianH proposes read/text, to handle it at a low level. 
But - I don't like it when architecture flaws are fixed by such a workaround. 
Please give me streamed codecs and streamed parse instead ;-)
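What a streamed decoder buys, compared with reading everything and then decoding it, can be sketched like this. The example is Python rather than REBOL and `iter_lines` is a hypothetical helper: it reads fixed-size chunks of bytes, decodes them incrementally so a multi-byte UTF-8 character split across two chunks is handled, and yields one line at a time without holding the whole file in memory. It assumes LF or CRLF line endings only.

```python
import codecs
import io

def iter_lines(stream, chunk_size=4096, encoding="utf-8"):
    """Yield decoded lines from a binary stream, one chunk at a time."""
    decoder = codecs.getincrementaldecoder(encoding)()
    pending = ""  # decoded text not yet terminated by a newline
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += decoder.decode(chunk)
        lines = pending.split("\n")
        pending = lines.pop()  # last piece may be an incomplete line
        for line in lines:
            yield line.rstrip("\r")
    # Flush the decoder; this raises if the input ended mid-character.
    pending += decoder.decode(b"", final=True)
    if pending:
        yield pending.rstrip("\r")


data = "à\r\nč\nァ".encode("utf-8")
# chunk_size=1 forces every multi-byte character to be split across reads.
lines = list(iter_lines(io.BytesIO(data), chunk_size=1))
print(lines)
```

Only the current chunk and the unfinished line are kept at any moment, which avoids the double memory consumption described above.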
Sunanda
7-Aug-2009
[16521]
read/lines in R3.....Best we have in R3 so far is DELINE/LINES. Eg 
from the R3 console:
    deline/lines to-string read %user-db.r
BrianH
7-Aug-2009
[16522x3]
READ/text wasn't my proposal, it was Carl's. I often write the CureCode 
tickets for other people's requests, if they are good ones.
Pekr, expect the number of visible mezzanines to go down after the 
module system is fixed. The code is written already, but we are waiting 
for the plugin-related mezzanine changes before the overall module 
system changes can be merged in.
Louis, there may be a solution to your problem that involves direct 
port access, rather than a READ refinement...
Louis
7-Aug-2009
[16525]
Thanks for the feedback, everybody.  Brian, I'll check into direct 
port access.
Graham
8-Aug-2009
[16526x2]
Request ... I would like now/time to always return the seconds.
Edge conditions every exact minute are annoying...
Sunanda
8-Aug-2009
[16528]
Annoying isn't it? Have you submitted a wish to curecode.org?

For now, I use something like this:
    reduce [x: now/time x/hour x/minute x/second]
    ==[12:30 12 30 0]
Graham
8-Aug-2009
[16529]
no, I thought I'd solicit opinions first!
Sunanda
8-Aug-2009
[16530]
I think it's a good idea!
Graham
8-Aug-2009
[16531]
I just spent a few hours trying to debug someone else's code ... 
and this was the cause.
Henrik
8-Aug-2009
[16532x2]
I agree. Good idea.
however one can use:

to-itime 11:2
== "11:02:00"
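The same idea, always emitting two digits per field including the seconds, looks like this as a Python sketch. The function name `to_itime` is borrowed from the REBOL mezzanine purely for illustration; this is not its implementation.

```python
def to_itime(hours, minutes, seconds=0):
    """Format a time with zero-padded hours, minutes and seconds."""
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

print(to_itime(11, 2))
print(to_itime(12, 30))
```

Code that parses the resulting string then sees the same shape on the exact minute as at any other moment.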
Pekr
8-Aug-2009
[16534]
to-integer now-time