World: r3wp
[!REBOL3-OLD1]
PeterWood 4-Aug-2009 [16485x4] | I think that distinguishing between upper and lower case chars is very difficult with Unicode. |
Carl seems to have done a great job with Latin characters: >> uppercase to string! #{C3A0} == "À" >> lowercase uppercase to string! #{C3A0} == "à" >> uppercase to string! #{C48D} == "\u010c" | |
>> lowercase uppercase to string! #{C48D} == "\u010d" | |
Though don't know what the above will look like in AltME under Windows or Mac | |
Pekr 4-Aug-2009 [16489] | looks good - R with comma upon it ... |
PeterWood 4-Aug-2009 [16490x2] | There seem to be some problems with other alphabets though: >> uppercase to string! #{E382A1} == "\u30a1" \u30A1 is a small katakana letter A. The Unicode code point for a capital katakana A is \u30A2 |
Pekr - it is actually an a with a grave accent over it in UTF-8 | |
Gabriele 5-Aug-2009 [16492x2] | hmm, should uppercase and lowercase really work with katakana and hiragana? the "small" versions have a completely different meaning and usage than our "lowercase" has. |
it seems to me, that uppercase and lowercase should not modify kana... but I haven't read what the unicode standard mandates here. | |
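One way to check what the Unicode data says, outside REBOL: a minimal Python sketch, assuming Python's built-in `str.upper` and `str.lower` follow the Unicode case mappings for these characters. It decodes the same UTF-8 bytes quoted above; the helper name `case_info` is made up for illustration.

```python
import unicodedata

def case_info(utf8_hex):
    """Decode UTF-8 bytes given as hex and report the Unicode case mappings."""
    ch = bytes.fromhex(utf8_hex).decode("utf-8")
    return {
        "char": ch,
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # Ll/Lu = cased letter, Lo = caseless letter
        "upper": ch.upper(),
        "lower": ch.lower(),
    }

# The three byte sequences used in the console examples above.
for sample in ("C3A0", "C48D", "E382A1"):
    print(case_info(sample))
```

In this data the small katakana A has general category Lo and no uppercase or lowercase counterpart, so leaving it unchanged, as the `uppercase` call quoted above did, agrees with the Unicode mappings.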
PeterWood 5-Aug-2009 [16494] | No doubt you are right. I haven't read the unicode standard and know nothing about "non-Latin" alphabets. |
Pekr 5-Aug-2009 [16495] | BrianH: 'and, 'or, 'xor are allowed logical operations upon typesets. Do you think it would be useful to allow also 'intersect and 'union, to allow creation of combinations? |
BrianH 5-Aug-2009 [16496] | Already supported, Pekr :) |
Pekr 5-Aug-2009 [16497] | ok then. Carl started to write down the A77 changelog, so it means we might be close to the first release .... or do you think he will merge some other tickets? |
BrianH 5-Aug-2009 [16498x2] | We'll see. Part of the reason for the release is to sync his work with the rest of the community. So, soon after a77 I can merge my fixes of those 15 CureCode tickets I wrote about modules yesterday, as well as other mezzanine fixes. |
I think it's mostly a sign that we are reaching the end of this tunnel on plugins :) | |
Pekr 5-Aug-2009 [16500] | no, we are starting to see the light at the end of the tunnel :-) |
BrianH 5-Aug-2009 [16501x3] | We saw the light last week :) |
Hey, I just this moment figured out hot to do JIT-compiled native functions for R3 using Carl's new plugin model... | |
hot -> how | |
Pekr 5-Aug-2009 [16504] | :-) cool, isn't it? |
BrianH 5-Aug-2009 [16505] | If I can get that method to work, that would allow me to work on the compiled REBOL dialect that I have been waiting on user-defined function types for. This is good news! It moves forward my schedule by months :) |
Pekr 5-Aug-2009 [16506] | so you don't need u-types anymore? |
BrianH 5-Aug-2009 [16507x3] | By the way, the method would not work with R2's library wrapping model - it requires the command! model. |
I would need u-types for integration with .NET and other systems, but not for my compiled functions idea, as long as the compiled functions use the same frame-based marshalling interface that the plugin model uses. | |
Since the plugins only export 3 functions from their library, and dispatch calls to commands from a single function, I could add new commands at runtime as long as that function has some way to make sense of their indexes. Then I could make a plugin that wraps a JIT compilation library like libjit or libtcc. | |
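The single-dispatch idea can be sketched roughly as follows. This is Python, not the R3 plugin API: `Plugin`, `add_command` and `call` are hypothetical names, and real commands would receive a marshalled frame rather than a Python list.

```python
class Plugin:
    """Toy model: one dispatch entry point, commands identified by index."""

    def __init__(self):
        self._commands = []  # position in this list is the command index

    def add_command(self, func):
        """Register a command at runtime and return its index."""
        self._commands.append(func)
        return len(self._commands) - 1

    def call(self, index, frame):
        """Single dispatch function: look up the command by index and run it."""
        if not 0 <= index < len(self._commands):
            raise IndexError(f"unknown command index {index}")
        return self._commands[index](frame)


plugin = Plugin()
add_idx = plugin.add_command(lambda frame: frame[0] + frame[1])
# A JIT wrapper could register freshly compiled code the same way, later on.
mul_idx = plugin.add_command(lambda frame: frame[0] * frame[1])

print(plugin.call(add_idx, [2, 3]))
print(plugin.call(mul_idx, [2, 3]))
```

The point of the sketch is only that the host needs nothing beyond the index and the frame, so new commands do not require new exported symbols.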
Reichart 5-Aug-2009 [16510] | http://tlt.its.psu.edu/suggestions/international/bylanguage/japanese.html To elaborate on what Gabriele said, in most languages, there is a large and small version of letters for use usually in sentence case, and also for abbreviations, etc. Over time these began to be written differently, so the large and small actually look different. But in Japanese, small letters have a completely separate meaning, sometimes used to elongate a sound, or form a subtle guttural stop. Here is a sample, it is VERY subtle. http://christopherfield.com/translation/images/hashiriame/story_a.gif In this image look for all the symbols that look like a backwards letter "C" (or letter "U" that fell to the left). Sentence 1 - 3rd from the right. Sentence 6 - 3rd from the left. Notice they are very subtly different sizes. That is an example. Bottom line, as stated, don't mess with caps with Japanese. (it was hard to find a GOOD example of this in the same image). |
BrianH 5-Aug-2009 [16511] | This could be an advantage - there are many languages that support capitalization as a concept, but many that don't. The ones that don't have more characters than the ones that do (I'm guessing). This means that we could use smaller tables/code to do the capitalization in LOWERCASE and UPPERCASE - valuable space savings for a tiny language like REBOL. |
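To put a rough number on that guess, here is a Python sketch that builds a sparse uppercase table from whatever Unicode version the Python runtime ships. `build_upper_table` and `upper_char` are made-up names, and only single-code-point mappings are kept, so special cases such as ß (which uppercases to two letters) are skipped.

```python
import sys

def build_upper_table():
    """Map code point -> uppercase code point, only where the two differ."""
    table = {}
    for cp in range(sys.maxunicode + 1):
        up = chr(cp).upper()
        if len(up) == 1 and ord(up) != cp:
            table[cp] = ord(up)
    return table

UPPER = build_upper_table()

def upper_char(ch):
    """Uppercase one character via the sparse table; caseless ones pass through."""
    return chr(UPPER.get(ord(ch), ord(ch)))

print(len(UPPER), "entries out of", sys.maxunicode + 1, "code points")
print(upper_char("\u00e0"), upper_char("\u30a1"))
```

Only a small fraction of the code space ends up in the table, and caseless scripts such as kana contribute no entries at all.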
Louis 6-Aug-2009 [16512x2] | Is the a replacement for read/lines? |
the = there | |
Louis 7-Aug-2009 [16514] | I will reword my question. In R3 how can I read in one line of unicode text at a time to process it? |
Henrik 7-Aug-2009 [16515] | I think that is not yet implemented. Just ask Pekr. He has talked a lot about it. :-) |
Pekr 7-Aug-2009 [16516x5] | ah, damned, read/lines. What a crap :-) |
Well, my only objection was, that I did not agree to read/text, but was suggesting more general read/as data 'decoder .... | |
But BrianH is patient enough to explain to me that 'text operations are pretty common, and that they might deserve special treatment. | |
What I don't like about REBOL is all those read-text, send-service, open-service and other tonnes of mezzanines. But I think that actually I might reconsider my pov, and maybe I would prefer read-text or read-csv, which could incorporate tonnes of possible refinements, instead of giving 'read a special /text refinement .... 'read is too low level in R3 .... | |
IIRC there was also a problem with my proposed approach: currently decoders can't stream (and it really sucks), so we could get double memory consumption - first reading the text, then decoding it. That is imo why BrianH proposes read/text, to handle it at a low level. But - I don't like it when architecture flaws are fixed by such a workaround. Please give me streamed codecs and streamed parse instead ;-) | |
Sunanda 7-Aug-2009 [16521] | read/lines in R3.....Best we have in R3 so far is DELINE/LINES. Eg from the R3 console: deline/lines to-string read %user-db.r |
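The `deline/lines` workaround reads the whole file before decoding and splitting it. What a streamed decoder would buy can be sketched in Python (not REBOL); `iter_lines` is an illustrative name, and the byte chunks stand in for data arriving from a port.

```python
import codecs

def iter_lines(chunks, encoding="utf-8"):
    """Yield decoded lines from an iterable of byte chunks, one at a time."""
    decoder = codecs.getincrementaldecoder(encoding)()
    pending = ""
    for chunk in chunks:
        # The decoder buffers an incomplete multi-byte sequence until the next chunk.
        pending += decoder.decode(chunk)
        *complete, pending = pending.split("\n")
        for line in complete:
            yield line.rstrip("\r")
    # Flush the decoder and emit a last line that has no trailing newline.
    pending += decoder.decode(b"", final=True)
    if pending:
        yield pending.rstrip("\r")

# The two bytes of "à" (C3 A0) are deliberately split across two chunks.
chunks = [b"first\r\nsec", b"ond \xc3", b"\xa0\nlast"]
print(list(iter_lines(chunks)))
```

Only the current partial line is held in memory, which is the saving the streaming discussion above is after. The sketch handles LF and CRLF endings only, not bare CR.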
BrianH 7-Aug-2009 [16522x3] | READ/text wasn't my proposal, it was Carl's. I often write the CureCode tickets for other people's requests, if they are good ones. |
Pekr, expect the number of visible mezzanines to go down after the module system is fixed. The code is written already, but we are waiting for the plugin-related mezzanine changes before the overall module system changes can be merged in. | |
Louis, there may be a solution to your problem that involves direct port access, rather than a READ refinement... | |
Louis 7-Aug-2009 [16525] | Thanks for the feedback, everybody. Brian, I'll check into direct port access. |
Graham 8-Aug-2009 [16526x2] | Request ... I would like now/time to always return the seconds. |
Edge conditions every exact minute are annoying... | |
Sunanda 8-Aug-2009 [16528] | Annoying isn't it? Have you submitted a wish to curecode.org? For now, I use something like this: reduce [x: now/time x/hour x/minute x/second] ==[12:30 12 30 0] |
Graham 8-Aug-2009 [16529] | no, I thought I'd solicit opinions first! |
Sunanda 8-Aug-2009 [16530] | I think it's a good idea! |
Graham 8-Aug-2009 [16531] | I just spent a few hours trying to debug someone else's code ... and this was the cause. |
Henrik 8-Aug-2009 [16532x2] | I agree. Good idea. |
however one can use: to-itime 11:2 == "11:02:00" | |
Pekr 8-Aug-2009 [16534] | to-integer now-time |
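The edge case comes from the seconds being dropped when they are zero, so code that splits the formed time on `:` gets two fields instead of three. A defensive parser can be sketched in Python (not REBOL); `parse_time` and `format_time` are hypothetical helpers, and the sketch assumes the text looks like `h:mm` or `h:mm:ss`, optionally with fractional seconds.

```python
def parse_time(text):
    """Parse 'h:mm' or 'h:mm:ss[.fff]' into (hours, minutes, seconds)."""
    parts = text.strip().split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"not a time: {text!r}")
    hours, minutes = int(parts[0]), int(parts[1])
    # Seconds are missing when they were zero, so default them.
    seconds = float(parts[2]) if len(parts) == 3 else 0.0
    return hours, minutes, seconds

def format_time(hours, minutes, seconds):
    """Always emit the seconds field, even when it is zero (fractions are truncated)."""
    return f"{hours:02d}:{minutes:02d}:{int(seconds):02d}"

print(format_time(*parse_time("12:30")))
print(format_time(*parse_time("11:02:07")))
```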