World: r3wp
[Red] Red language group
Mchean 4-Oct-2011 [3521x2] | just quiet at the moment |
? | |
Andreas 4-Oct-2011 [3523x2] | Yes. |
Also: https://twitter.com/#!/red_lang/status/118396786737033216 | |
Mchean 4-Oct-2011 [3525] | thanks |
Kaj 9-Oct-2011 [3526x4] | Implemented horizontal and vertical box layouts in the GTK binding |
Added a widgets overview to the examples | |
Here's the current one: | |
gtk-view window [ gtk-position-center "Widgets Overview" icon "Red-48x48.png" vbox [ label "Vertical box" fixed [ label "Fixed layout" 5 25 button [50 25 "Quit" :gtk-quit] ] hbox [ label "Horizontal box" button ["Fill"] yes button "Expand" button ["Fixed"] no ] vbox [ label "Vertical box" button ["Fill"] yes button "Expand" button ["Fixed"] no ] yes ] ] | |
Dockimbel 11-Oct-2011 [3530] | Works fine on Win7. What are the yes/no keywords for? |
Kaj 11-Oct-2011 [3531x3] | I'm about to define names for them. :-) They were the most practical way to construct a dialect that results in proper settings for filling or fixing a box cell |
Did you resize the window? Then how they work becomes clear | |
Not many floats are used in GTK, but I need them for layout alignment | |
Dockimbel 11-Oct-2011 [3534] | Ok, I see now what they are used for. :-) Are the extra brackets around some button titles a special convention you're using? |
Pekr 11-Oct-2011 [3535x2] | Hmm, no floats in Red/System will have to come anyway, no? :-) |
eh, minus "no" in above sentence :-) | |
Kaj 11-Oct-2011 [3537x2] | Normally a button needs more than one parameter, so it would always have brackets. But here they're only used as examples, so they only have a display text and the brackets can be left out |
I left them in for a while to make the separation with the optionally following layout parameters clearer, but in the latest version I reconsidered | |
Dockimbel 11-Oct-2011 [3539x2] | Does anyone know where to find exhaustive lists of invalid UTF-8 encoding ranges? |
I am calculating them by hand, so I might miss some. | |
Andreas 11-Oct-2011 [3541x3] | C0, C1, F5-FF must never occur in UTF-8. |
80-BF are continuation bytes. | |
Is that what you are after? | |
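The byte classes Andreas lists can be sketched as a small classifier. This is an illustrative Python sketch, not code from the discussion; the function name `classify_utf8_byte` and its labels are hypothetical.

```python
def classify_utf8_byte(b: int) -> str:
    """Classify a single byte value (0-255) by its role in UTF-8."""
    if not 0 <= b <= 0xFF:
        raise ValueError("byte value out of range")
    if b <= 0x7F:
        return "ascii"          # 00-7F: one-byte code point
    if b <= 0xBF:
        return "continuation"   # 80-BF: only valid after a lead byte
    if b in (0xC0, 0xC1) or b >= 0xF5:
        return "invalid"        # C0, C1, F5-FF never occur in UTF-8
    if b <= 0xDF:
        return "lead-2"         # C2-DF: starts a two-byte sequence
    if b <= 0xEF:
        return "lead-3"         # E0-EF: starts a three-byte sequence
    return "lead-4"             # F0-F4: starts a four-byte sequence


for sample in (0x41, 0x80, 0xC0, 0xC2, 0xE0, 0xF4, 0xF5):
    print(f"{sample:02X}: {classify_utf8_byte(sample)}")
```

Classifying single bytes is not enough to validate a stream, because some lead bytes also restrict the range of the byte that follows them.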
Dockimbel 11-Oct-2011 [3544] | Yes, but I was searching for an exhaustive list of rules. |
Andreas 11-Oct-2011 [3545x2] | RFC3629 has a (non-normative) ABNF, if I remember correctly. |
http://tools.ietf.org/html/rfc3629#section-4 | |
Dockimbel 11-Oct-2011 [3547x3] | Here are the parse rules I came up with so far: https://gist.github.com/1278718 |
I think I am missing some overlong combinations. | |
I am also unsure of the valid range of the 2nd byte in the four-byte encoding. | |
Andreas 11-Oct-2011 [3550] | one-byte-codepoint: charset [#"^(00)" - #"^(7F)"] |
Dockimbel 11-Oct-2011 [3551] | Right, fixing that. |
Andreas 11-Oct-2011 [3552x4] | tail-bytes: charset [#"^(80)" - #"^(BF)"] two-byte-codepoint: reduce [charset [#"^(C2)" - #"^(DF)"] tail-bytes] |
tail-bytes == cont-byte | |
three-byte-codepoint: reduce [ #"^(E0)" charset [#"^(A0)" - #"^(BF)"] cont-byte | charset [#"^(E1)" - #"^(EC)"] 2 cont-byte | #"^(ED)" charset [#"^(80)" - #"^(9F)"] cont-byte | charset [#"^(EE)" - #"^(EF)"] 2 cont-byte ] | |
four-byte-codepoint: reduce [ #"^(F0)" charset [#"^(90)" - #"^(BF)"] 2 cont-byte | charset [#"^(F1)" - #"^(F3)"] 3 cont-byte | #"^(F4)" charset [#"^(80)" - #"^(8F)"] 2 cont-byte ] | |
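For readers who want to try the ranges outside REBOL, the same rules can be sketched in Python. This is a minimal sketch that follows the byte ranges quoted above (which match the RFC 3629 section 4 grammar), not Dockimbel's gist or Red's actual implementation; the function name `is_valid_utf8` is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 per the RFC 3629 byte ranges."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b <= 0x7F:                      # one-byte code point
            i += 1
            continue
        # Pick the allowed range of the second byte and the sequence length.
        if 0xC2 <= b <= 0xDF:
            lo, hi, length = 0x80, 0xBF, 2
        elif b == 0xE0:
            lo, hi, length = 0xA0, 0xBF, 3  # excludes overlong three-byte forms
        elif 0xE1 <= b <= 0xEC or 0xEE <= b <= 0xEF:
            lo, hi, length = 0x80, 0xBF, 3
        elif b == 0xED:
            lo, hi, length = 0x80, 0x9F, 3  # excludes surrogates D800-DFFF
        elif b == 0xF0:
            lo, hi, length = 0x90, 0xBF, 4  # excludes overlong four-byte forms
        elif 0xF1 <= b <= 0xF3:
            lo, hi, length = 0x80, 0xBF, 4
        elif b == 0xF4:
            lo, hi, length = 0x80, 0x8F, 4  # caps the range at U+10FFFF
        else:
            return False                    # stray 80-BF, or C0, C1, F5-FF
        if i + length > n:
            return False                    # sequence truncated at end of input
        if not lo <= data[i + 1] <= hi:
            return False
        # Any remaining bytes are plain continuation bytes.
        for j in range(i + 2, i + length):
            if not 0x80 <= data[j] <= 0xBF:
                return False
        i += length
    return True


print(is_valid_utf8("Red €".encode("utf-8")))  # well-formed input
print(is_valid_utf8(b"\xc0\xaf"))              # overlong encoding of "/"
```

Only the second byte needs a lead-dependent range; that single restriction is what rules out overlong forms, surrogates, and code points above U+10FFFF.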
Dockimbel 11-Oct-2011 [3556x2] | Thanks, I see that everything I need is in http://tools.ietf.org/html/rfc3629#section-4 |
BrianH: what was the CureCode ticket where you've summed up the word! Unicode parsing rules? | |
BrianH 11-Oct-2011 [3558x3] | http://issue.cc/r3/1302 for the ASCII range in R3. The R3 parser tends to be excessively forgiving outside the ASCII range, accepting too much, though I haven't done the thorough test. |
You might also consider looking at the source of INVALID-UTF? in R2, which is MIT licensed from R2/Forward. | |
It would still be a good idea to review the Unicode standard to determine which of the characters should be treated as spaces, but that would still be a problem for R3 because all of the delimiters it currently supports are one byte in UTF-8 for efficiency. If other delimiters are supported, R3's parser will be much slower. | |
Dockimbel 12-Oct-2011 [3561] | Thanks. For whitespaces, I have already taken higher Unicode codepoints into account (from this list: http://en.wikipedia.org/wiki/Whitespace_character). |
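The efficiency concern about delimiters can be made concrete by checking how many bytes a few whitespace characters need in UTF-8. This is an illustrative Python sketch; the selection of code points is an assumption for demonstration and is not the list either parser uses.

```python
# A few whitespace code points; illustrative, not exhaustive.
SAMPLES = {
    "CHARACTER TABULATION": "\u0009",
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00A0",
    "EM SPACE": "\u2003",
    "IDEOGRAPHIC SPACE": "\u3000",
}


def utf8_length(ch: str) -> int:
    """Number of bytes the single character ch occupies in UTF-8."""
    return len(ch.encode("utf-8"))


for name, ch in SAMPLES.items():
    print(f"U+{ord(ch):04X} {name}: {utf8_length(ch)} byte(s)")
```

ASCII delimiters can be matched by comparing one byte, while the higher code points require matching two- or three-byte sequences.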
Andreas 12-Oct-2011 [3562x2] | Completely forgot about INVALID-UTF? :) |
After having a quick glance at it, at least for utf8 it's quite basic and does not take any of the above overlong combinations into account. | |
BrianH 12-Oct-2011 [3564x4] | The policy on overlong combinations was set by R3, where there isn't as much need to flag them. Overlong combinations are a problem in UTF-8 for code that works on the binary encoding directly, instead of translating to Unicode first. The only function in R3 that operates that way is TRANSCODE, so as long as it doesn't choke on overlong combinations there is no problem with them being allowed. It might be good to add a /strict option to INVALID-UTF? though to make it check for them. |
Speaking of which, I don't think anyone has tried overlong combinations with TRANSCODE yet. We should look into that. | |
(I mean, aside from Carl possibly doing so internally.) | |
As long as they are interpreted exactly the same as the short encoding of the value, no problems. | |
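To illustrate why overlong combinations matter for code that works on the bytes directly, here is a small Python sketch. The helper `naive_decode_2byte` is hypothetical and deliberately skips the overlong check; it does not represent how TRANSCODE or INVALID-UTF? behave.

```python
def naive_decode_2byte(b0: int, b1: int) -> int:
    """Decode a two-byte sequence by bit-masking only, with no overlong check."""
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)


# C0 AF is an overlong two-byte encoding of "/" (U+002F).
overlong_slash = bytes([0xC0, 0xAF])

# A lenient decoder maps it to the same code point as the one-byte form.
decoded = naive_decode_2byte(*overlong_slash)
print(hex(decoded), chr(decoded))

# A byte-level scan for "/" does not see it.
print(b"/" in overlong_slash)

# Python's built-in strict decoder rejects the sequence outright.
try:
    overlong_slash.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
print(rejected)
```

The mismatch between the byte-level scan and the lenient decode is the problem case: a filter that inspects raw bytes can miss a character that a later decoding step will still produce.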
Andreas 12-Oct-2011 [3568] | (Let's switch to !REBOL3.) |
Kaj 13-Oct-2011 [3569x2] | Implemented GTK table layouts |
For example: | |