World: r3wp

[Red] Red language group

Mchean 4-Oct-2011 [3519]	has the Red google group moved somewhere else, don't see any activity
Andreas 4-Oct-2011 [3520]	It's still at http://groups.google.com/group/red-lang
Mchean 4-Oct-2011 [3521x2]	just quiet at the moment
Mchean 4-Oct-2011 [3521x2]	?
Andreas 4-Oct-2011 [3523x2]	Yes.
Andreas 4-Oct-2011 [3523x2]	Also: https://twitter.com/#!/red_lang/status/118396786737033216
Mchean 4-Oct-2011 [3525]	thanks
Kaj 9-Oct-2011 [3526x4]	Implemented horizontal and vertical box layouts in the GTK binding
	Added a widgets overview to the examples
	Here's the current one:
	gtk-view window [ gtk-position-center "Widgets Overview" icon "Red-48x48.png" vbox [ label "Vertical box" fixed [ label "Fixed layout" 5 25 button [50 25 "Quit" :gtk-quit] ] hbox [ label "Horizontal box" button ["Fill"] yes button "Expand" button ["Fixed"] no ] vbox [ label "Vertical box" button ["Fill"] yes button "Expand" button ["Fixed"] no ] yes ] ]
Dockimbel 11-Oct-2011 [3530]	Works fine on Win7. What are the yes/no keywords for?
Kaj 11-Oct-2011 [3531x3]	I'm about to define names for them. :-) They were the most practical way to construct a dialect that results in proper settings for filling or fixating a box cell
	Did you resize the window? Then it becomes clear how they work
	Not many floats are used in GTK, but I need them for layout alignment
Dockimbel 11-Oct-2011 [3534]	Ok, I see now what they are used for. :-) Are the extra brackets around some button titles a special convention you're using?
Pekr 11-Oct-2011 [3535x2]	Hmm, no floats in Red/System will have to come anyway, no? :-)
Pekr 11-Oct-2011 [3535x2]	eh, minus "no" in above sentence :-)
Kaj 11-Oct-2011 [3537x2]	Normally a button needs more than one parameter, so it would always have brackets. But here they're only used as examples, so they only have a display text and the brackets can be left out
Kaj 11-Oct-2011 [3537x2]	I left them in for a while to make the separation with the optionally following layout parameters clearer, but in the latest version I reconsidered
Dockimbel 11-Oct-2011 [3539x2]	Does anyone know where to find exhaustive lists of invalid UTF-8 encoding ranges?
Dockimbel 11-Oct-2011 [3539x2]	I am calculating them by hand, so I might miss some.
Andreas 11-Oct-2011 [3541x3]	C0, C1, F5-FF must never occur in UTF-8.
	80-BF are continuation bytes.
	Is that what you are after?
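
The byte classes Andreas lists can be sketched as a small lookup. This is an illustrative Python translation rather than code from the discussion; the function name `classify_utf8_byte` and the labels it returns are assumptions made for the example.

```python
def classify_utf8_byte(b: int) -> str:
    """Classify one byte value (0-255) by the role it can play in UTF-8."""
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte value")
    if b <= 0x7F:
        return "ascii"         # 00-7F: one-byte codepoint
    if b <= 0xBF:
        return "continuation"  # 80-BF: continuation byte
    if b in (0xC0, 0xC1):
        return "invalid"       # could only start an overlong two-byte form
    if b <= 0xDF:
        return "lead2"         # C2-DF: starts a two-byte sequence
    if b <= 0xEF:
        return "lead3"         # E0-EF: starts a three-byte sequence
    if b <= 0xF4:
        return "lead4"         # F0-F4: starts a four-byte sequence
    return "invalid"           # F5-FF: beyond U+10FFFF
```

Classifying single bytes is not enough on its own: the second byte after E0, ED, F0 and F4 is further restricted, which is what the rest of the thread works out.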
Dockimbel 11-Oct-2011 [3544]	Yes, but I was searching for an exhaustive list of rules.
Andreas 11-Oct-2011 [3545x2]	RFC3629 has a (non-normative) ABNF, if I remember correctly.
Andreas 11-Oct-2011 [3545x2]	http://tools.ietf.org/html/rfc3629#section-4
Dockimbel 11-Oct-2011 [3547x3]	Here are the parse rules I came up with so far: https://gist.github.com/1278718
	I think I am missing some overlong combinations.
	I am also unsure of the valid range of the 2nd byte in the four-bytes encoding.
Andreas 11-Oct-2011 [3550]	one-byte-codepoint: charset [#"^(00)" - #"^(7F)"]
Dockimbel 11-Oct-2011 [3551]	Right, fixing that.
Andreas 11-Oct-2011 [3552x4]	tail-bytes: charset [#"^(80)" - #"^(BF)"]
	two-byte-codepoint: reduce [charset [#"^(C2)" - #"^(DF)"] tail-bytes]
	tail-bytes == cont-byte
	three-byte-codepoint: reduce [ #"^(E0)" charset [#"^(A0)" - #"^(BF)"] cont-byte | charset [#"^(E1)" - #"^(EC)"] 2 cont-byte | #"^(ED)" charset [#"^(80)" - #"^(9F)"] cont-byte | charset [#"^(EE)" - #"^(EF)"] 2 cont-byte ]
	four-byte-codepoint: reduce [ #"^(F0)" charset [#"^(90)" - #"^(BF)"] 2 cont-byte | charset [#"^(F1)" - #"^(F3)"] 3 cont-byte | #"^(F4)" charset [#"^(80)" - #"^(8F)"] 2 cont-byte ]
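
For readers who want to try the rules above outside REBOL, here is a minimal sketch of the same byte ranges as a table-driven validator. It is a Python translation for illustration only, not the code from the gist; `_RULES` and `is_valid_utf8` are hypothetical names. Each row gives the lead-byte range, the allowed range of the second byte, and how many plain continuation bytes (80-BF) follow.

```python
# (lead byte range, second byte range, extra continuation bytes after the second)
_RULES = [
    ((0xC2, 0xDF), (0x80, 0xBF), 0),  # two-byte forms
    ((0xE0, 0xE0), (0xA0, 0xBF), 1),  # E0: second byte A0-BF excludes overlongs
    ((0xE1, 0xEC), (0x80, 0xBF), 1),
    ((0xED, 0xED), (0x80, 0x9F), 1),  # ED: second byte 80-9F excludes surrogates
    ((0xEE, 0xEF), (0x80, 0xBF), 1),
    ((0xF0, 0xF0), (0x90, 0xBF), 2),  # F0: second byte 90-BF excludes overlongs
    ((0xF1, 0xF3), (0x80, 0xBF), 2),
    ((0xF4, 0xF4), (0x80, 0x8F), 2),  # F4: second byte 80-8F caps at U+10FFFF
]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data consists only of well-formed UTF-8 sequences."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b <= 0x7F:  # one-byte codepoint
            i += 1
            continue
        for (lo, hi), (lo2, hi2), extra in _RULES:
            if lo <= b <= hi:
                break
        else:
            return False  # 80-C1 or F5-FF can never start a sequence
        end = i + 2 + extra
        if end > n:
            return False  # sequence is truncated
        if not lo2 <= data[i + 1] <= hi2:
            return False  # second byte outside the range allowed for this lead
        if any(not 0x80 <= c <= 0xBF for c in data[i + 2:end]):
            return False  # remaining bytes must be continuation bytes
        i = end
    return True
```

The table answers the question about the second byte of the four-byte encoding directly: it is 90-BF after F0, 80-BF after F1-F3, and 80-8F after F4.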
Dockimbel 11-Oct-2011 [3556x2]	Thanks, I see that everything I need is in http://tools.ietf.org/html/rfc3629#section-4
Dockimbel 11-Oct-2011 [3556x2]	BrianH: what was the CureCode ticket where you've summed up the word! Unicode parsing rules?
BrianH 11-Oct-2011 [3558x3]	http://issue.cc/r3/1302 for the ASCII range in R3. The R3 parser tends to be excessively forgiving outside the ASCII range, accepting too much, though I haven't done the thorough test.
	You might also consider looking at the source of INVALID-UTF? in R2, which is MIT licensed from R2/Forward.
	It would still be a good idea to review the Unicode standard to determine which of the characters should be treated as spaces, but that would still be a problem for R3 because all of the delimiters it currently supports are one byte in UTF-8 for efficiency. If other delimiters are supported, R3's parser will be much slower.
Dockimbel 12-Oct-2011 [3561]	Thanks. For whitespaces, I have already taken higher Unicode codepoints into account (from this list: http://en.wikipedia.org/wiki/Whitespace_character).
Andreas 12-Oct-2011 [3562x2]	Completely forgot about INVALID-UTF? :)
Andreas 12-Oct-2011 [3562x2]	After having a quick glance at it, at least for utf8 it's quite basic and does not take any of the above overlong combinations into account.
BrianH 12-Oct-2011 [3564x4]	The policy on overlong combinations was set by R3, where there isn't as much need to flag them. Overlong combinations are a problem in UTF-8 for code that works on the binary encoding directly, instead of translating to Unicode first. The only function in R3 that operates that way is TRANSCODE, so as long as it doesn't choke on overlong combinations there is no problem with them being allowed. It might be good to add a /strict option to INVALID-UTF? though to make it check for them.
	Speaking of which, I don't think anyone has tried overlong combinations with TRANSCODE yet. We should look into that.
	(I mean, aside from Carl possibly doing so internally.)
	As long as they are interpreted exactly the same as the short encoding of the value, no problems.
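
The point about overlong combinations can be made concrete with a small sketch. It is written in Python for illustration and says nothing about how TRANSCODE or INVALID-UTF? actually behave; `lenient_decode_2byte` is a hypothetical helper that decodes a two-byte form without rejecting overlong encodings.

```python
def lenient_decode_2byte(seq: bytes) -> int:
    """Decode a two-byte UTF-8 form without checking for overlong encodings."""
    lead, cont = seq  # raises ValueError unless seq has exactly two bytes
    if lead & 0xE0 != 0xC0 or cont & 0xC0 != 0x80:
        raise ValueError("not a two-byte UTF-8 form")
    return ((lead & 0x1F) << 6) | (cont & 0x3F)


# C0 AF is an overlong encoding of "/" (U+002F), whose short form is the byte 2F.
overlong_slash = bytes([0xC0, 0xAF])

# A lenient decoder yields the same codepoint as the short encoding.
same_codepoint = lenient_decode_2byte(overlong_slash) == ord("/")

# Code that scans the raw bytes never sees the "/" byte at all.
found_in_bytes = b"/" in overlong_slash

# A strict decoder refuses the sequence outright.
try:
    overlong_slash.decode("utf-8")
    strict_accepts = True
except UnicodeDecodeError:
    strict_accepts = False
```

Here `same_codepoint` is true while `found_in_bytes` is false: a delimiter check done on the binary encoding would miss a character that decoding to Unicode first would reveal, which is the risk described above.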
Andreas 12-Oct-2011 [3568]	(Let's switch to !REBOL3.)