World: r3wp
[Script Library] REBOL.org: Script library and Mailing list archive
Anton 15-Mar-2009 [780x5] | The view-script.r html source for the page correctly advertises the encoding as utf-8, so the browser shows it correctly. |
So I'm pretty happy with the way that script was handled by the software here. | |
Except for R2 console, of course. | |
R3 console seems to handle it better. | |
Any other scripts you can find showing problems ? | |
Sunanda 16-Mar-2009 [785x2] | Thanks Gabriele -- that's a clear explanation, and has helped me work out what is going on. Anton and Gabriele -- I have tried changing the charset we emit on the download to say UTF-8. But that makes little difference. As both of you note, once the file has been saved then (without a MAC-type resource fork) there is no obvious indication of the encoding. And several editors I have tried get it wrong -- thus "revealing" the extra ASCII chars. Not sure what the solution is other than to de-UTF-8 files on download. |
Anton -- not yet run a crawl to check for other scripts with high ascii chars. | |
Anton 16-Mar-2009 [787x3] | Which editors? I think most editors these days allow manually changing the encoding, so developers who notice strange characters can just change it themselves. Maybe it would be helpful to add a rebol.org library script header advertising the encoding (when it is known, and when not). I don't recommend 'de-UTF-8'ing files on download - that's just going to confuse things more, especially when the file is view-script.r'd as utf-8 just beforehand. |
It seems the responsibility lies with the clients to interpret encodings properly. As we move to a unicode world, software that assumes every 8-bit file is in some extended-ASCII encoding should drop off. But until the transition is complete, there's not much we can do about client software guessing wrong like that, except stating the encoding in the script header, in the web page that provides the download link, and by helping confused newbies. | |
Are rebol.org uploaders asked to declare the encoding used? | |
swall 16-Mar-2009 [790] | If the offending downloaded script is executed in Rebol/Core, the extra ASCII chars are also present in the executed code. The script defines ½ to be 0.5. If "help ½" is typed into the console, the result is "Found these words: ½ decimal! 0.5". However, if the script is executed in Rebol/View, the result is "½ is a decimal of value: 0.5". It seems that View handles it correctly, while Core doesn't. |
Sunanda 16-Mar-2009 [791] | Thanks guys. Other scripts with the same problem.....there are a couple. About 10% of all scripts have at least one extended ASCII char....But most of them are acceptable in LATIN-1 code page / charset (eg copyright symbol, some accented letters). It's just a very few scripts that use 1/4 and similar symbols that cause the problem. What other editors? Windows NOTEPAD is one example of a common one that gets this wrong. |
swall 16-Mar-2009 [792] | Vim and Editor² display the chars incorrectly. Notepad++ shows the chars correctly. |
Sunanda 16-Mar-2009 [793] | Of the various editors / word processors I have immediately to hand:
-- credit.exe -- [my usual editor] shows incorrect chars, and has no option to switch to UTF-8
-- open office writer -- works fine if you take the UTF-8 option when asked
-- ms word -- claims file is corrupt
-- word perfect -- makes a complete mess
-- R2/View's built in editor ( editor %/c/path to my local copy//ascii-math.r) -- shows incorrect chars |
Anton 17-Mar-2009 [794x2] | Vim supports unicode and on my system shows the characters correctly. |
Ok, so there are some editors which don't support unicode, don't guess encoding correctly, or can change encoding only with difficulty. How about this suggestion; if a rebol.org script is known to be UTF-8, then an additional link should appear: [Download as ASCII] download-a-script?script-name=ascii-math.r&encoded-as=8-bit-ascii which transcodes a UTF-8 file to ASCII. Just have to get a conversion function in place for this to work. | |
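[Editor's sketch] A minimal sketch of the kind of conversion function Anton mentions, assuming that "ASCII" here really means an 8-bit Latin-1 (ISO 8859-1) target and that the script is held in a REBOL 2 byte string. The name utf8-to-latin1 is hypothetical and is not part of the rebol.org software. Only two-byte sequences for U+0080 to U+00FF are converted; every other non-ASCII byte is replaced by "?".

```rebol
utf8-to-latin1: func [
    "Sketch: turn 2-byte UTF-8 sequences for U+0080..U+00FF into single Latin-1 bytes."
    data [string!]
    /local out ascii cont lead run b1 b2 code
][
    ascii: charset [#"^(00)" - #"^(7F)"]   ; plain 7-bit characters
    cont:  charset [#"^(80)" - #"^(BF)"]   ; UTF-8 continuation bytes
    lead:  charset [#"^(C2)" #"^(C3)"]     ; lead bytes that map into Latin-1
    out: make string! length? data
    parse/all/case data [
        any [
            copy run some ascii (append out run)
            | copy b1 lead copy b2 cont (
                ; C2 xx -> xx, C3 xx -> xx + 64
                code: (64 * ((to integer! first b1) - 194)) + to integer! first b2
                append out to char! code
            )
            | skip (append out #"?")       ; byte with no Latin-1 equivalent handled here
        ]
    ]
    out
]

; "½" is the bytes C2 BD in UTF-8 and the single byte BD in Latin-1
print "^(BD)" = utf8-to-latin1 "^(C2)^(BD)"
```

On Windows the host code page is often Windows-1252 rather than strict ISO 8859-1, so the target encoding is itself an assumption to check before relying on a conversion like this.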
Gabriele 17-Mar-2009 [796x2] | Sunanda: given that R2 uses the host current code page, I think the best way would be for the user to convert the script after downloading it. On Linux or Mac, for example, UTF-8 is perfect for Core scripts as the terminal is UTF-8. On Windows or for View scripts, you'll get the host code page displayed anyway, so the user has to do the conversion. A tool to do that automatically would be nice (I have the code, it will be released soon, but you may need to wait a couple of weeks more). |
All these troubles go away with R3... but I think it would be nice if R2 recognized UTF-8 and converted it on the fly; we could add a BOM at the beginning to make that easier. | |
Chris 17-Mar-2009 [798] | http://en.wikipedia.org/wiki/Byte_Order_Mark#cite_note-0 |
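[Editor's sketch] For reference, the UTF-8 byte order mark discussed here is the three bytes EF BB BF at the very start of a file. A rough sketch of how a REBOL 2 script could test for it and remove it follows; the names utf-8-bom, has-bom? and strip-bom are hypothetical helpers, not existing REBOL or rebol.org functions.

```rebol
utf-8-bom: "^(EF)^(BB)^(BF)"   ; the UTF-8 byte order mark, bytes EF BB BF

has-bom?: func [
    "True if DATA starts with the UTF-8 BOM."
    data [string!]
][
    found? find/match/case data utf-8-bom
]

strip-bom: func [
    "Returns DATA positioned just after a leading BOM, or DATA itself if there is none."
    data [string!]
][
    any [find/match/case data utf-8-bom  data]
]
```

strip-bom does not copy or modify the string; it only returns a position within it.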
swall 17-Mar-2009 [799x2] | Anton: you're right Vim does display the file correctly, although not by default. I guess it helps when you read the manual. :-) |
Gabriele: Where is the host code page set? On Windows, is it set differently for View and Core? Is that why the downloaded script works as expected in View but not in Core? | |
Anton 17-Mar-2009 [801x2] | Yes, use of BOM has its own troubles. I don't think it's a good idea. |
swall, yes, strange, I can't remember configuring vim for utf-8 (I don't use it regularly), but it displayed correctly straight away for me. Must be some dark config option or something... | |
Sunanda 17-Mar-2009 [803] | Thanks everyone. I think our first step is to add a warning to any download for scripts that contain UTF-8 chars. So, for that I need a function:
    utf-8?: func [data [string!]] [...] ; returns true or false [and perhaps "not sure" in ambiguous cases]
I've done the easy part :-) Can anyone help with the difficult "..." part? It is not as simple as just looking for chars above 127 .... some high ASCII is acceptable as part of, say, ISO 8859-1 |
PeterWood 17-Mar-2009 [804x2] | I have a function which finds utf-8 multi-byte character sequences in a string. Given the code ranges for multi-byte characters, it would be rare to find such a sequence accidentally. |
It's about 65 lines so rather than post it here I will email you a copy. | |
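[Editor's sketch] Peter's function was not posted in this thread. What follows is an independent, much shorter sketch of the same idea for REBOL 2 byte strings, not his code: it reports true only when the data contains at least one well-formed multi-byte sequence and no stray high bytes. It deliberately ignores finer points such as overlong forms and surrogate ranges, so treat it as an approximation.

```rebol
; Byte classes, following the UTF-8 bit patterns
ascii: charset [#"^(00)" - #"^(7F)"]
cont:  charset [#"^(80)" - #"^(BF)"]   ; continuation bytes 10xxxxxx
lead2: charset [#"^(C2)" - #"^(DF)"]   ; starts a 2-byte sequence
lead3: charset [#"^(E0)" - #"^(EF)"]   ; starts a 3-byte sequence
lead4: charset [#"^(F0)" - #"^(F4)"]   ; starts a 4-byte sequence

utf-8?: func [
    "True if DATA has at least one multi-byte UTF-8 sequence and no invalid bytes."
    data [string!]
    /local found
][
    found: false
    to logic! all [
        parse/all/case data [
            any [
                some ascii
                | lead2 cont   (found: true)
                | lead3 2 cont (found: true)
                | lead4 3 cont (found: true)
            ]
        ]
        found
    ]
]

print utf-8? "abc"           ; false: plain ASCII only
print utf-8? "^(C2)^(BD)"    ; true:  "½" encoded as UTF-8
print utf-8? "^(BD)"         ; false: "½" as a lone Latin-1 byte
```

A Latin-1 file only passes this check if its high bytes happen to line up as a lead byte followed by the right number of continuation bytes, which is the "rare accident" Peter describes.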
Sunanda 18-Mar-2009 [806] | Thanks. |
Anton 18-Mar-2009 [807] | Ok, so things seem to be proceeding well. The rebol.org Library's support for utf-8 was actually stronger than thought, and what're being added are functions to help deal with legacy client apps which misidentify the file encoding. |
PeterWood 18-Mar-2009 [808] | It's not just legacy client apps unless you consider all Rebol/View scripts as legacy apps. |
Anton 18-Mar-2009 [809x2] | Yes, I do. |
I understand what you mean, and obviously the definition of "legacy" is a bit fuzzy. | |
Sunanda 18-Mar-2009 [811] | Using Peter's code (thanks again!), I've made two changes to the download-a-script link:
1. If we find UTF-8 chars in a script, we download it with the HTTP content type charset=utf-8. But that probably makes no practical difference. A downloaded script will be saved by the browser, and then opened by a text editor. The text editor is unlikely to be passed the charset setting. So:
2. Scripts with UTF-8 encoding are downloaded with a few lines of comment at their top. The comment explains the possible problem.
Thanks to all for the comments and help with getting things this far. |
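[Editor's sketch] The second change could look roughly like the following. The function name add-utf-8-warning and the wording of the comment lines are invented for illustration; the actual text emitted by rebol.org is not quoted in this thread.

```rebol
add-utf-8-warning: func [
    "Sketch: put a short explanatory comment in front of a script's text."
    script [string!]
][
    rejoin [
        ";; Note: this script contains UTF-8 encoded characters." newline
        ";; If some characters look wrong, set your editor's encoding to UTF-8." newline
        newline
        script
    ]
]
```

Because the added lines are ordinary REBOL comments placed before the script header, they should not change how the script runs.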
Anton 19-Mar-2009 [812x2] | Sunanda, good job, I was hoping you'd do that, and you did. |
I'm interested in the utf-8 detection function. Can it be published? | |
Maxim 20-Mar-2009 [814x2] | sunanda: I have a feature proposal for you :-) it would be nice to be able to supply a single picture to link with the scripts. this image (jpg, png, gif) would have hefty size limitation and I think only one image per script should be enough, but having this alongside the various listings of the application and within searches, new scripts, etc would be really cool. sometimes, if you see a thumbnail (ui grab, console example, logo, output gfx, whatever), it will help raise people's curiosity. this could probably benefit quite a few scripts, which are possibly overlooked. having a simple search filter of scripts with pics, could also help people to quickly find useful things at a glance. what do you think? it could start out really simple, and slowly thumbnails could creep into various listings of scripts. |
maybe, we could eventually have more than one picture, like pics which are specifically tagged as gui screenshots, for example. | |
Sunanda 20-Mar-2009 [816] | Nice idea, thanks ..... Let me think about it. |
Maxim 20-Mar-2009 [817] | cool, let me know if you want to test it, I'll be happy to supply imgs for my scripts. |
Alan 22-Mar-2009 [818] | re: pictures. Some but not all scripts also have docs, so that might be a good place to add them. For those that don't, a clickable small thumbnail? |
Maxim 26-Mar-2009 [819] | you mean add the pictures at the header of docs? or allow us to use the pics within the docs? |
Sunanda 26-Mar-2009 [820x3] | Max, I assumed you meant have the pic on the main page for the script, eg for liquid.r you'd see a thumbnail here: http://www.rebol.org/view-script.r?script=liquid.r |
That's a nice idea, though there are some technical CSS issues......For example, the actual script is displayed in a <pre> block. That means images may not float where you'd expect them. It'll take some experimentation to find the best way to do it. | |
....Maybe a better slot for a thumbnail would be in the LHS menu, just under the <Script Library Home> link. That would keep it out of the flow of the page. Please suggest better ideas :-) | |
mhinson 14-Apr-2009 [823] | Hi, I am very new to Rebol so apologies if my questions are very simple. I have been trying to use functions & examples from the library by pasting them into the REBOL/View console. When I do this I find most of them produce errors or lock up the console so I have to restart it. What am I doing wrong please? Is there some trick to this that is so obvious that no one has mentioned it? Thanks, |
Pekr 14-Apr-2009 [824] | give me an example of such script. The thing might be, that some scripts are already dated, but most of them should work ... |
sqlab 14-Apr-2009 [825x2] | There are two file versions in the library, one for viewing, one for downloading. Did you use the one from http://www.rebol.org/download-a-script.r?script-name=....r Maybe the other ones have problems. |
Mike I checked your library example from the I'm new group producing errors. There is probably a weakness, as the script does not regard comment lines. A short enhancement would be

parse-ini-file: func [
    file-name [file!]
    /local ini-block current-section parsed-line section-name
][
    ini-block: copy []
    current-section: copy []
    foreach ini-line read/lines file-name [
        if #";" <> first ini-line [ ; do not process comment lines
            section-name: ini-line
            error? try [section-name: first load/all ini-line]
            either any [
                error? try [block? section-name]
                not block? section-name
            ][
                parsed-line: parse/all ini-line "="
                append last current-section parsed-line/1
                append last current-section parsed-line/2
            ][
                append ini-block current-section
                current-section: copy []
                append current-section form section-name
                append/only current-section copy []
            ] ;; either
        ]
    ] ;; for
    append ini-block current-section
    return to-hash ini-block
] | |
Sunanda 15-Apr-2009 [827] | Thanks, sqlab......That works fine. (I should read the documentation in future before writing the script :) I've updated the script in the Library: http://www.rebol.org/view-script.r?script=parse-ini.r |
mhinson 15-Apr-2009 [828] | Thanks for your attention to my questions. It seems I stumbled across the need for support of comments in ini files. I was also trying to cut & paste from the viewing version of some scripts, rather than the download version as I did not realise there was a distinction. It is also possible that my cut & pastes were not complete perhaps, as the scripts that would not run like that before seem to work ok now. I notice that a lot of things show anomalous behaviour when used by inexperienced users who lack confidence. It is like they know who they can play tricks on, & who won't stand for it. |
Sunanda 15-Apr-2009 [829] | Cut'n'paste ought to work. Though the download format is safer as it has been less processed (no escaping of embedded HTML codes etc). I can sympathise -- I generally notice _any_ new device I buy (from computers to wrist watches) won't work for the first half hour or so. But as soon as I start to think that either it's defective, or I am very stupid, then it starts to play fair. Animism is still a good first attempt at understanding the universe :-) |