World: r4wp
[#Red] Red language group
Kaj 3-Jan-2013 [5045] | I'm doubtful about it, as it is also commonly used to denote bug numbers and such, which is now expensive in R3 |
Andreas 3-Jan-2013 [5046] | Hmm, why is that expensive now? |
Kaj 3-Jan-2013 [5047x2] | Each issue number adds a word to the word registry, that isn't garbage collected like strings are |
I would like to be proven wrong | |
Andreas 3-Jan-2013 [5049x2] | No that's true. But I wouldn't consider that particularly expensive (esp not with R3 effectively abandoning the word limit). |
For compiled Red, I think this would not matter at all (as strings are interned as well, afaik). | |
BrianH 3-Jan-2013 [5051x2] | As I mentioned in Rebol School (and elsewhere earlier), issues can be made to behave like strings to a certain extent even if they're words. To do that in compiled code you'd need to keep their spellings around though, unless you resolve all of those function calls statically (which you would be able to because issues would be immutable). |
Having issues be immutable and unique could lead to lower memory usage, Kaj. Sure, you wouldn't be able to garbage collect them, but additional copies of the same issue wouldn't add any additional memory. Plus, you can't necessarily GC strings either - only when you don't need to keep references to them anymore. It may depend on the app whether it's more efficient to have issues be strings or words. | |
Kaj 3-Jan-2013 [5053x2] | Symbols are structs in the Red runtime. If you have an app server running that handles issue!s, it will accumulate memory over time that you can't collect. It will be indistinguishable from a memory leak |
You could even use it for a DoS attack | |
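[Editor's sketch] The trade-off discussed above can be modelled with a toy interning table. This is a Python illustration only, not the Red or R3 runtime; the class name SymbolTable and its methods are hypothetical. It shows that repeated copies of one issue cost a single entry, while every distinct issue adds an entry that is never released.

```python
class SymbolTable:
    """Toy interning table: each distinct spelling gets one permanent ID."""

    def __init__(self):
        self._ids = {}        # spelling -> symbol ID
        self._spellings = []  # symbol ID -> spelling

    def intern(self, spelling):
        # Return the existing ID if the spelling is known, else add an entry.
        sym = self._ids.get(spelling)
        if sym is None:
            sym = len(self._spellings)
            self._ids[spelling] = sym
            self._spellings.append(spelling)
        return sym

    def __len__(self):
        return len(self._spellings)


table = SymbolTable()

# Repeated copies of the same issue share one entry (BrianH's point).
for _ in range(1000):
    table.intern("#1234")
size_after_duplicates = len(table)

# Distinct issues, such as bug numbers arriving at a server, each add an
# entry that this table never releases (Kaj's point).
for n in range(1000):
    table.intern(f"#{n}")
size_after_unique = len(table)

print(size_after_duplicates, size_after_unique)
```

Which effect dominates depends on the application: many repeats of few values favour interning, while a stream of unique values grows the table without bound.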
BrianH 3-Jan-2013 [5055] | Same with other word types, of course. |
Kaj 3-Jan-2013 [5056x4] | True, but that's not a good reason to increase the problem |
If you think about what you'd have to do to secure a server from memory overload, it would be reasonable to limit acceptable words to a certain dictionary, but it wouldn't be reasonable to limit issue! to a small range | |
All in all, I feel that the nature of issue! is not the same as the nature of word! | |
As programmers, we usually see forms such as #if and think it's a REBOL issue!, but that's not how it is used in common English | |
BrianH 3-Jan-2013 [5060] | They're not used at all in common English. You see them in Twitter-speak though. |
Arnold 3-Jan-2013 [5061] | Very sharp ;) |
Andreas 3-Jan-2013 [5062x2] | I think Kaj rather meant the name "issue", not the syntactical form (#...). |
(And I think we also had discussion about renaming the R3 type to e.g. keyword!.) | |
Kaj 3-Jan-2013 [5064x2] | Surely in English people write #1, #2 and such? |
Twitter certainly didn't invent them :-) | |
Maxim 3-Jan-2013 [5066x2] | Literally, it reads as 'number'; it music it reads as 'sharp'. Any other use isn't from proper English afaik. |
(in music) | |
Kaj 3-Jan-2013 [5068] | Yes, in Dutch, we write nr. like no. in other languages |
BrianH 3-Jan-2013 [5069] | In business correspondence it can mean number, in Twitter speak it's a hashtag, in music it means someone wrote a sharp with the wrong character. In English, it's a symbol that means pound (the weight, not the currency), but it's not common anymore. |
Andreas 3-Jan-2013 [5070] | In American English, that is :) |
BrianH 3-Jan-2013 [5071] | I think it only precedes a word when it means number, or is in Twitter-speak. |
Kaj 3-Jan-2013 [5072] | It doesn't really mean pound; English keyboards have a pound sign (the currency, which is the weight of silver) where # is on American keyboards |
BrianH 3-Jan-2013 [5073] | It means pound on American keyboards. Maybe they don't use the character for that in England. We just use lb here now. |
Sunanda 3-Jan-2013 [5074] | UK keyboards also have the "#" character. And it's unshifted, so it's more convenient than some other chars, such as "@" or "&", which are shifted on UK layouts. # is called hash over here. |
PeterWood 3-Jan-2013 [5075] | Kaj: "Surely in English people write #1, #2 and such?" Certainly not. An English person would never write that. An American would. |
DocKimbel 3-Jan-2013 [5076] | Kaj: I share your security concerns about an appserver, but I don't think that other words datatypes can really be more secure. As long as you can force the LOADing of arbitrary input strings (without even evaluating the code), you could use it to make the symbol table blow up the memory. |
Kaj 3-Jan-2013 [5077x3] | Peter, OK, but that's where issue! comes from |
Doc, my point is that one would be more likely to screen for limited word use than limited issue! use | |
Would it be possible to have a recycle feature for the symbols registry? | |
DocKimbel 3-Jan-2013 [5080x7] | Hardly; the symbol table's purpose is to provide a mapping between an integer value (the symbol ID) and a string representation. If we were to allow the removal of a symbol, we would need to: 1) be sure that the symbol is not used anywhere anymore (which would require the equivalent of a full GC collection pass) before removing it; 2) maintain a list of freed "slots" in the symbol table for re-use; 3) be able to trigger the symbols-GC at relevant points in time. Even with that, it would still be hard to counter a LOAD-based attack on the symbol table. |
"screen for limited word use": That would need to happen at the LOAD level... not very clean from a design POV. | |
(but doable) | |
GC collection pass => GC mark pass | |
Actually the best defense against such attacks is to never use LOAD on untrusted sources. | |
In the case where potentially harmful input needs to be LOADed, the input string needs to be validated before LOADing it with some good heuristics. I don't see any other way. | |
Kaj: you should also note that refinements already exhibit exactly the same behavior as issue-as-word! You can use digits only in refinements. | |
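[Editor's sketch] The three requirements DocKimbel lists for a collectable symbol table can be illustrated with a small Python model. It is a sketch under stated assumptions, not how Red implements its symbol table: the class CollectableSymbolTable is hypothetical, and the set of live symbol IDs is simply handed in by the caller, standing in for the full GC mark pass that a real runtime would have to perform.

```python
class CollectableSymbolTable:
    """Toy symbol table with removal, modelling the three requirements."""

    def __init__(self):
        self._ids = {}    # spelling -> symbol ID
        self._slots = []  # symbol ID -> spelling, or None if freed
        self._free = []   # requirement 2: freed slots available for re-use

    def intern(self, spelling):
        sym = self._ids.get(spelling)
        if sym is not None:
            return sym
        if self._free:
            sym = self._free.pop()      # re-use a freed slot first
            self._slots[sym] = spelling
        else:
            sym = len(self._slots)
            self._slots.append(spelling)
        self._ids[spelling] = sym
        return sym

    def collect(self, live_ids):
        """Requirement 3: an explicit trigger for the symbols-GC.

        live_ids stands in for requirement 1: the set of symbol IDs still
        referenced anywhere, which a real runtime could only learn from a
        full mark pass over all values.
        """
        freed = 0
        for sym, spelling in enumerate(self._slots):
            if spelling is not None and sym not in live_ids:
                del self._ids[spelling]
                self._slots[sym] = None
                self._free.append(sym)
                freed += 1
        return freed

    def live_count(self):
        return len(self._ids)

    def slot_count(self):
        return len(self._slots)


table = CollectableSymbolTable()
keep = table.intern("include")
for n in range(100):
    table.intern(f"junk{n}")

freed = table.collect({keep})   # only "include" is still referenced
reused = table.intern("fresh")  # lands in a freed slot, table does not grow
```

Even in this model the table can still grow arbitrarily between two collect calls, which matches the remark that collection alone would not counter a LOAD-based attack.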
BrianH 3-Jan-2013 [5087] | As a basic screen, you can check the length of what you're loading. It can't blow out your memory much beyond twice the length of the source (once to read it, once for the results). |
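[Editor's sketch] A basic pre-LOAD screen along the lines suggested above might look like the following. It is written in Python for illustration only; the function name screen_source and both limits are hypothetical, and the distinct-token cap is one possible heuristic added here as an assumption, not something proposed in the discussion. Splitting on whitespace is a crude stand-in for real lexing.

```python
MAX_SOURCE_CHARS = 64 * 1024  # hypothetical size limit for untrusted input
MAX_DISTINCT_TOKENS = 1000    # hypothetical cap on new symbols per input


def screen_source(source, max_chars=MAX_SOURCE_CHARS,
                  max_distinct=MAX_DISTINCT_TOKENS):
    """Return True if the untrusted source passes the basic screen."""
    # Length check: bounds memory use to roughly twice the source size
    # (once to read it, once for the loaded result).
    if len(source) > max_chars:
        return False
    # Assumed heuristic: bound how many distinct spellings one input
    # could add to a symbol table.
    distinct = set(source.split())
    return len(distinct) <= max_distinct


ok = screen_source("#1 #2 #3")
too_long = screen_source("x" * (MAX_SOURCE_CHARS + 1))
too_many = screen_source(" ".join(f"#{n}" for n in range(MAX_DISTINCT_TOKENS + 1)))
```

A per-input check of this kind limits the damage of a single request; it does not bound growth accumulated over many requests.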
Gregg 3-Jan-2013 [5088x2] | I use issues for IDs, phone numbers, pseudo-GUIDs, and serial numbers. I use INCLUDE as well, and the other PREBOL bits that use them as keywords. Could I use a string for those things? Sure. But I like having a datatype with more meaning. |
Handy when parsing as well. | |
DocKimbel 4-Jan-2013 [5090x2] | Issue! datatype added: https://github.com/dockimbel/Red/commit/177b65e67dfc23b1fe7475686a65af49fee7e939 |
I think issue-as-string could still be useful, so I was wondering if supporting both would be a good idea. It could be achieved by adding a keyword! datatype; we could then have two syntaxes:
    #<keyword>   ;-- for issue-as-word (keyword! datatype!)
    ##<issue>    ;-- for issue-as-string (issue! datatype!)
What do you think? | |
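[Editor's sketch] The proposed two-syntax split can be illustrated with a tiny token classifier. This is Python, not Red, and it models only the proposal as worded above; the function name classify is hypothetical and the sketch ignores the rest of the lexical rules a real loader would apply.

```python
def classify(token):
    """Classify a token under the proposed #keyword / ##issue split."""
    if token.startswith("##"):
        # Double hash: issue-as-string, if something follows the prefix.
        return "issue!" if len(token) > 2 else "other"
    if token.startswith("#"):
        # Single hash: issue-as-word, if something follows the prefix.
        return "keyword!" if len(token) > 1 else "other"
    return "other"


examples = {t: classify(t) for t in ["#include", "#if", "##1234", "include"]}
```

Under this reading, preprocessor directives such as #include keep the lighter single-hash form, while string-like values such as IDs or serial numbers take the double hash.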
Arnold 4-Jan-2013 [5092] | makes sense to me. |
Kaj 4-Jan-2013 [5093] | I would prefer it to be the other way around |
DocKimbel 4-Jan-2013 [5094] | The main use for keywords is preprocessing directives. We are used to #include, #if, #either, ... rather than ##include, ##if, ##either, ... which look quite bad. I prefer to reserve the lighter syntax for the most frequent use-cases, which are keywords. |