World: r3wp
[!REBOL3-OLD1]
Geomol 19-Nov-2009 [19737] | Then you can add strict-lesser? and strict-greater?, but I wouldn't recommend it. :-) |
Chris 19-Nov-2009 [19738] | R2 difference:
R2
>> join url:: "%23"
== url::%23
R3
>> join url:: "%23"
== url::%2523 |
Geomol 19-Nov-2009 [19739x2] | The "%" is encoded as "%25" for urls in R3, which is correct, I think. >> to char! #"^(25)" == #"%" |
See e.g. http://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_reserved_characters | |
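[Editor's illustration] The double encoding Chris reports can be reproduced outside REBOL. The sketch below uses Python's standard `urllib.parse` as a stand-in; it is not R3's implementation, only a demonstration of what happens when an already-encoded string is encoded a second time.

```python
from urllib.parse import quote, unquote

already_encoded = "%23"          # "#" percent-encoded once

# Encoding again escapes the "%" itself, which matches the R3 result above.
twice = quote(already_encoded, safe="")
print(twice)                     # %2523

# One decode gives back the original escape; a second decode gives "#".
print(unquote(twice))            # %23
print(unquote(unquote(twice)))   # #
```

Whether the second encoding is "correct" depends on whether the input is treated as raw text or as an already-encoded URL fragment, which is the point the thread goes on to argue.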
Chris 19-Nov-2009 [19741x2] | This isn't helpful either: >> load join "url::" "%23" == url::# |
I know, but my string is already percent encoded... | |
Geomol 19-Nov-2009 [19743] | ah |
Maxim 19-Nov-2009 [19744x2] | but your string is not a url, it's a string. |
the string should be the decoded value of the url. | |
Chris 19-Nov-2009 [19746x4] | If it did a full url-encode, that'd be good, but it doesn't. |
Just percent encoding, of the percent symbol. | |
Is this a bug? - url::%23 and url::# are not the same: >> url::%23 == url::# | |
Blocked either way:
>> qs: to-webform [q "&=#"]
== "q=%26%3D%23"
>> join url:: qs
== url::q=%2526%253D%2523
>> load join "url::" qs
== url::q=&=# | |
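[Editor's illustration] What Chris is after is "encode once, then append as is". A minimal sketch in Python (standard library only, not REBOL; the `example.com` endpoint is a made-up placeholder) shows the behaviour he expected from `join`:

```python
from urllib.parse import urlencode

# Encode the form data exactly once (compare to-webform in the log).
qs = urlencode({"q": "&=#"})
print(qs)                            # q=%26%3D%23

# Plain concatenation leaves the existing escapes untouched.
base = "http://example.com/search"   # hypothetical endpoint
url = base + "?" + qs
print(url)                           # http://example.com/search?q=%26%3D%23
```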
Arie 19-Nov-2009 [19750] | Henrik: OK. Thanks! |
BrianH 19-Nov-2009 [19751x2] | I've been looking over R3's url handling and decoding and it needs more work, some of which needs to be in the native syntax. |
Chris, url::%23 and url::# should not be the same. The purpose of percent encoding is to allow you to specify character values without them being treated as syntax. If you specify a # directly in an http url, for instance, it should be taken as the start of the anchor portion of the url. If you percent encode it, it shouldn't be an anchor. | |
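[Editor's illustration] BrianH's distinction between a literal `#` and an encoded `%23` can be seen with any standards-following URL splitter. The sketch below uses Python's `urllib.parse` and a placeholder host (`example.com` is an assumption, not the URL from the thread):

```python
from urllib.parse import urlsplit, parse_qs

literal = urlsplit("http://example.com/search?q=#REBOL")
encoded = urlsplit("http://example.com/search?q=%23REBOL")

# A literal "#" ends the query and starts the fragment (anchor).
print(literal.query, literal.fragment)   # q= REBOL

# An encoded "#" stays inside the query as data.
print(encoded.query, encoded.fragment)   # q=%23REBOL (fragment is empty)
print(parse_qs(encoded.query))           # {'q': ['#REBOL']}
```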
Chris 19-Nov-2009 [19753x4] | Yep, hence the so far insurmountable problem I have. |
Even tried the equivalent of - read decode-url "uri::%23" - but somewhere it gets encoded again to %2523, not sure if that's specific to the http scheme implementation? I haven't dug enough. | |
Is there a reason why it should decode any percent encoded character on loading? | |
Here's the offender - how to make this work? -- http://search.twitter.com/search?q=%23REBOL | |
Gabriele 20-Nov-2009 [19757] | Chris: I have pointed out this flaw to Carl before R3 was started, and provided the correct code to handle URI according to the standards... |
Henrik 20-Nov-2009 [19758] | Is it in curecode? |
Chris 20-Nov-2009 [19759] | Don't seem to be able to register for curecode - get this message: "Sorry, this page cannot be displayed. Try again or contact the web site administrator" |
Henrik 20-Nov-2009 [19760] | ok, mention it in the curecode group, so dockimbel can look at it. |
BrianH 20-Nov-2009 [19761x4] | Gabriele, be sure to post the correct url parsing code here or in R3 chat. We will be sure to get it integrated into R3. Or you could integrate it yourself if you like. If there need to be specific changes to the url syntax as accepted by TRANSCODE, please note them here or in CureCode. Proper url handling is important, and now is the time to fix it. |
I have been thinking that urls should stay percent-encoded until they are decoded by DECODE-URL, so that percent-encoded characters won't be mistaken for syntax characters. (I don't claim this is my idea - I think you said it earlier, and I remember that.) Is this approach a good one? Have you thought of any gotchas or downsides to this? Will this require that urls have an associated decoded version that would be stored as well as the character version? Do you think we could get away with TRANSCODE enforcing the initial rules, then not checking again until it comes time for DECODE-URL to be called (on OPEN, for instance)? | |
Your code in Qtask for this was somewhat complex, but could be simplified with the new PARSE. Clarity is key here :) | |
The main gotcha so far to the keep-encoded approach is whether INSERT and APPEND should do some magic percent encoding or not. It seems that it may be a better approach to just assume that the programmer knows what they are doing and just insert what they say to insert as is, as long as the url character set restrictions are met. This would mean that the programmer would need to handle their own percent encoding where needed, and that INSERT or APPEND would not do any encoding or decoding. Or perhaps some non-syntax characters, such as space, could be encoded by MOLD instead of rejected and DECODE-URL just adjusted to not freak out when it sees them. What do you think? | |
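[Editor's illustration] The keep-encoded idea amounts to "split on syntax first, decode each piece afterwards". A rough Python sketch of why the order matters follows; it uses the standard library rather than DECODE-URL, and the URL is a hypothetical example:

```python
from urllib.parse import unquote, urlsplit, parse_qsl

raw = "http://example.com/search?q=%26%3D%23"   # hypothetical URL

# Late decoding: split while still encoded, then decode each component.
late = parse_qsl(urlsplit(raw).query)
print(late)          # [('q', '&=#')]

# Early decoding: the escapes become syntax characters before the split,
# so "&", "=" and "#" are misread as separators and the value is lost.
early_query = urlsplit(unquote(raw)).query
print(early_query)   # q=&=
```

The early-decoded form is also what `load join "url::" qs` produced earlier in the thread (`url::q=&=#`).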
Maxim 20-Nov-2009 [19765x3] | I vote for NO automatic encoding. |
it just makes everything totally confused and leads to very hard to fix bugs. | |
and breaks inter type linearity... if source is one type... something happens, when source is another type, something else happens... aaaarrrrggghhh :-( | |
BrianH 20-Nov-2009 [19768x2] | We have to do percent decoding to read urls. The question is when. |
Intertype linearity is more of a guideline anyways. If types behaved identically, there wouldn't be a point to more than one :) | |
Maxim 20-Nov-2009 [19770x2] | it's a question of taste, in R2 a lot of the series handling stuff in some types alienate me more than anything. |
for urls, I'll let you guys assess it... I'm the kind of guy that will do all with the string and just convert it to url at the end, its just much more useable that way... you have a better control over stuff like "/" in the path anyways. | |
Chris 20-Nov-2009 [19772] | I think I'd look for at least the following behaviour:
>> url::%23#
== url::%23#
>> join url:: "%23#"
== url::%23#
>> join url:: " " ; space is not in the uri spec, so could arguably be converted
== url::
>> read url::%23# ; dependent on the scheme, I guess
== "GET %23"
The problem with magic percent encoding is with the special characters. As it is now, it is impossible (so far as I can ascertain) to build an http url that encodes special characters eg "#=&%" - Twitter being a great case where an encoded # is integral to the service. Given though that the list of special characters is short and well defined, perhaps they could be the exception to a magic encoding rule. |
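[Editor's illustration] One possible reading of Chris's proposal is a join that passes the special characters and existing escapes through untouched and only converts characters that are not allowed in a URI at all, such as space. The Python sketch below is an assumption about what that rule could look like; `join_url` and `PASS_THROUGH` are hypothetical names, and `%20` for space is one plausible conversion, not something stated in the thread.

```python
from urllib.parse import quote

# Characters passed through untouched: the RFC 3986 reserved characters,
# plus "%" so that existing escapes are not encoded a second time.
PASS_THROUGH = ":/?#[]@!$&'()*+,;=%"

def join_url(base: str, part: str) -> str:
    """Hypothetical helper: append part, escaping only disallowed characters."""
    return base + quote(part, safe=PASS_THROUGH)

print(join_url("url::", "%23#"))   # url::%23#
print(join_url("url::", " "))      # url::%20
```

The trade-off is that a literal "%" meant as data can no longer be told apart from the start of an escape, so the caller has to encode it as `%25` themselves.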
Rudolf 21-Nov-2009 [19773x2] | I have noticed the new developments in specifying bitsets. The NOT feature is potentially useful but needs much more work. E.g. there is no way to programmatically find out that a bitset has been specified with NOT. Try the following code: >> equal? charset [" "] charset [not " "] == true |
Besides a logic-valued function to determine if a bitset is specified with NOT, one needs all functions (natives, actions) that work on bitsets to cater for the NOT-specification. So far, most of them plainly ignore this. | |
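[Editor's illustration] Rudolf's two requests, a way to query the NOT flag and operations that respect it, can be sketched with a small model. This is Python, not R3's bitset implementation; the `CharSet` class and `charset` helper are invented for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CharSet:
    chars: frozenset
    negated: bool = False          # the queryable "specified with NOT" flag

    def __contains__(self, ch):
        # Membership flips when the set is negated.
        return (ch in self.chars) != self.negated

def charset(chars, negate=False):
    """Hypothetical constructor mirroring charset [...] / charset [not ...]."""
    return CharSet(frozenset(chars), negate)

plain = charset(" ")
inverted = charset(" ", negate=True)

# Equality takes the flag into account, unlike the equal? result above.
print(plain == inverted)               # False
print(" " in plain, " " in inverted)   # True False
print(inverted.negated)                # True
```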
Pekr 21-Nov-2009 [19775] | I think that you should CureCode it :-) |
Gabriele 21-Nov-2009 [19776x4] | Brian... in how many places do I have to post it? Will a new place come out next here, and you'll tell me to make sure it's posted there? |
We have to do percent decoding to read urls. The question is when. - The standard TELLS you when... my document too... but since everything moves every few months, things get lost and forgotten. (besides, it could have been fixed back then, so there would be no need to worry about it now...) | |
next here | |
next here = "next year" | |
Pekr 21-Nov-2009 [19780x4] | what moves? |
There is CC for tickets, and there might be DocBase articles. One user "volunteered", reorganised it, and it got totally messy :-) | |
Then there is official R3 docs .... | |
BrianH: could you please look at my comment to #1343? :-) | |
Geomol 21-Nov-2009 [19784] | what moves? If you think, you might be able to figure out, which moves Gabriele talks about. (And you don't have to answer or comment this. Less noise and more thinking would be good for a change.) |
Rudolf 21-Nov-2009 [19785] | I have Curecoded part of it in #1328. I would be so happy to believe that all of this is still coming. Brian/Carl? |
Pekr 21-Nov-2009 [19786] | Geomol - my question was rhetorical. I think I do understand what Gabriele means, I just don't agree with the outcome. There are clear places where to post, easy as that. It is a bit difficult sometimes to get Carl's attention, but 80 tickets a month get such an attention. The development process of R3 might look chaotic, jumping from one area to the other, but if we want, and we care, we know how to get such an attention. I for one asked Carl privately about your concern towards R3 speed in certain situations. And you know what? I got some answer too. I asked Carl to comment to your ticket, he did so. In few hours. You could do just the same, no? It is very easy to become a naysayer, to express some worries, etc., but other thing is to actually act, not just talk, and then your saying applies - "less noise and more thinking (and acting) would be good for a change" :-) .... and please - I think I don't need any guides on what should I comment, or not. But the fact is, that I don't want to let anyone to dismiss the hard work which is being put into R3. I don't care about myself at all, but I see it at least as dishonest to those, who really try to bring R3 out, and we have few such friends here ... |