r3wp [groups: 83 posts: 189283]

World: r3wp

[!REBOL3-OLD1]

Geomol
19-Nov-2009
[19737]
Then you can add strict-lesser? and strict-greater?, but I wouldn't 
recommend it. :-)
Chris
19-Nov-2009
[19738]
R2 difference:

	R2
	>> join url:: "%23"
	== url::%23

	R3
	>> join url:: "%23"
	== url::%2523
Geomol
19-Nov-2009
[19739x2]
The "%" is encoded as "%25" for urls in R3, which is correct, I think.

>> to char! #"^(25)"
== #"%"
See e.g. http://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_reserved_characters
Chris
19-Nov-2009
[19741x2]
This isn't helpful either:

	>> load join "url::" "%23"
	== url::#
I know, but my string is already percent encoded...
Geomol
19-Nov-2009
[19743]
ah
Maxim
19-Nov-2009
[19744x2]
but your string is not a url, it's a string.
the string should be the decoded value of the url.
Chris
19-Nov-2009
[19746x4]
If it did a full url-encode, that'd be good, but it doesn't.
Just percent encoding, of the percent symbol.
Is this a bug? - url::%23 and url::# are not the same:

	>> url::%23
	== url::#
Blocked either way:

	>> qs: to-webform [q "&=#"]        
	== "q=%26%3D%23"

	>> join url:: qs
	== url::q=%2526%253D%2523

	>> load join "url::" qs
	== url::q=&=#
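Both failure modes in the session above can be reproduced outside REBOL. The following is a minimal sketch in Python, where `urllib.parse` stands in for R3's url handling (an assumption made purely for illustration, not a description of how R3 is implemented): encoding an already-encoded string doubles every escape, and decoding it too early turns the escapes back into syntax characters.

```python
from urllib.parse import quote, unquote, urlencode

# Build a query string roughly the way to-webform does in the session above.
qs = urlencode({"q": "&=#"})
print(qs)              # q=%26%3D%23

# Encoding a second time escapes each "%" as "%25" (compare the JOIN result).
double_encoded = quote(qs, safe="=")
print(double_encoded)  # q=%2526%253D%2523

# Decoding early restores the raw syntax characters (compare the LOAD result).
decoded_early = unquote(qs)
print(decoded_early)   # q=&=#
```

Neither result can be sent to a server as the intended query, which is the "blocked either way" situation.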
Arie
19-Nov-2009
[19750]
Henrik: OK.  Thanks!
BrianH
19-Nov-2009
[19751x2]
I've been looking over R3's url handling and decoding and it needs 
more work, some of which needs to be in the native syntax.
Chris, url::%23 and url::# should not be the same. The purpose of 
percent encoding is to allow you to specify character values without 
them being treated as syntax. If you specify a # directly in an http 
url, for instance, it should be taken as the start of the anchor 
portion of the url. If you percent encode it, it shouldn't be an 
anchor.
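The distinction BrianH describes can be sketched with Python's standard url splitter, used here only as a stand-in for a standards-following parser (R3's own parser is not shown or assumed). The host and path are the Twitter search url cited in this thread: a literal `#` starts the anchor (fragment), while `%23` stays inside the query as data.

```python
from urllib.parse import urlsplit, unquote

raw = urlsplit("http://search.twitter.com/search?q=#REBOL")
escaped = urlsplit("http://search.twitter.com/search?q=%23REBOL")

# A literal "#" ends the query and begins the fragment.
print(repr(raw.query), repr(raw.fragment))          # 'q=' 'REBOL'

# The escaped form keeps "#" as part of the query value.
print(repr(escaped.query), repr(escaped.fragment))  # 'q=%23REBOL' ''

# Decoding is only safe after the split has been done.
print(unquote(escaped.query))                       # q=#REBOL
```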
Chris
19-Nov-2009
[19753x4]
Yep, hence the so far insurmountable problem I have.
Even tried the equivalent of - read decode-url  "uri::%23" - but 
somewhere it gets encoded again to %2523, not sure if that's specific 
to the http scheme implementation? I haven't dug enough.
Is there a reason why it should decode any percent encoded character 
on loading?
Here's the offender - how to make this work? -- http://search.twitter.com/search?q=%23REBOL
Gabriele
20-Nov-2009
[19757]
Chris: I have pointed out this flaw to Carl before R3 was started, 
and provided the correct code to handle URI according to the standards...
Henrik
20-Nov-2009
[19758]
Is it in curecode?
Chris
20-Nov-2009
[19759]
Don't seem to be able to register for curecode - get this message: 
"Sorry, this page cannot be displayed. Try again or contact the web 
site administrator"
Henrik
20-Nov-2009
[19760]
ok, mention it in the curecode group, so dockimbel can look at it.
BrianH
20-Nov-2009
[19761x4]
Gabriele, be sure to post the correct url parsing code here or in 
R3 chat. We will be sure to get it integrated into R3. Or you could 
integrate it yourself if you like. If there need to be specific changes 
to the url syntax as accepted by TRANSCODE, please note them here 
or in CureCode. Proper url handling is important, and now is the 
time to fix it.
I have been thinking that urls should stay percent-encoded until 
they are decoded by DECODE-URL, so that percent-encoded characters 
won't be mistaken for syntax characters. (I don't claim this is my 
idea - I think you said it earlier, and I remember that.)


Is this approach a good one? Have you thought of any gotchas or downsides 
to this? Will this require that urls have an associated decoded version 
that would be stored as well as the character version? Do you think 
we could get away with TRANSCODE enforcing the initial rules, then 
not checking again until it comes time for DECODE-URL to be called 
(on OPEN, for instance)?
Your code in Qtask for this was somewhat complex, but could be simplified 
with the new PARSE. Clarity is key here :)
The main gotcha so far to the keep-encoded approach is whether INSERT 
and APPEND should do some magic percent encoding or not. It seems 
that it may be a better approach to just assume that the programmer 
knows what they are doing and just insert what they say to insert 
as is, as long as the url character set restrictions are met. This 
would mean that the programmer would need to handle their own percent 
encoding where needed, and that INSERT or APPEND would not do any 
encoding or decoding. Or perhaps some non-syntax characters, such 
as space, could be encoded by MOLD instead of rejected and DECODE-URL 
just adjusted to not freak out when it sees them. What do you think?
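The keep-encoded approach can be sketched as follows. This is a hypothetical Python model, not R3 code: `EncodedUrl`, its `append` and its `decode` are invented names that loosely mirror a url! value, APPEND and DECODE-URL. The text is stored exactly as given, `append` only checks the character set (here taken from RFC 3986, plus `%`), and decoding happens once, after the url has been split on its syntax characters.

```python
import re
from urllib.parse import urlsplit, unquote, parse_qsl

# Characters RFC 3986 allows in a URI, plus "%" for escapes.
_ALLOWED = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")


class EncodedUrl:
    """Hypothetical url value that keeps its percent-encoded text untouched."""

    def __init__(self, text=""):
        self.text = ""
        self.append(text)

    def append(self, piece):
        # No magic encoding or decoding: reject illegal characters,
        # insert everything else exactly as the programmer wrote it.
        if not _ALLOWED.fullmatch(piece):
            raise ValueError("character not allowed in a url: %r" % piece)
        self.text += piece
        return self

    def decode(self):
        # Split on the syntax characters first, decode each part afterwards.
        parts = urlsplit(self.text)
        return {
            "scheme": parts.scheme,
            "host": parts.netloc,
            "path": unquote(parts.path),
            "query": parse_qsl(parts.query),
            "fragment": unquote(parts.fragment),
        }


# example.com is a placeholder host.
u = EncodedUrl("http://example.com/search?q=").append("%23REBOL")
print(u.text)               # http://example.com/search?q=%23REBOL
print(u.decode()["query"])  # [('q', '#REBOL')]
```

In this model a raw space passed to `append` raises an error, which corresponds to the "rejected" option; encoding it on output instead would be the MOLD variant mentioned above.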
Maxim
20-Nov-2009
[19765x3]
I vote for NO automatic encoding.
it just makes everything totally confused and leads to very hard 
to fix bugs.
and breaks inter type linearity... if source is one type... something 
happens, when source is another type, something else happens... aaaarrrrggghhh 
 :-(
BrianH
20-Nov-2009
[19768x2]
We have to do percent decoding to read urls. The question is when.
Intertype linearity is more of a guideline anyway. If types behaved 
identically, there wouldn't be a point to having more than one :)
Maxim
20-Nov-2009
[19770x2]
it's a question of taste, in R2 a lot of the series handling stuff 
in some types alienates me more than anything.
for urls, I'll let you guys assess it... I'm the kind of guy that 
will do everything with the string and just convert it to a url at the end, 
it's just much more usable that way... you have better control 
over stuff like "/" in the path anyway.
Chris
20-Nov-2009
[19772]
I think I'd look for at least the following behaviour:

	>> url::%23#
	== url::%23#
	>> join url:: "%23#"
	== url::%23#

	>> join url:: " " ; space is not in the uri spec, so could arguably be converted
	== url:: 
	>> read url::%23# ; dependent on the scheme, I guess
	== "GET %23"


The problem with magic percent encoding is with the special characters. 
 As it is now, it is impossible (so far as I can ascertain) to build 
an http url that encodes special characters eg "#=&%" - Twitter being 
a great case where an encoded # is integral to the service.  Given 
though that the list of special characters is short and well defined, 
perhaps they could be the exception to a magic encoding rule.
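The exception Chris proposes can be sketched with Python's `quote`, again only as a stand-in for whatever JOIN would do in R3. The assumption here is that "special characters" means the RFC 3986 reserved set plus `%` itself: those pass through untouched, while characters outside the URI character set, such as space, are still encoded.

```python
from urllib.parse import quote

# RFC 3986 reserved characters, plus "%" so existing escapes survive.
SPECIAL = ":/?#[]@!$&'()*+,;=" + "%"


def join_url(base, piece):
    """Hypothetical JOIN: encode only what is not legal url text."""
    return base + quote(piece, safe=SPECIAL)


print(join_url("url::", "%23#"))  # url::%23#
print(join_url("url::", " "))     # url::%20
print(join_url("http://search.twitter.com/search?q=", "%23REBOL"))
```

The trade-off is that a literal percent sign in the data must then be written as `%25` by the programmer, since the function can no longer tell it apart from the start of an escape.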
Rudolf
21-Nov-2009
[19773x2]
I have noticed the new developments in specifying bitsets. The NOT 
feature is potentially useful but needs much more work. E.g. there 
is no way to programmatically find out that a bitset has been specified 
with NOT. Try the following code: 
>> equal? charset [" "] charset [not " "] 
== true
Besides a logic-valued function to determine if a bitset is specified 
with NOT, one needs all functions (natives, actions) that work on 
bitsets to cater for the NOT-specification. So far, most of them 
plainly ignore this.
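What Rudolf asks for can be modelled in a few lines. This is a hypothetical Python sketch, not a description of R3's bitset! internals: `CharSet` and `is_negated` are invented names. The negation is stored as a flag next to the members, membership tests respect it, and equality compares both fields, so a set and its complement no longer compare equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharSet:
    """Hypothetical bitset: listed characters plus a NOT flag."""
    chars: frozenset
    negated: bool = False

    def __contains__(self, ch):
        # With the flag set, membership is inverted.
        return (ch in self.chars) != self.negated

    def is_negated(self):
        # The logic-valued query that is missing for bitsets made with NOT.
        return self.negated


plain = CharSet(frozenset(" "))
inverted = CharSet(frozenset(" "), negated=True)

print(plain == inverted)      # False
print(" " in plain)           # True
print(" " in inverted)        # False
print(inverted.is_negated())  # True
```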
Pekr
21-Nov-2009
[19775]
I think that you should CureCode it :-)
Gabriele
21-Nov-2009
[19776x4]
Brian... in how many places do I have to post it? Will a new place 
come out next here, and you'll tell me to make sure it's posted there?
We have to do percent decoding to read urls. The question is when.

 - The standard TELLS you when... my document too... but since everything 
 moves every few months, things get lost and forgotten. (besides, 
 it could have been fixed back then, so there would be no need to 
 worry about it now...)
"next here" = "next year"
Pekr
21-Nov-2009
[19780x4]
what moves?
There is CC for tickets, and there might be DocBase articles. One 
user "volunteered", reorganised it, and it got totally messy :-)
Then there is official R3 docs ....
BrianH: could you please look at my comment to #1343? :-)
Geomol
21-Nov-2009
[19784]
what moves?


If you think about it, you might be able to figure out which moves Gabriele 
is talking about. (And you don't have to answer or comment on this. Less noise 
and more thinking would be good for a change.)
Rudolf
21-Nov-2009
[19785]
I have Curecoded part of it in #1328. I would be so happy to believe 
that all of this is still coming. Brian/Carl?
Pekr
21-Nov-2009
[19786]
Geomol - my question was rhetorical. I think I do understand what 
Gabriele means, I just don't agree with the outcome. There are clear 
places to post, as easy as that. It is a bit difficult sometimes 
to get Carl's attention, but 80 tickets a month get such attention. 
The development process of R3 might look chaotic, jumping from one 
area to the other, but if we want, and we care, we know how to get 
such attention. 


I for one asked Carl privately about your concern about R3 speed 
in certain situations. And you know what? I got an answer too. 
I asked Carl to comment on your ticket, and he did so. In a few hours. 
You could do just the same, no? It is very easy to become a naysayer, 
to express some worries, etc., but it is another thing to actually act, 
not just talk, and then your saying applies - "less noise and more 
thinking (and acting) would be good for a change" :-)


.... and please - I don't think I need any guidance on what I should 
comment on, or not. But the fact is that I don't want to let anyone 
dismiss the hard work which is being put into R3. I don't care 
about myself at all, but I see it as at least dishonest to those 
who really try to bring R3 out, and we have a few such friends here 
...