
World: r4wp

[!REBOL3] General discussion about REBOL 3

Gregg
31-Mar-2013
[2170x5]
I have an updated SPLIT-PATH, modeled on Ladislav's implementation 
where it holds that

   file = rejoin split-path file


This does not match current REBOL behavior. His version arguably 
makes more sense, but will break code in cases like this:

	%/c/test/test2/ 
	REBOL      == [%/c/test/ %test2/]
	Ladislav's == [%/c/test/test2/ %""]


Ladislav's func only seems to go really wrong in the case of ending 
with a slash, and that's the only slash in the value, which returns an 
empty path and the entire filespec as the target.

Schemes (http://) don't work well either.


REBOL also dirizes the file path if it's %. or %.., which Ladislav's 
does not. e.g.

	[%foo/ %../]  == split-path %foo/..
split-path: func [
	"Returns a block containing a path and target, by splitting a filespec."
	filespec [any-string!]
	/local target
][
	either any [
		; It's a url ending with a slash. This doesn't account for
		; formed URLs. To do that, we would have to search for "://"
		all [slash = last filespec]
		all [url? filespec  slash = last filespec]
		; Only one slash, and it's at the tail.
		all [target: find/tail filespec slash  tail? target]
	][
		reduce [copy filespec  copy %""]
	][
		target: tail filespec
		if slash = last target [decr target]
		target: any [find/reverse/tail target slash  filespec]
		reduce [copy/part filespec target  to file! target]
	]
]
The above matches Ladislav's REJOIN requirement, and handles a couple 
edge cases better. I have about 35 tests here, if people want to 
see them for discussion.
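
As a rough illustration of the REJOIN requirement mentioned above, the check can be written as a small helper. The name rejoin-ok? is hypothetical and exists only for this sketch. The two result blocks are the ones quoted earlier in the thread for %/c/test/test2/.

```rebol
; Hypothetical helper for illustration; not part of REBOL or of the proposal.
rejoin-ok?: func [
    "True if REJOIN of a split result gives back the original file."
    file  [file!]
    parts [block!] "Block of [path target], as a split function would return"
][
    file = rejoin parts
]

; Both results quoted above for %/c/test/test2/ rebuild the same string,
; so REJOIN alone does not tell the two behaviors apart for this input.
print rejoin-ok? %/c/test/test2/ [%/c/test/ %test2/]
print rejoin-ok? %/c/test/test2/ [%/c/test/test2/ %""]
```

The invariant constrains only the concatenation of the two parts, which is why the question of where a trailing directory belongs stays open.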
It leaves open the question of what the best results are in cases 
where the target is a dir. Should it be part of the path, returning 
no target? Should it be the target? Should it be the target if there 
is no trailing slash, but if there is a trailing slash it should 
be part of the path?
And could/should it be generalized by adding a /WITH option to specify 
a path delimiter other than slash?
Ladislav
31-Mar-2013
[2175]
Well, it really is worth it to find out what the preferences are 
and whether people like the "invariant" I proposed.
sqlab
1-Apr-2013
[2176]
why not use %. as the last element of split-path in case of a directory?
true = dir? %.
Gregg
1-Apr-2013
[2177x4]
It makes sense to me Anton. I don't know why SPLIT-PATH does what 
it does today, by automatically dirizing that result. If everyone 
agrees, then the next question is whether a trailing %. or %.. should 
be returned as part of the path, or as the target. That is, do we 
presume that they are directories?


SPLIT-PATH, today, returns the last dir in the path as the target, 
if the path ends in a dir. Here are some example values, and what 
SPLIT-PATH returns today.
;	%/c/test/test2/ [%/c/test/ %test2/]
;	%/c/test/test2  [%/c/test/ %test2]
;	%/c/test        [%/c/ %test]
;	%//test         [%// %test]
;	%/test          [%/ %test]
;	%/c/            [%/ %c/]
;	%/              [%/ (none)]
;	%//             [%/ %/]
;	%.              [%./ (none)]
;	%./             [%./ (none)]
;	%./.            [%./ %./]
;	%..             [%../ (none)]
;	%../            [%../ (none)]
;	%../..          [%../ %../]
;	%../../test     [%../../ %test]
;	%foo/..         [%foo/ %../]
;	%foo/.          [%foo/ %./]
;	%foo/../.       [%foo/../ %./]
;	%foo/../bar     [%foo/../ %bar]
;	%foo/./bar      [%foo/./ %bar]
To me, it's a matter of whether SPLIT-PATH should be consistent in 
how it handles the path, as a string to process, or whether it should 
try to be "helpful". The problem with being helpful is that it may 
make other things harder.
By saying that SPLIT-PATH always behaves the same way, depending 
on whether the path ends with a slash or not, it may not shortcut 
a few cases for us, but it does make it easy to reason about, and 
also to wrap for other behavior. e.g., you can always dirize the 
path before calling it.
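
A minimal sketch of the wrapping idea, assuming a SPLIT-PATH that only looks at the trailing slash. The name split-dir is hypothetical. DIRIZE is the standard function that ensures a trailing slash.

```rebol
; Hypothetical wrapper: always treat the argument as a directory
; by forcing a trailing slash (DIRIZE) before splitting.
split-dir: func [
    "Splits a filespec after making sure it ends with a slash."
    filespec [file!]
][
    split-path dirize filespec
]
```

The result depends on whichever SPLIT-PATH is loaded. The point is only that the caller, not SPLIT-PATH, decides when a value should be treated as a directory.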
Andreas
1-Apr-2013
[2181x2]
One test missing in your collection:
%foo [%./ %foo]
Also:
%"" [%./ %""]
Gregg
1-Apr-2013
[2183]
Thanks Andreas!
sqlab
1-Apr-2013
[2184x2]
after thinking again, I would prefer %./ as the last part of the 
result of split-path, as it has a trailing slash and it is still 
the same
if the argument was a directory
Gregg
1-Apr-2013
[2186]
What do you mean, if the arg was a directory? Can you give an example 
each way?
sqlab
1-Apr-2013
[2187]
split-path %test/ should give [%test/ %./]
Gregg
1-Apr-2013
[2188]
Why would that return anything for the target? That is, why not  
[%test/ %""]
sqlab
1-Apr-2013
[2189]
%"" looks strange, even if it's allowed. %./ has a trailing slash, 
if someone wants to test for that
Andreas
1-Apr-2013
[2190]
I think I would prefer split-path to split into the last non-slash 
component (target), and the original path with that last non-slash 
component removed.
Gregg
1-Apr-2013
[2191x3]
Anton, but then you could never get an empty target, and you would 
have to compare to %./ as your empty value.
And then make sure that wasn't the end of the original path.
Andreas, can you give examples, to make sure I'm clear?
Andreas
1-Apr-2013
[2194x2]
Would behave mostly as the current split-path does.
(For the common cases.)
Gregg
1-Apr-2013
[2196]
So this is OK for you: %/c/test/test2/ [%/c/test/ %test2/]
Andreas
1-Apr-2013
[2197x3]
Yes, that's what I'd expect.
I'd also prefer a stronger invariant, as REJOIN is relatively weak 
for joining path components.
Something more along the lines of

set [d b] split-path f
f = d/:b
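
The path-based invariant above could be turned into a test helper along these lines. This is a sketch only: path-ok? is a hypothetical name, the sample inputs are taken from values discussed in this thread, and the outcome depends on which SPLIT-PATH is loaded.

```rebol
; Hypothetical test helper; SPLIT-PATH is whichever version is loaded.
path-ok?: func [
    "True if the two parts, joined as a path, equal the original."
    f [file! url!]
    /local d b
][
    set [d b] split-path f
    f = d/:b
]

; Report the inputs that break the invariant.
foreach f [%/c/test/test2/ %/c/test %foo/.. %.] [
    unless path-ok? f [print ["Invariant failed:" mold f]]
]
```

Unlike REJOIN, the d/:b form joins the parts as path components, so it also tests how an empty or NONE target behaves.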
sqlab
1-Apr-2013
[2200]
what do you mean with an empty target?  %./ just means the target 
is a directory, the actual directory
Gregg
1-Apr-2013
[2201x2]
OK, using that as a quality test, here's where the current SPLIT-PATH 
fails:
Path quality failed: %/ %/none
Path quality failed: %// %/
Path quality failed: %. %./none
Path quality failed: %./ %./none
Path quality failed: %./. %././
Path quality failed: %.. %../none
Path quality failed: %../ %../none
Path quality failed: %../.. %../../
Path quality failed: %foo/.. %foo/../
Path quality failed: %foo/. %foo/./
Path quality failed: %foo/../. %foo/.././
Path quality failed: http:// http:/
Path quality failed: http://.. http://../
Path quality failed: http://. http://./
Path quality failed: http://../. http://.././
And here's where my proposed SPLIT-PATH fails:
Path quality failed: %. %/.
Path quality failed: %.. %/..
Andreas
1-Apr-2013
[2203]
%. is tricky :)
Gregg
1-Apr-2013
[2204]
Anton, which is the behavior question. Do you expect SPLIT-PATH to 
return a target you can write to (i.e. a file)?
Andreas
1-Apr-2013
[2205]
Here's a few example values and what I would expect: 

http://sprunge.us/AaDJ


Where there is a third column, current R3 split-path differs from 
what I'd expect, and the third column is what split-path returns 
currently.
Gregg
1-Apr-2013
[2206]
Great! Since I haven't had coffee yet, the second column is *always* 
what you expect, correct?
Andreas
1-Apr-2013
[2207]
Yes.
Gregg
1-Apr-2013
[2208]
Got it.
Andreas
1-Apr-2013
[2209x2]
But with a "path component"-based invariant, the %. %.. and %/ cases 
will require more work to reconcile.


With a "string"-based invariant (rejoin), those cases could more 
easily be described with the neutral %"" element:
Here's some examples based on a string-based invariant:
http://sprunge.us/VeeH


Which is, I guess, what your proposed split-path already implements 
:)
Gregg
1-Apr-2013
[2211]
Yes, I believe that matches my current proposal.
Andreas
1-Apr-2013
[2212]
Basically strips off everything past the last / as target.
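
For reference, "everything past the last / is the target" can be sketched in a few lines. This is not Gregg's function, only a hypothetical reduction of the string-based idea. The name split-at-last-slash is made up, and URLs are not handled.

```rebol
; Hypothetical sketch of a purely string-based split.
split-at-last-slash: func [
    "Returns [path target]; target is everything after the last slash."
    filespec [file!]
    /local mark
][
    ; Position just past the last slash, or the head if there is no slash.
    mark: find/last filespec slash
    mark: either mark [next mark] [head filespec]
    reduce [copy/part filespec mark  copy mark]
]
```

Because the two parts are cut from the same string at one position, rejoining them gives back the input by construction. A value with no slash ends up with an empty path and the whole value as the target.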
Gregg
1-Apr-2013
[2213]
And where it fails the path (p/:t) invariant is in these cases:
Path quality failed: %"" %/
Path quality failed: %foo %/foo
Path quality failed: %. %/.
Path quality failed: %.. %/..
Andreas
1-Apr-2013
[2214]
A slightly better path-based invariant:

set [d b] split-path f
clean-path/only f = clean-path/only d/:b
Gregg
1-Apr-2013
[2215x4]
It doesn't match your expected results in a number of cases though.
I don't have a *nix VM up here to check basename and dirname results.
That might be something else to consider.
Using the clean-path test, here's where my proposed version fails:

Path quality failed: %"" %/
         %""     ; clean-path test
         %/      ; clean-path p/:t
Path quality failed: %foo %/foo
         %foo    ; clean-path test
         %/foo   ; clean-path p/:t
Andreas
1-Apr-2013
[2219]
My first examples should match dirname/basename exactly.