World: r4wp

[!REBOL3] General discussion about REBOL 3

Andreas
1-Apr-2013
[2210]
Here are some examples based on a string-based invariant:
http://sprunge.us/VeeH


Which is, I guess, what your proposed split-path already implements :)
Gregg
1-Apr-2013
[2211]
Yes, I believe that matches my current proposal.
Andreas
1-Apr-2013
[2212]
Basically strips off everything past the last / as target.
Gregg
1-Apr-2013
[2213]
And where it fails the path (p/:t) invariant is in these cases:
Path quality failed: %"" %/
Path quality failed: %foo %/foo
Path quality failed: %. %/.
Path quality failed: %.. %/..
Andreas
1-Apr-2013
[2214]
A slightly better path-based invariant:

set [d b] split-path f
(clean-path/only f) = clean-path/only d/:b
Gregg
1-Apr-2013
[2215x4]
It doesn't match your expected results in a number of cases though.
I don't have a *nix VM up here to check basename and dirname results.
That might be something else to consider.
Using the clean-path test, here's where my proposed version fails:

Path quality failed: %"" %/
         %""     ; clean-path test
         %/      ; clean-path p/:t
Path quality failed: %foo %/foo
         %foo    ; clean-path test
         %/foo   ; clean-path p/:t
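A minimal sketch of how such a check could be run over a list of test files, assuming REBOL 3's CLEAN-PATH/ONLY. The function name CHECK-PATH-INVARIANT and its message format are hypothetical and are not the harness that produced the output above:

```rebol
; Hypothetical harness: report each file for which the clean-path
; invariant does not hold for the built-in SPLIT-PATH.
check-path-invariant: func [files [block!] /local d b] [
    foreach f files [
        set [d b] split-path f
        if any [
            none? d    ; d/:b cannot be formed without a directory part
            none? b    ; ... or without a target part
            (clean-path/only f) <> clean-path/only d/:b
        ][
            print ["Invariant failed:" mold f "split as" mold d mold b]
        ]
    ]
]

check-path-invariant [%"" %foo %. %.. %/c/test/test2/]
```

The parentheses around the left-hand side matter: without them the comparison would be evaluated first and its logic! result passed to CLEAN-PATH.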
Andreas
1-Apr-2013
[2219]
My first examples should match dirname/basename exactly.
Gregg
1-Apr-2013
[2220]
OK, great.
Andreas
1-Apr-2013
[2221x2]
In effect, that is :)
dirname/basename does a clean-path before splitting.
Gregg
1-Apr-2013
[2223x2]
:-) Let me play with an idea here for a bit.
Let's widen the discussion a bit. Splitting a string at a delimiter. 
Easy enough to define clear behavior if the series contains the delimiter, 
but what if it doesn't? Most split funcs return an array, splitting 
at each dlm. If no dlm, return the original series as the only element 
in that array. 


What if we always want to return two elements? e.g., we have a SPLIT-AT 
func that can split a series into two parts, given either an integer 
index or value to match. Let's also give it a /LAST refinement, so 
it can split at the last matching value found, like FIND/LAST works. 


Given that, what do you expect in the case where the dlm (e.g. "=") 
is not in the series?

    SPLIT-AT "abcdef" "="   == [? ?]
    SPLIT-AT/LAST "abcdef" "="    == [? ?]
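To make the question concrete, here is a minimal sketch of one possible SPLIT-AT. It is hypothetical and not an existing mezzanine. It assumes one particular answer to the open question: when the delimiter is missing, the whole series is the first part and the second part is empty.

```rebol
; Hypothetical sketch of SPLIT-AT; always returns a two-element block.
split-at: func [
    series [series!]
    dlm "Integer index, or a value to search for"
    /last "Split at the last occurrence of the value"
    /local pos
][
    either integer? dlm [
        reduce [copy/part series dlm  copy skip series dlm]
    ][
        pos: either last [find/last series dlm] [find series dlm]
        either pos [
            ; pos sits on the delimiter; FIND/TAIL skips past it
            reduce [copy/part series pos  copy find/tail pos dlm]
        ][
            ; Assumed behaviour: nothing to split, second part is empty
            reduce [copy series  copy tail series]
        ]
    ]
]

split-at "abcdef" "="        ; == ["abcdef" ""] under the assumption above
split-at/last "a=b=c" "="    ; == ["a=b" "c"]
```

Returning NONE for the missing part, or putting the whole series in the second slot for /LAST, would be equally easy to write; which is preferable is exactly the question being asked.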
Maxim
1-Apr-2013
[2225]
I haven't had the time to follow all the discussion in detail, but 
to me, the second part of split-path should NEVER return a directory 
path. 


when doing   set [dir file]  I should be able to count on the fact 
that the second part is either a file or none.  The same for the 
first part which should always be none or a dir.  I have my own implementation 
in R2 which makes this strict and it simplifies a lot of code.
so we can do with absolute certainty:

if second set [dir file] split  path [   ]


IIRC some of the versions of my split perform a clean-path to simplify 
and add robustness to the result.
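A minimal sketch of what such a strict split could look like. This is not Maxim's R2 implementation, which is not shown here; the name STRICT-SPLIT-PATH is hypothetical, and it decides "directory or file" purely from the trailing slash, without touching the file system or calling CLEAN-PATH.

```rebol
; Hypothetical sketch: the first part is a directory or NONE,
; the second part is a file or NONE.
strict-split-path: func [
    path [file!]
    /local pos
][
    either pos: find/last/tail path #"/" [
        reduce [
            copy/part path pos                  ; everything up to the last slash
            either tail? pos [none] [copy pos]  ; nothing after the slash: no file
        ]
    ][
        ; No slash at all: no directory information was given
        reduce [none  either empty? path [none] [copy path]]
    ]
]

strict-split-path %/c/test/test2/   ; == [%/c/test/test2/ none]
strict-split-path %foo              ; == [none %foo]
```

Note that this sketch still reports %. and %.. as files, because they carry no trailing slash.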
Gregg
1-Apr-2013
[2226]
Thanks for posting Max. With 5 of us talking about it, we have 5 
opinions so far. :-) The one thing we all seem to agree on is that 
we want consistent behavior, which we don't have right now.
sqlab
2-Apr-2013
[2227]
Maxim's method sounds reasonable.
Ladislav
2-Apr-2013
[2228x6]
Re: 

One test missing in your collection:
%foo [%./ %foo]

- this test violates the "invariant"
Re:

%/c/test/test2/ [%/c/test/ %test2/]


- this test does not violate anything but it does not split the "pathfile" 
to "path" and "file" parts
hmm, I may be wrong in "it does not violate anything" - in fact, 
it contradicts the help string of the function
(I do not object against adjustment, but would expect the help string 
to be changed as well to be compatible with this)
(if it is the preferred behaviour)
Regarding the split-path behaviour in the %foo case: I strongly object 
to the proposal to obtain [%. %foo], since, for example, INCLUDE, 
when obtaining %foo with an empty path, uses INCLUDE-CTX/PATH to find 
%foo (which may even exclude the %./ directory if it is not in INCLUDE-CTX/PATH), 
while when obtaining %./foo it just finds the file in the current 
directory (which is not equivalent).
Maxim
2-Apr-2013
[2234x2]
IMHO %foo should return  [none %foo]
split-path shouldn't invent information which isn't given to it
Ladislav
2-Apr-2013
[2236]
As said, I prefer [%"" %foo] to have the invariant that file = rejoin 
split-path file
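The string invariant can be written as a one-line predicate. A minimal sketch, where the name REJOIN-INVARIANT? is hypothetical:

```rebol
; Hypothetical helper: TRUE when gluing the two parts back together
; reproduces the original file value exactly.
rejoin-invariant?: func [file [file!]] [
    file = rejoin split-path file
]

rejoin-invariant? %foo
```

A split of %foo into [%"" %foo] satisfies this check, while a split into [%./ %foo] does not, since the parts rejoin to %./foo.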
Maxim
2-Apr-2013
[2237]
question is, is that invariant useful? 


really, I like consistency almost above all else, but I prefer when 
it is not just neutral.   getting empty file specs is very awkward to 
use and doesn't work well with all the none transparency which makes 
a lot of the conditional code so easy to read.  one reason this is 
so readable in REBOL is the limited use of equality operations, when 
doing complex decision making.
Ladislav
2-Apr-2013
[2238]
question is, is that invariant useful?
 - for me it is, but, of course, it is a matter of preference
Maxim
2-Apr-2013
[2239]
I agree.
Bo
2-Apr-2013
[2240]
I prefer

split-path %foo
== [%./ %foo]


The reason is because I believe split-path shouldn't require an extra 
check if all you want to do is read the base directory that a file 
is in.  I think this is a common use of split-path.
Ladislav
2-Apr-2013
[2241x2]
hmm, but Bo, that would make

    %foo

equivalent to

    %./foo

, which is not good IMO
(just because they *are not* equivalent)
Gregg
2-Apr-2013
[2243]
Do our preferences come from the basic difference of whether we want 
SPLIT-PATH to be "smart" about file specs, or whether it should assume 
nothing (the REJOIN invariant case)? For example, Andreas's path 
invariant (p/:t) makes a lot of sense, but some of his examples' 
results look wrong when just viewed as results. e.g.:

;   %/              [%/ %/]
;   %//             [%/ %/]
;   %./             [%./ %./]
Ladislav
2-Apr-2013
[2244]
In my opinion, the behaviour is neither simple nor useful.
Gregg
2-Apr-2013
[2245x2]
And the reason I posted the SPLIT-AT question was to see if we could 
find a solution for both.
Ladislav, you mean the examples I just posted?
Ladislav
2-Apr-2013
[2247]
Yes, the most recently posted behaviour examples.
Gregg
2-Apr-2013
[2248x4]
Got it. As you might all guess, since my proposal is most like Ladislav's, 
that's my current preference. As I posted, it only misses a couple 
edge cases to also meet the path invariant. 


For Max, I understand the value of NONE. So much so that I have a 
NONE-OR-EMPTY? mezz.
For reference, those cases are:

Path quality failed: %. %/.
Path quality failed: %.. %/..
And while I understand that a file with no path implies the current 
directory, we lose information by assuming it. For example, if I 
let a user specify a path or filename, and I split it, now I can't 
tell if they gave me just a filename, or if they gave me %./<file>.
Can we resolve our differences with a refinement?
Ladislav
2-Apr-2013
[2252x2]
we have got quite a few combinations to consider:

missing path:

* yielding %.
* yielding %""
* yielding #[none] 

missing file

* yielding the last directory in the path
* yielding %""
* yielding #[none]


In total that is 6 variants but only some combinations make sense, 
I think
sorry, I mean 9
Gregg
2-Apr-2013
[2254x2]
The NONE case, while potentially useful, is only in Max's custom 
version. The current version in REBOL only returns NONE in a few 
edge cases, which I think we all agree is wrong.
And it doesn't satisfy either the string (REJOIN) or path invariant. 
If we care about either of those, it's a problem.
Maxim
2-Apr-2013
[2256]
but what is useful in the rejoin invariant?  we know the path before 
the split...
Andreas
2-Apr-2013
[2257x3]
The plain "path-invariant" (just d/:b) I posted earlier was too simple, 
only the later refined one, including clean-path caters for the corner 
cases I posted as well.
Maxim, the problem with requiring that SPLIT-PATH should _never_ 
return a directory as second component, is that SPLIT-PATH cannot 
decide that based on a file! alone.
It could only do that in relation to a file system or with the simple 
heuristic used by DIR? as well: based on the presence or absence 
of a trailing slash.