Rebol and FTP (recursive stuff in particular)
[1/9] from: mat:eurogamer at: 14-Dec-2000 13:42
Hi folks,
I'd like to write a backup script that backs up the files on an FTP
site. This is because my cheap mass hosting has already dumped
everyone's files. :-/
Anyhow, there's no shortage of recursive directory routines lying
around the place. Unfortunately they don't work with web sites. The
reason for that being that Rebol appears to keep trying to open fresh
FTP sessions every time you reference a URL - even if there's a
current session logged into the same site. This seems like stupid
behavior given it doesn't log out any of these threads either.
The solution, I suppose, is to do a low level port-based FTP client
type function that operates intelligently.
I'd really rather not do that!
Does anyone have an idea how to get the high-level FTP stuff in
Rebol not to be so stupid?
--
Mat Bettinson - EuroGamer's Gaming Evangelist with a Goatee
http://www.eurogamer.net | http://www.eurogamer-network.com
[2/9] from: holger:rebol at: 14-Dec-2000 8:23
On Thu, Dec 14, 2000 at 01:42:41PM +0000, Mat Bettinson wrote:
> Hi folks,
> I'd like to write a backup script that backs up the files on an FTP
<<quoted lines omitted: 6>>
> current session logged into the same site. This seems like stupid
> behavior given it doesn't logout any of these threads either.
No, REBOL caches and reuses the FTP control connection across
subsequent accesses to the same host and directory. A new data
connection has to be created for each file being transferred. That
is not a limitation of REBOL, but rather a requirement of the FTP
protocol. The only situation when REBOL creates a new control
connection is when you change directories. That is necessary because
FTP provides no standard way to remotely navigate through the
directory tree (in particular up the tree) without side effects.
--
Holger Kruse
[holger--rebol--com]
[3/9] from: mat:eurogamer at: 14-Dec-2000 16:40
Heya Holger,
HK> No, REBOL caches and reuses the FTP control connection across
HK> subsequent accesses to the same host and directory.
But doesn't actually close any. Is there a way of forcing them closed?
How are you supposed to implement some sort of recursive system with
high-level FTP when Rebol tries to open as many threads as there are
directories? Clearly the concurrent user limit for the site will be
hit, to say nothing of the fact this is not efficient to say the
least.
HK> A new data
HK> connection has to be created for each file being transferred. That
HK> is not a limitation of REBOL, but rather a requirement of the FTP
HK> protocol.
Indeed, but it's the control connection which is limited on servers.
HK> The only situation when REBOL creates a new control
HK> connection is when you change directories. That is necessary because
HK> FTP provides no standard way to remotely navigate through the
HK> directory tree (in particular up the tree) without side effects.
What, other than cd .. ? I don't get it.
Assuming for a second that I buy this argument, and I'm really not
sure I do, then how would you recommend a recursive directory scan be
implemented in Rebol?
--
Mat Bettinson - EuroGamer's Gaming Evangelist with a Goatee
http://www.eurogamer.net | http://www.eurogamer-network.com
[4/9] from: deryk:iitowns at: 15-Dec-2000 0:49
Mat Bettinson wrote:
> Assuming for a second that I buy this argument, and I'm really not
> sure I do, then how would you recommend a recursive directory scan be
> implemented in Rebol?
You being The Fingers dare doubt the knowledge of The Holger? tsk ;)
Deryk
[5/9] from: mat:eurogamer at: 14-Dec-2000 17:21
Heya Deryk,
>> Assuming for a second that I buy this argument, and I'm really not
>> sure I do, then how would you recommend a recursive directory scan be
>> implemented in Rebol?
DR> You being The Fingers dare doubt the knowledge of The Holger? tsk ;)
Hehe, I removed my brackets "(and I'm sure I'm being deeply foolish in
the process)" :)
--
Mat Bettinson - EuroGamer's Gaming Evangelist with a Goatee
http://www.eurogamer.net | http://www.eurogamer-network.com
[6/9] from: holger:rebol at: 14-Dec-2000 9:48
On Thu, Dec 14, 2000 at 04:40:21PM +0000, Mat Bettinson wrote:
> Heya Holger,
>
> HK> No, REBOL caches and reuses the FTP control connection across
> HK> subsequent accesses to the same host and directory.
>
> But doesn't actually close any. Is there a way of forcing them closed?
I'll double-check with Sterling, but AFAIK current experimental versions
do close the control connection first if another connection to the same
host has to be opened. At least that is the behavior I get here. Which
version are you using ?
> HK> The only situation when REBOL creates a new control
> HK> connection is when you change directories. That is necessary because
> HK> FTP provides no standard way to remotely navigate through the
> HK> directory tree (in particular up the tree) without side effects.
>
> What, other than cd .. ? I don't get it.
"cd .." does not work if a component in the path is a softlink. E.g. if
you have just read "/foo/" and next want to read "/bar/" then "cd .."
does not always get you back to "/", because "/foo" might actually point
to "/a/b/c/d/", and then "cd .." would only get you to "/a/b/c/".
"cd ~" works on some servers, but not all, "cd /" works on some servers, but
not all :-). There seems to be no way to get back to the initial login
directory (or to specify absolute path names relative to the login
directory) that works on all FTP servers.
> Assuming for a second that I buy this argument, and I'm really not
> sure I do,
Your choice :).
> then how would you recommend a recursive directory scan be
> implemented in Rebol?
Just do a sequence of reads.
--
Holger Kruse
[holger--rebol--com]
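A sequence of reads, as suggested above, might look like the following
minimal sketch. It assumes REBOL/Core 2.x, where reading an FTP directory
URL returns a block of names and directories carry a trailing slash. The
site URL, the credentials, the %backup/ target and the function name
backup-dir are placeholders made up for illustration; none of them come
from the thread.

```rebol
REBOL [Title: "Recursive FTP backup (sketch)"]

; Hypothetical site and local target: substitute your own.
site: ftp://user:pass@ftp.example.com/
dest: %backup/

backup-dir: func [
    "Copy everything under SRC into DST, descending into subdirectories."
    src [url!]  "FTP directory URL, ending in a slash"
    dst [file!] "Local directory, ending in a slash"
][
    ; Create the local directory (and any missing parents) first.
    if not exists? dst [make-dir/deep dst]

    ; One directory read, then one read per entry in that directory.
    foreach name read src [
        either #"/" = last name [
            ; A trailing slash marks a subdirectory: recurse into it.
            backup-dir join src name join dst name
        ][
            ; Anything else is treated as a plain file and copied as binary.
            write/binary join dst name read/binary join src name
        ]
    ]
]

backup-dir site dest
```

File names that contain spaces or other characters needing URL encoding
may require extra handling; that is not attempted here.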
[7/9] from: sterling:rebol at: 14-Dec-2000 11:13
Yes. FTP will close the old connection to a given host if a new one
is made but in a different directory. That is also the same behavior
I see here with the latest experimental (2.4.39 on Linux). I also
watched my netstat and only one connection stayed active. The only
situation where you could end up with more than one connection to a
single host is if you log in as two different users.
If you really want to make sure that all connections are closed
immediately then you can set the port cache size down to zero:
system/schemes/ftp/cache-size: 0
This will have the effect that the control connection is closed at the
end of the request and no ports at all will ever be cached by REBOL
FTP (not so efficient). If you set it to 1 then it should cache only
one connection. If you are only connecting to a single host as a
single user then you should see no difference in how it's all
working. All sequential reads within a single directory will reuse
the command port but access of a different directory will create a new
command port and close the last one.
If you are seeing different results... like multiple command ports
open to the same host, please send as much info as you can into
feedback as a bug report so we can track down the problem.
Sterling
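To check the caching behavior described above on your own setup,
something like the following could be tried. It is only a sketch: the
host, credentials and directory names are made up, and it assumes a
REBOL 2.x build in which trace/net is available to print the protocol
dialogue.

```rebol
; Cache at most one FTP control connection (0 would disable caching).
system/schemes/ftp/cache-size: 1

; Print the network dialogue so logins and logouts can be seen.
trace/net on

; Two reads in different directories on the same (hypothetical) host.
; If the cache works as described, the second read should close the
; first control connection before opening its own.
read ftp://user:pass@ftp.example.com/pub/
read ftp://user:pass@ftp.example.com/incoming/

trace/net off
```

Watching netstat alongside the trace, as mentioned above, gives an
independent count of the control connections that are actually open.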
[8/9] from: mat:eurogamer at: 14-Dec-2000 23:17
Heya sterling,
src> If you are seeing different results... like multiple command ports
src> open to the same host, please send as much info as you can into
src> feedback as a bug report so we can track down the problem.
If we've established that this is not the proper behavior then
absolutely, I can confirm that it does happen and that I can
demonstrate it.
--
Mat Bettinson - EuroGamer's Gaming Evangelist with a Goatee
http://www.eurogamer.net | http://www.eurogamer-network.com
[9/9] from: mat::eurogamer::net at: 14-Dec-2000 23:14
Heya Holger,
HK> I'll double-check with Sterling, but AFAIK current experimental versions
HK> do close the control connection first if another connection to the same
HK> host has to be opened. At least that is the behavior I get here. Which
HK> version are you using ?
Command, generally.
HK> "cd .." does not work if a component in the path is a softlink. E.g. if
HK> you have just read "/foo/" and next want to read "/bar/" then "cd .."
HK> does not always get you back to "/", because "/foo" might actually point
HK> to "/a/b/c/d/", and then "cd .." would only get you to "/a/b/c/".
Yes, I had a good think about it. I see what you mean. It's a real can
of worms. FTP clients even have work-arounds for different servers and
operating systems.
HK> "cd ~" works on some servers, but not all, "cd /" works on some servers, but
HK> not all :-).
If it supported a couple of the main ones (Serv-U, IIS, ProFTP etc) -
this would be a start.
>> then how would you recommend a recursive directory scan be
>> implemented in Rebol?
HK> Just do a sequence of reads.
Well that's the point, it won't close the threads and you get some
arcane internal error the instant your connection is refused because
of too many users logged on. Rather like that internal error FTP
writes started making for no good reason on Serv-U after a day or so...
--
Mat Bettinson - EuroGamer's Gaming Evangelist with a Goatee
http://www.eurogamer.net | http://www.eurogamer-network.com
Notes
- Quoted lines have been omitted from some messages.