I'm trying to crawl FTP and pull down all the files recursively.
Up until now I was trying to pull down a directory with
ftp.list.each do |entry|
if entry.split(/\s+/)[0][0, 1] == "d"
out[:dirs] << entry.split.last unless black_dirs.include? entry.split.last
else
out[:files] << entry.split.last unless black_files.include? entry.split.last
end
But turns out, if you split the list up until last space, filenames and directories with spaces are fetched wrong. Need a little help on the logic here.
You can avoid recursion if you list all files at once
files = ftp.nlst('**/*.*')
Directories are not included in the list but the full ftp path is still available in the name.
EDIT
I'm assuming that each file name contains a dot and directory names don't. Thanks for mentioning @Niklas B.
There are a huge variety of FTP servers around.
We have clients who use some obscure proprietary, Windows-based servers and the file listing returned by them look completely different from Linux versions.
So what I ended up doing is for each file/directory entry I try changing directory into it and if this doesn't work - consider it a file :)
The following method is "bullet proof":
# Checks if the give file_name is actually a file.
def is_ftp_file?(ftp, file_name)
ftp.chdir(file_name)
ftp.chdir('..')
false
rescue
true
end
file_names = ftp.nlst.select {|fname| is_ftp_file?(ftp, fname)}
Works like a charm, but please note: if the FTP directory has tons of files in it - this method takes a while to traverse all of them.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With