
Python ftplib Corrupting Files?

I'm downloading files in Python using ftplib and up until recently everything seemed to be working fine. I am downloading files as such:

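import ftplib
import os
import sys

ftpSession = ftplib.FTP(host, username, password)
ftpSession.cwd('rlmfiles')
# sorted() returns a list, so this also works on Python 3, where filter() returns an iterator
ftpFileList = sorted(x for x in ftpSession.nlst() if 'PEDI' in x)
for f in ftpFileList:
    # Binary mode writes the bytes exactly as they are received
    with open(os.path.join(localDirectory, f), 'wb') as tempFile:
        ftpSession.retrbinary('RETR ' + f, tempFile.write)
ftpSession.quit()
sys.exit(0)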

Up until recently it downloaded the files I needed just fine. Now, however, the downloaded files are corrupted and contain only long strings of garbage ASCII. I know the files on the FTP server are not the problem, because I also have a Perl script that downloads them successfully from the same server.

If it is any additional info, here's what the debugger puts out in the command prompt when downloading a file:

[Screenshot of the debugger output; image not included]

Has anyone encountered any issues with corrupted file contents using retrbinary() in Python's ftplib?

I'm really stuck/frustrated and haven't come across anything related to possible corruption here. Any help is appreciated.

Stephen Tetreault asked Nov 04 '22 00:11


1 Answer

I ran into this issue yesterday while downloading text files. I'm not sure that is what you are doing, but since you say the files contain ASCII garbage, I assume you opened them in a text editor because they were supposed to be text.

If this is the case, the problem is that the file is a text file and you are trying to download it in binary mode.

What you want to do instead is retrieve the file in ASCII transfer mode.

tempFile = open(os.path.join(localDirectory,f),'w')  # Changed 'wb' to 'w'
ftpSession.retrlines('RETR '+f,tempFile.write)       # Changed retrbinary to retrlines

Unfortunately, this strips all the newline characters out of the file, because retrlines() passes each line to the callback with its trailing CRLF removed. Yuck!

So you need to add the stripped newline characters back:

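textLines = []
ftpSession.retrlines('RETR ' + f, textLines.append)  # each line arrives without its line ending
# The with block closes the file, so the buffered text is flushed to disk
with open(os.path.join(localDirectory, f), 'w') as tempFile:
    tempFile.write('\n'.join(textLines))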

This should work, but it doesn't look as nice as it could. A little cleanup gets us:

retrieveCommand = 'RETR '
textLines = []

ftpSession.retrlines(retrieveCommand + currentFile, textLines.append)
# The with block closes the file once the joined text has been written
with open(os.path.join(localDirectory, currentFile), 'w') as temporaryFile:
    temporaryFile.write('\n'.join(textLines))
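
The join-based version collects every line in a list before writing anything. If the files are large, it may be nicer to write each line as it arrives. Here is a minimal sketch of that idea. The function name download_text_file and its parameters are hypothetical and only for illustration. It assumes ftp_session is an already connected ftplib.FTP object.

```python
def download_text_file(ftp_session, remote_name, local_path):
    """Fetch remote_name in ASCII mode and write it to local_path line by line."""
    with open(local_path, "w") as local_file:

        def write_line(line):
            # retrlines() hands over each line without its line ending,
            # so put a newline back before writing.
            local_file.write(line + "\n")

        ftp_session.retrlines("RETR " + remote_name, write_line)
```

With the session from the question, a call could look like download_text_file(ftpSession, f, os.path.join(localDirectory, f)). Note one difference from the join-based version: every line, including the last one, ends with a newline here.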

Hans Goldman answered Nov 15 '22 04:11