I have a requirement where I have to pull the latest files from an FTP folder, the problem is that the filename is having spaces and the filename is having a specific pattern. Below is the code I have implemented:
import sys
from ftplib import FTP
import os
import socket
import time
import pandas as pd
import numpy as np
from glob import glob
import datetime as dt
from __future__ import with_statement
ftp = FTP('')
ftp.login('','')
ftp.cwd('')
ftp.retrlines('LIST')
filematch='*Elig.xlsx'
downloaded = []
for filename in ftp.nlst(filematch):
fhandle=open(filename, 'wb')
print 'Getting ' + filename
ftp.retrbinary('RETR '+ filename, fhandle.write)
fhandle.close()
downloaded.append(filename)
ftp.quit()
I understand that I can append an empty list to ftp.dir() command, but since the filename is having spaces, I am unable to split it in the right way and pick the latest file of the type that I have mentined above.
Any help would be great.
You can get the file mtime by sending the MDTM command iff the FTP server supports it and sort the files on the FTP server accordingly.
def get_newest_files(ftp, limit=None):
"""Retrieves newest files from the FTP connection.
:ftp: The FTP connection to use.
:limit: Abort after yielding this amount of files.
"""
files = []
# Decorate files with mtime.
for filename in ftp.nlst():
response = ftp.sendcmd('MDTM {}'.format(filename))
_, mtime = response.split()
files.append((mtime, filename))
# Sort files by mtime and break after limit is reached.
for index, decorated_filename in enumerate(sorted(files, reverse=True)):
if limit is not None and index >= limit:
break
_, filename = decorated_filename # Undecorate
yield filename
downloaded = []
# Retrieves the newest file from the FTP server.
for filename in get_newest_files(ftp, limit=1):
print 'Getting ' + filename
with open(filename, 'wb') as file:
ftp.retrbinary('RETR '+ filename, file.write)
downloaded.append(filename)
The issue is that the FTP "LIST" command returns text for humans, which format depends on the FTP server implementation.
Using PyFilesystem (in place of the standard ftplib) and its API will provide a "list" API (search "walk") that provide Pythonic structures of the file and directories lists hosted in the FTP server.
http://pyfilesystem2.readthedocs.io/en/latest/index.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With