I have two lists of files that I'm pulling from an FTP folder using:
sFiles = ftp.nlst(date+'sales.csv')
oFiles = ftp.nlst(date+'orders.csv')
This results in two lists looking something like:
sFiles = ['20170822_sales.csv','20170824_sales.csv','20170825_sales.csv','20170826_sales.csv','20170827_sales.csv','20170828_sales.csv']
oFiles = ['20170822_orders.csv','20170823_orders.csv','20170824_orders.csv','20170825_orders.csv','20170826_orders.csv','20170827_orders.csv']
With my real data-set, something like...
for sales, orders in zip(sorted(sFiles), sorted(oFiles)):
    df = pd.concat(...)
Gets my desired result, but there will be times when something goes wrong and one of the two files does not make it into the proper FTP folder. I'd like some code that creates an iterable from which I can extract the matched sales and orders file names by date.
The following works... I'm not sure what "pythonic" score I'd give it. Poor readability, but it is a comprehension, so I'd imagine there are performance gains?
[(sales, orders) for sales in sFiles for orders in oFiles if re.search(r'\d+',sales).group(0) == re.search(r'\d+',orders).group(0)]
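The comprehension above compares every sales file with every orders file and runs the regex twice per pair. A minimal sketch of one alternative, assuming the date is always the leading run of digits in the file name (as in the sample lists): index the orders files by date in a dict, then look each sales file up once. The helper name `date_key` is hypothetical, not something from the original code.

```python
import re

sFiles = ['20170822_sales.csv', '20170824_sales.csv', '20170825_sales.csv',
          '20170826_sales.csv', '20170827_sales.csv', '20170828_sales.csv']
oFiles = ['20170822_orders.csv', '20170823_orders.csv', '20170824_orders.csv',
          '20170825_orders.csv', '20170826_orders.csv', '20170827_orders.csv']


def date_key(name):
    """Return the leading digits of a file name (hypothetical helper)."""
    return re.match(r'\d+', name).group(0)


# Map each date to its orders file so every lookup is a single dict access.
orders_by_date = {date_key(o): o for o in oFiles}

# Keep only the sales files whose date also has an orders file.
matched = [(s, orders_by_date[date_key(s)])
           for s in sorted(sFiles)
           if date_key(s) in orders_by_date]

for sales, orders in matched:
    print(sales, orders)
```

Whether this is faster in practice depends on how many files there are; for a handful of names the difference is unlikely to matter, and the main gain is readability.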
Taking advantage of the index of the pandas DataFrame:
import pandas as pd
sFiles = ['20170822_sales.csv','20170824_sales.csv','20170825_sales.csv','20170826_sales.csv','20170827_sales.csv','20170828_sales.csv']
oFiles = ['20170822_orders.csv','20170823_orders.csv','20170824_orders.csv','20170825_orders.csv','20170826_orders.csv','20170827_orders.csv']
# Parse the leading YYYYMMDD of each file name into a Timestamp
s_dates = [pd.to_datetime(file[:8], format='%Y%m%d') for file in sFiles]
s_df = pd.DataFrame({'sFiles': sFiles}, index=s_dates)
o_dates = [pd.to_datetime(file[:8], format='%Y%m%d') for file in oFiles]
o_df = pd.DataFrame({'oFiles': oFiles}, index=o_dates)
# An outer join keeps every date and leaves NaN where a file is missing
df = s_df.join(o_df, how='outer')
and so:
>>> print(df)
                        sFiles               oFiles
2017-08-22  20170822_sales.csv  20170822_orders.csv
2017-08-23                 NaN  20170823_orders.csv
2017-08-24  20170824_sales.csv  20170824_orders.csv
2017-08-25  20170825_sales.csv  20170825_orders.csv
2017-08-26  20170826_sales.csv  20170826_orders.csv
2017-08-27  20170827_sales.csv  20170827_orders.csv
2017-08-28  20170828_sales.csv                  NaN
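To get from this table back to an iterable of matched file names, one option is to drop the rows where either side is missing. The following is only a sketch: it rebuilds the same outer-joined `df` so that it runs on its own, and the names `pairs` and `missing` are illustrative.

```python
import pandas as pd

sFiles = ['20170822_sales.csv', '20170824_sales.csv', '20170825_sales.csv',
          '20170826_sales.csv', '20170827_sales.csv', '20170828_sales.csv']
oFiles = ['20170822_orders.csv', '20170823_orders.csv', '20170824_orders.csv',
          '20170825_orders.csv', '20170826_orders.csv', '20170827_orders.csv']

# Index each list of file names by the date parsed from its first 8 characters.
s_df = pd.DataFrame({'sFiles': sFiles},
                    index=pd.to_datetime([f[:8] for f in sFiles], format='%Y%m%d'))
o_df = pd.DataFrame({'oFiles': oFiles},
                    index=pd.to_datetime([f[:8] for f in oFiles], format='%Y%m%d'))
df = s_df.join(o_df, how='outer')

# Rows with both files present become plain (sales, orders) tuples.
pairs = list(df.dropna().itertuples(index=False, name=None))

# Rows with a gap on either side can be logged or retried separately.
missing = df[df.isna().any(axis=1)]

for sales, orders in pairs:
    print(sales, orders)
```

If the unmatched dates are of no interest, joining with `how='inner'` instead of `how='outer'` yields only the complete rows directly.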