I'm trying to read a .xlsx into a pandas dataframe usingpd.read_excel("C:/..."). 
The problem is I only get one column which include all the data seperated by ",".
|---| "Country","Year","Export" |  
|---|---------------------------|  
| 0 | Canada,2017,3002          |  
| 1 | Bulgaria,2016,3960        |  
| 2 | Germany,2015,3818         |
But this is not the format that I want... I'd like to get three columns like the table below.
|---| "Country"    | "Year"   | "Export"   |  
|---|--------------|----------| -----------|  
|1  | Canada       | 2017     |       3002 |  
|2  | Bulgaria     | 2016     |       3960 |  
|3  | Germany      | 2015     |       3818 |
So i am looking for the sep=',' or delimiter=',' parameter as contained in pd.read_csv. I allready worked through the documentation of pandas.read_excel but have not found a parameter which handle this problem...
Thanks!
One option is to save your .xlsx as a csv file. If you open this in a text editor you should see the annoying column is saved within quotes, but has values separated with commas like:
"Country,Year,Export",...  
"Canada,2017,3002",...
"Bulgaria,2016,3960",...        
"Germany,2015,3818",... 
Then you can read this file with pd.read_csv() and it will create a column named 'Country,Year,Export' which looks like
  Country,Year,Export
0    Canada,2017,3002
1  Bulgaria,2016,3960
2   Germany,2015,3818
You can then split this into separate columns with str.split()
df[['Country', 'Year', 'Export']] = pd.DataFrame(df['Country,Year,Export'].str.split(',').tolist())
  Country,Year,Export   Country  Year Export
0    Canada,2017,3002    Canada  2017   3002
1  Bulgaria,2016,3960  Bulgaria  2016   3960
2   Germany,2015,3818   Germany  2015   3818
                        Your Excel file might contain all its data in a single Excel column as comma-separated strings. read_excel doesn't have a sep= parameter, since it's designed to parse Excel files that don't use user-visible characters to demarcate columns.
You can either fix the Excel file, or fix the malformed DataFrame (named df) like this:
df2 = df.iloc[:, 0].str.split(',', expand=True)
df2.columns = [n.replace('"', '') for n in df.columns.str.split(',')[0]]
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With