I am using the following code to convert .xlsx files into .csv files.
import pandas as pd
data_xls = pd.read_excel('excelfile.xlsx', 'Sheet2', index_col=None)
data_xls.to_csv('csvfile.csv', encoding='utf-8')
The code works, but the output has an index column with the row numbers, which I do not want. Is there any way to leave out or remove that index column?
File output
Unnamed  Data
0        0.99319613
1        0.99319613
2        0.99319613
3        0.99319613
4        0.99319613
5        0.99319613
To write a pandas DataFrame to CSV without the index, pass index=False to the to_csv() method. This tells pandas to leave out the row labels when exporting the DataFrame to a CSV file.
A DataFrame always has an index, so reset_index() does not remove it. It replaces the current index with the default integer index running from 0 to n-1, where n is the number of rows in the DataFrame. By default the old index is kept as a new column; pass drop=True to discard it. To keep the index out of the CSV file itself, index=False in to_csv() is still what you need.
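To illustrate what reset_index() does and does not do, here is a minimal sketch. The column name "Data" comes from the question; the values and the variable names (filtered, renumbered, kept) are made up for the example.

```python
import pandas as pd

# Illustrative data; the values are invented for this sketch.
df = pd.DataFrame({"Data": [10, 20, 30, 40]})

# Filtering keeps the original row labels, so the index is now 1, 2, 3.
filtered = df[df["Data"] > 15]

# drop=True discards the old labels and renumbers the rows 0..n-1.
renumbered = filtered.reset_index(drop=True)

# Without drop=True the old labels are kept as a new column named "index".
kept = filtered.reset_index()

print(list(filtered.index))
print(list(renumbered.index))
print(list(kept.columns))

# The index is still written to CSV unless index=False is passed.
print(renumbered.to_csv(index=False))
```

In other words, reset_index() is useful for tidying the row labels, but it is the index=False argument that keeps them out of the file.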
As noted in the docs for pandas.DataFrame.to_csv(), simply pass index=False as a keyword argument to exclude row names.
data_xls.to_csv('csvfile.csv', encoding='utf-8', index=False)
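A small sketch of the difference, using an in-memory DataFrame instead of the Excel file so it runs on its own. The DataFrame contents mirror the question's output; the variable names are just for illustration. It also shows where the "Unnamed" header likely comes from: the index is written under an empty header, which pandas labels "Unnamed: 0" when the file is read back.

```python
import io

import pandas as pd

# Stand-in for the sheet from the question (values copied from its output).
df = pd.DataFrame({"Data": [0.99319613, 0.99319613, 0.99319613]})

# With no path given, to_csv() returns the CSV text as a string.
with_index = df.to_csv()                # default: index=True
without_index = df.to_csv(index=False)  # row labels are left out

print(with_index)
print(without_index)

# Reading the indexed CSV back turns the empty header into "Unnamed: 0".
round_trip = pd.read_csv(io.StringIO(with_index))
print(list(round_trip.columns))
```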
Building on miradulo's answer, this version also fixes a number conversion problem:
import pandas as pd
data_xls = pd.read_excel('excelfile.xlsx', 'Sheet2', dtype=str, index_col=None)
data_xls.to_csv('csvfile.csv', encoding='utf-8', index=False)
You can drop the 'Sheet2' argument if the workbook has only one sheet, since read_excel reads the first sheet by default. dtype=str reads every column as text, which avoids number conversion.
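The answer does not say which conversion problem it hit, so the following is only a sketch of the general idea. It assumes the issue is values such as zero-padded IDs being parsed as numbers, and it uses read_csv on an in-memory string as a stand-in for read_excel so that it runs without an Excel file; both functions accept a dtype parameter. The sample data and variable names are invented.

```python
import io

import pandas as pd

# Invented sample data with zero-padded IDs and a trailing-zero decimal.
raw = "id,value\n00123,1.50\n00456,2.00\n"

# Default type inference: "00123" becomes the integer 123, "1.50" becomes 1.5.
inferred = pd.read_csv(io.StringIO(raw))

# dtype=str keeps every cell exactly as the text that was read.
as_text = pd.read_csv(io.StringIO(raw), dtype=str)

print(inferred.to_csv(index=False))
print(as_text.to_csv(index=False))
```

The trade-off is that every column is then a string, so any arithmetic on the data needs an explicit conversion first.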