I found a data set named ecoli.data, available at:
https://archive.ics.uci.edu/ml/machine-learning-databases/ecoli/
I would like to open it in R for a classification task, but I would prefer to convert the file to CSV first. When I open it in Word I notice that it is not tab-delimited; there appear to be about three spaces between the values in each row. So the bottom-line question is: how can I convert this file to CSV using Excel or maybe Python?
Rename the file to ecoli.txt
and then open it in Excel. This brings up Microsoft Excel's "Text Import Wizard", which lets you choose options such as "Fixed width". Just click "Next" a few times and then "Finish", and the data will appear in the Excel grid. Now save it again as CSV.
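Using Python 2.7:
import csv

# Read every line of the whitespace-delimited file
with open('ecoli.data.txt') as input_file:
    lines = input_file.readlines()

# Split each line on runs of whitespace to get a list of fields
newLines = []
for line in lines:
    newLine = line.strip().split()
    newLines.append(newLine)

# In Python 2 the csv module expects the output file in binary mode ('wb')
with open('output.csv', 'wb') as test_file:
    file_writer = csv.writer(test_file)
    file_writer.writerows(newLines)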
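For anyone on Python 3, the same idea can be sketched as below. This is only a rough equivalent, not the answer's original code: the function name whitespace_to_csv is made up for illustration, the output file is opened in text mode with newline="" (which is what the Python 3 csv module expects instead of 'wb'), and blank lines are skipped, which the snippet above does not do.

```python
import csv


def whitespace_to_csv(src_path, dst_path):
    """Convert a whitespace-delimited text file to CSV; return the row count.

    Hypothetical helper for illustration only.
    """
    count = 0
    # newline="" lets the csv module control line endings itself (Python 3)
    with open(src_path) as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            fields = line.split()  # splits on any run of spaces or tabs
            if fields:  # skip blank lines
                writer.writerow(fields)
                count += 1
    return count
```

Calling whitespace_to_csv('ecoli.data', 'ecoli.csv') should then write one comma-separated row per non-blank input line, assuming no field contains embedded spaces.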
Rename the file in its folder from
ecoli.data
to
ecoli.csv
Then you can load it in your code with the standard CSV import code, without adding anything, and you never have to come back to it. It worked for me with adult.data!
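Renaming only changes the extension, not the contents, so this approach depends on the file already being comma-separated (which is presumably why it worked with adult.data). A quick way to check before renaming, as a minimal sketch with a made-up function name:

```python
def looks_comma_delimited(path):
    """Return True if the first non-blank line of the file contains a comma.

    Hypothetical helper for illustration; a rough heuristic, not a full
    delimiter detector.
    """
    with open(path) as f:
        for line in f:
            if line.strip():
                return "," in line
    return False  # empty file: nothing to go on
```

If looks_comma_delimited('ecoli.data') comes back False, the values are separated by something else (such as spaces) and the file needs a real conversion rather than a rename.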