I have a simple two-column CSV file called st1.csv:
GRID St1
1457 614
1458 657
1459 679
1460 732
1461 754
1462 811
1463 748
However, when I try to read the csv file, the first column is not loaded:
a = pandas.DataFrame.from_csv('st1.csv')
a.columns
outputs:
Index([u'ST1'], dtype=object)
Why is the first column not being read?
Use pandas.read_csv() to read specific columns from a CSV file: call pd.read_csv(file_name, usecols=cols_list), with file_name as the name of the CSV file and cols_list as the list of columns to read. If the file is not comma-separated, also pass the delimiter via sep.
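A minimal sketch of reading only selected columns. The three-column sample and the St2 column are made up for illustration, and an in-memory io.StringIO buffer stands in for a file on disk:

```python
import io

import pandas as pd

# Hypothetical comma-separated sample; a real call would pass a file name.
csv_text = "GRID,St1,St2\n1457,614,10\n1458,657,20\n"

# usecols keeps only the listed columns; St2 is never loaded.
df = pd.read_csv(io.StringIO(csv_text), usecols=["GRID", "St1"])
print(df.columns.tolist())
```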
To read only the first n rows of data from a file, call pd.read_csv(filename, nrows=n).
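A short sketch of nrows, assuming a space-separated sample shaped like the question's data. The io.StringIO buffer is only a stand-in for a real file:

```python
import io

import pandas as pd

# Stand-in for a space-separated file with a header row.
data = "GRID St1\n1457 614\n1458 657\n1459 679\n1460 732\n"

# nrows counts data rows, so the header line is not included in the count.
first_two = pd.read_csv(io.StringIO(data), sep=" ", nrows=2)
print(first_two)
```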
Judging by your data, it looks like the delimiter you're using is a space (' ').
Try the following:
a = pandas.DataFrame.from_csv('st1.csv', sep=' ')
The other issue is that it's assuming your first column is an index, which we can also disable:
a = pandas.DataFrame.from_csv('st1.csv', index_col=None)
UPDATE:
In newer pandas versions, do:
a = pandas.DataFrame.from_csv('st1.csv', index_col=False)
For newer versions of pandas, pd.DataFrame.from_csv doesn't exist anymore, and index_col=None no longer does the trick with pd.read_csv. You'll want to use pd.read_csv with index_col=False instead:
pd.read_csv('st1.csv', index_col=False)
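Applied to the data from the question, this could look roughly like the sketch below. It assumes the file really is space-separated and uses an in-memory io.StringIO buffer in place of st1.csv; the variable names are only for illustration:

```python
import io

import pandas as pd

# Contents of st1.csv as shown in the question, assumed space-separated.
st1_text = """GRID St1
1457 614
1458 657
1459 679
1460 732
1461 754
1462 811
1463 748
"""

# sep=" " splits on the space; index_col=False keeps GRID as a regular column.
a = pd.read_csv(io.StringIO(st1_text), sep=" ", index_col=False)
print(a.columns.tolist())
print(a.index)
```

Both GRID and St1 come back as ordinary columns, and the index is a default integer range.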
Example:
(so) URSA-MattM-MacBook:stackoverflow mmessersmith$ cat input.csv
Date Employee Operation Order
2001-01-01 08:32:17 User1 Approved #00045
2001-01-01 08:36:23 User1 Edited #00045
2001-01-01 08:41:04 User1 Rejected #00046
2001-01-01 08:42:56 User1 Deleted #00046
2001-01-02 09:01:11 User1 Created #00047
2019-10-03 17:23:45 User1 Approved #72681
(so) URSA-MattM-MacBook:stackoverflow mmessersmith$ python
Python 3.7.4 (default, Aug 13 2019, 15:17:50)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'0.25.1'
>>> df_bad_index = pd.read_csv('input.csv', delim_whitespace=True)
>>> df_bad_index
                Date  Employee  Operation   Order
2001-01-01  08:32:17     User1   Approved  #00045
2001-01-01  08:36:23     User1     Edited  #00045
2001-01-01  08:41:04     User1   Rejected  #00046
2001-01-01  08:42:56     User1    Deleted  #00046
2001-01-02  09:01:11     User1    Created  #00047
2019-10-03  17:23:45     User1   Approved  #72681
>>> df_bad_index.index
Index(['2001-01-01', '2001-01-01', '2001-01-01', '2001-01-01', '2001-01-02', '2019-10-03'], dtype='object')
>>> df_still_bad_index = pd.read_csv('input.csv', delim_whitespace=True, index_col=None)
>>> df_still_bad_index
                Date  Employee  Operation   Order
2001-01-01  08:32:17     User1   Approved  #00045
2001-01-01  08:36:23     User1     Edited  #00045
2001-01-01  08:41:04     User1   Rejected  #00046
2001-01-01  08:42:56     User1    Deleted  #00046
2001-01-02  09:01:11     User1    Created  #00047
2019-10-03  17:23:45     User1   Approved  #72681
>>> df_still_bad_index.index
Index(['2001-01-01', '2001-01-01', '2001-01-01', '2001-01-01', '2001-01-02', '2019-10-03'], dtype='object')
>>> df_good_index = pd.read_csv('input.csv', delim_whitespace=True, index_col=False)
>>> df_good_index
         Date  Employee  Operation     Order
0  2001-01-01  08:32:17      User1  Approved
1  2001-01-01  08:36:23      User1    Edited
2  2001-01-01  08:41:04      User1  Rejected
3  2001-01-01  08:42:56      User1   Deleted
4  2001-01-02  09:01:11      User1   Created
5  2019-10-03  17:23:45      User1  Approved
>>> df_good_index.index
RangeIndex(start=0, stop=6, step=1)
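The session above uses delim_whitespace=True, which was deprecated in pandas 2.2 in favour of sep=r"\s+". Below is a rough equivalent for current versions, using the first three rows of the same sample in an io.StringIO buffer as a stand-in for input.csv. As in the session, index_col=False leaves the trailing Order values unread because each row has one more field than the header, and recent pandas versions may emit a ParserWarning about that.

```python
import io

import pandas as pd

# First rows of input.csv from the session above: 4 header names, 5 fields per row.
sample = """Date Employee Operation Order
2001-01-01 08:32:17 User1 Approved #00045
2001-01-01 08:36:23 User1 Edited #00045
2001-01-01 08:41:04 User1 Rejected #00046
"""

# Default behaviour: the extra leading field is taken as the index.
df_default = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# index_col=False: no column is used as the index; a RangeIndex is created instead.
df_fixed = pd.read_csv(io.StringIO(sample), sep=r"\s+", index_col=False)

print(df_default.index)
print(df_fixed.index)
```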