I have a very large CSV file with 100 columns. To illustrate my problem, I will use a very basic example.
Let's suppose that we have a CSV file.
in  value  d    f
0   975    f01  5
1   976    F    4
2   977    d4   1
3   978    B6   0
4   979    2C   0
I want to select specific columns.
import pandas
data = pandas.read_csv("ThisFile.csv")
In order to select the first 2 columns I used
data.ix[:,:2]
To select different columns, such as the 2nd and the 4th, what should I do?
Another way to solve this problem would be to rewrite the CSV file, but it is a huge file, so I am avoiding that approach.
Use read_csv() to filter columns from a CSV file. Call pandas.read_csv(filepath_or_buffer, usecols=headers) with filepath_or_buffer as the name of a CSV file and headers as a list of column headers from the file to create a pandas.DataFrame with only those columns.
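A minimal sketch of the usecols approach with column names. It assumes the example data from the question is comma-separated with the header in,value,d,f; the in-memory csv_text string is only a stand-in for the real file so that the snippet runs on its own.

```python
import io

import pandas as pd

# Assumption: stand-in for "ThisFile.csv", comma-separated with a header row.
csv_text = """in,value,d,f
0,975,f01,5
1,976,F,4
2,977,d4,1
3,978,B6,0
4,979,2C,0
"""

# Only the listed headers are parsed; the other columns are skipped while reading.
headers = ["value", "f"]
subset = pd.read_csv(io.StringIO(csv_text), usecols=headers)

print(subset.columns.tolist())
print(subset.shape)
```

With a real file, the io.StringIO(csv_text) argument would simply be replaced by the file path.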
You can use the filter function of the pandas DataFrame to select columns whose names contain a specified string. The like parameter of the .filter function defines this string. Every column whose name contains it is selected, and a DataFrame with those columns is returned.
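As a rough illustration of filter with the like parameter, the sketch below reuses the example data from the question. The comma-separated layout and the search string "val" are assumptions chosen for the demonstration.

```python
import io

import pandas as pd

# Assumption: stand-in for the CSV file from the question.
csv_text = """in,value,d,f
0,975,f01,5
1,976,F,4
2,977,d4,1
3,978,B6,0
4,979,2C,0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Keep every column whose name contains the substring "val".
matching = df.filter(like="val")

print(matching.columns.tolist())
```

Note that filter works on an already loaded DataFrame, so unlike usecols it does not reduce the amount of data read from disk.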
The most basic way to select a single column from a DataFrame is to put the column's string name in brackets; this returns a pandas Series. Passing a list in the brackets lets you select multiple columns at the same time and returns a DataFrame.
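The difference between the two bracket forms can be sketched as follows. The data is the example from the question, assumed here to be comma-separated.

```python
import io

import pandas as pd

# Assumption: stand-in for the CSV file from the question.
csv_text = """in,value,d,f
0,975,f01,5
1,976,F,4
2,977,d4,1
3,978,B6,0
4,979,2C,0
"""
df = pd.read_csv(io.StringIO(csv_text))

single = df["value"]           # a string in brackets gives a Series
multiple = df[["value", "f"]]  # a list in brackets gives a DataFrame

print(type(single).__name__)
print(type(multiple).__name__)
```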
This selects the second and fourth columns (since Python uses 0-based indexing):
In [272]: df.iloc[:, [1, 3]]
Out[272]:
   value  f
0    975  5
1    976  4
2    977  1
3    978  0
4    979  0

[5 rows x 2 columns]
df.ix could select by location or label, but it was deprecated in pandas 0.20 and removed in pandas 1.0. df.iloc always selects by location. When indexing by location, use df.iloc to signal your intention more explicitly. It is also a bit faster, since pandas does not have to check whether your index is using labels.
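A short sketch contrasting position-based and label-based selection on the example data (assumed comma-separated). It uses df.iloc for positions and df.loc for labels, which are the current replacements for the older df.ix.

```python
import io

import pandas as pd

# Assumption: stand-in for the CSV file from the question.
csv_text = """in,value,d,f
0,975,f01,5
1,976,F,4
2,977,d4,1
3,978,B6,0
4,979,2C,0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Select by integer position: second and fourth columns (0-based 1 and 3).
by_position = df.iloc[:, [1, 3]]

# Select the same columns by label.
by_label = df.loc[:, ["value", "f"]]

# Positions can be translated into labels through df.columns.
labels_from_positions = df.columns[[1, 3]].tolist()

print(labels_from_positions)
print(by_position.equals(by_label))
```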
Another possibility is to use the usecols parameter:
data = pandas.read_csv("ThisFile.csv", usecols=[1,3])
This will load only the second and fourth columns into the data DataFrame.
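For a file with 100 columns it can help to look at the header before deciding which positions to load. The sketch below is one possible way to do that, not part of the original answer: it reads zero data rows with nrows=0 to get the column names, then passes the chosen names to usecols. The in-memory csv_text is an assumed stand-in for the real file.

```python
import io

import pandas as pd

# Assumption: stand-in for "ThisFile.csv", comma-separated with a header row.
csv_text = """in,value,d,f
0,975,f01,5
1,976,F,4
2,977,d4,1
3,978,B6,0
4,979,2C,0
"""

# Read only the header line; no data rows are parsed.
header = pd.read_csv(io.StringIO(csv_text), nrows=0).columns

# Pick the second and fourth column names by position.
wanted = [header[1], header[3]]

# Load just those columns from the full file.
data = pd.read_csv(io.StringIO(csv_text), usecols=wanted)

print(wanted)
print(data.shape)
```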
If you would rather select columns by name, you can use
data[['value','f']]
   value  f
0    975  5
1    976  4
2    977  1
3    978  0
4    979  0