I have a pandas DataFrame (df) like this:
                          Close      Close      Close  Close        Close
Date
2000-01-03 00:00:00         NaN        NaN        NaN    NaN    -0.033944
2000-01-04 00:00:00         NaN        NaN        NaN    NaN    0.0351366
2000-01-05 00:00:00   -0.033944        NaN        NaN    NaN   -0.0172414
2000-01-06 00:00:00   0.0351366  -0.033944        NaN    NaN  -0.00438596
2000-01-07 00:00:00  -0.0172414  0.0351366  -0.033944    NaN    0.0396476
In R, if I want to select the fifth column:
five=df[,5]
and everything except the fifth column:
rest=df[,-5]
How can I do similar operations with a pandas DataFrame?
I tried this in pandas:
five=df.ix[,5]
but it gives this error:
  File "", line 1
    df.ix[,5]
          ^
SyntaxError: invalid syntax
To select a particular set of rows and columns, use .loc. The same indexer selects a single value when given one row label and one column label, and a full slice (:) in the row position selects a whole column.
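A minimal sketch of those three .loc selections. The frame, its column names a to e, and the dates are made up for illustration and are not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with unique column names and a date index.
df = pd.DataFrame(
    np.arange(15.0).reshape(3, 5),
    index=pd.date_range("2000-01-03", periods=3),
    columns=list("abcde"),
)

# Rows and columns by label; .loc slices include both endpoints.
block = df.loc["2000-01-03":"2000-01-04", ["a", "e"]]

# A single value: one row label and one column label.
value = df.loc[pd.Timestamp("2000-01-05"), "e"]

# A whole column: the bare colon means "all rows".
col_e = df.loc[:, "e"]
```

Note that .loc works on labels, so it only helps with the original question if the column names are unique.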
With the pandas DataFrame.dropna() method you can drop columns that contain NaN (Not a Number) or None values by passing axis=1; without it, dropna() drops rows. Note that by default it returns a copy of the DataFrame with those columns removed and leaves the original unchanged.
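A short sketch of dropping columns with dropna, assuming axis=1 is passed (the default axis drops rows instead). The frame and its column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "a" is all NaN, "b" has one NaN, "c" has none.
df = pd.DataFrame({
    "a": [np.nan, np.nan, np.nan],
    "b": [np.nan, 1.0, 2.0],
    "c": [3.0, 4.0, 5.0],
})

# axis=1 targets columns; the default how="any" drops a column
# as soon as it contains a single NaN.
no_nan_cols = df.dropna(axis=1)

# how="all" drops only the columns that are entirely NaN.
not_all_nan = df.dropna(axis=1, how="all")
```

Both calls return new frames, so df itself still has all three columns afterwards.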
Selecting columns based on their name: this is the most basic way to select a single column from a DataFrame. Just put the string name of the column in brackets. When the name is unique, this returns a pandas Series. Passing a list in the brackets lets you select multiple columns at the same time.
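As a rough illustration of name-based selection, with made-up column names. The last part mimics the question's frame, where every column is called "Close", to show why names are not enough there.

```python
import pandas as pd

# Hypothetical frame with unique column names.
df = pd.DataFrame({"open": [1.0, 2.0], "close": [1.5, 2.5], "volume": [10, 20]})

one = df["close"]                # a single name gives a Series
several = df[["open", "close"]]  # a list of names gives a DataFrame

# With duplicated names a single name matches every such column,
# so the result is a DataFrame rather than a Series.
dup = pd.DataFrame([[1, 2], [3, 4]], columns=["Close", "Close"])
both = dup["Close"]
```

This is why a position-based approach is the safer choice for the frame in the question.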
Use iloc. It is explicitly a position-based indexer. ix can be both label-based and position-based and will get confused if an index is integer based.
df.iloc[:, [4]]
For all but the fifth column
slc = list(range(df.shape[1]))
slc.remove(4)
df.iloc[:, slc]
or equivalently
df.iloc[:, [i for i in range(df.shape[1]) if i != 4]]
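Here is a self-contained sketch of both R operations on a stand-in for the question's frame. The values are made up; only the shape and the repeated "Close" column name follow the question. The boolean mask is just another way to write the same exclusion, not something taken from the answers here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in: five columns, all named "Close".
dates = pd.date_range("2000-01-03", periods=5, name="Date")
df = pd.DataFrame(
    np.arange(25.0).reshape(5, 5),
    index=dates,
    columns=["Close"] * 5,
)

# Like R's df[,5]: the fifth column is position 4 (0-based).
five = df.iloc[:, 4]

# Like R's df[,-5]: a boolean mask over column positions,
# True for every position except 4.
keep = np.arange(df.shape[1]) != 4
rest = df.iloc[:, keep]
```

Because the column labels repeat, a label-based removal such as df.drop(columns="Close") would match all five columns, which is why the sketch sticks to positions.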
If you want the fifth column (on pandas 1.0 and later, where .ix has been removed, use .iloc):
df.iloc[:, 4]
Stick the colon in there to take all the rows for that column.
To exclude the fifth column you could try:
df.iloc[:, [x for x in range(len(df.columns)) if x != 4]]
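One detail worth seeing in code is what comes back from each form. This is a sketch with .iloc on a made-up frame, assuming a current pandas where .ix is no longer available; the positions are passed as a list.

```python
import pandas as pd

# Hypothetical two-row frame with five columns.
df = pd.DataFrame([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], columns=list("abcde"))

# A scalar position returns a Series.
as_series = df.iloc[:, 4]

# A one-element list returns a one-column DataFrame.
as_frame = df.iloc[:, [4]]

# Every position except 4, built as a list.
others = [x for x in range(len(df.columns)) if x != 4]
rest = df.iloc[:, others]
```

Pick the scalar form if you want something like an R vector, and the list form if the result should stay two-dimensional.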