
Python pandas: Why does df.iloc[:, :-1].values for my training data select till only the second last column?

Tags:

python

pandas

Very simply put:

For the same training data frame df, when I use X = df.iloc[:, :-1].values, it selects only up to the second last column of the data frame instead of the last column (which is what I want, but it's strange behavior I've never seen before). I know this because the second last column's value and the last column's value for that row are different.

However, using y = df.iloc[:, -1].values gives me the row vector of the last column's values which is exactly what I want.

Why is the negative 1 for X giving me the second last column's value instead?


kwotsin asked May 29 '16 16:05



1 Answer

I think you have only two columns in df, so everything except the last column is just the second last one. With more columns, iloc[:, :-1] selects all columns except the last:

import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print (df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3

print(df.iloc[:, :-1])
   A  B  C  D  E
0  1  4  7  1  5
1  2  5  8  3  3
2  3  6  9  5  6

X = df.iloc[:, :-1].values
print (X)
[[1 4 7 1 5]
 [2 5 8 3 3]
 [3 6 9 5 6]]

print (X.shape)
(3, 5)
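
To show the two-column case I'm guessing at, here is a minimal sketch. The frame df2 and its column names feature and label are made up for illustration and are not taken from your data. It relies on iloc following Python's slicing rules: the stop of a slice is exclusive, while a bare integer picks exactly that position.

```python
import pandas as pd

# Hypothetical two-column frame standing in for the asker's training data
df2 = pd.DataFrame({'feature': [1, 2, 3],
                    'label': [0, 1, 0]})

# The slice stop is exclusive, so :-1 keeps every column except the last.
# With two columns, that leaves only the first (i.e. second last) column.
X = df2.iloc[:, :-1].values

# A bare integer selects exactly that column, so -1 is the last column.
y = df2.iloc[:, -1].values

print(X.shape)  # (3, 1)
print(y.shape)  # (3,)
```

So X is a 2-D array holding all feature columns and y is a 1-D array of the last column, which is the usual split for training data.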
jezrael answered Sep 28 '22 06:09