Use pandas DataFrame. astype(int) and DataFrame. apply() methods to convert a column to int (float/string to integer/int64/int32 dtype) data type.
You can use the loc and iloc functions to access columns in a Pandas DataFrame. Let's see how. If we wanted to access a certain column in our DataFrame, for example the Grades column, we could simply use the loc function and specify the name of the column in order to retrieve it.
As @Jeff mentions there are a few ways to do this, but I recommend using loc/iloc to be more explicit (and raise errors early if you're trying something ambiguous):
In [10]: df = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'])
In [11]: df
Out[11]:
A B
0 1 2
1 3 4
In [12]: df[['A']]
In [13]: df[[0]]
In [14]: df.loc[:, ['A']]
In [15]: df.iloc[:, [0]]
Out[12-15]: # they all return the same thing:
A
0 1
1 3
The latter two choices remove ambiguity in the case of integer column names (precisely why loc/iloc were created). For example:
In [16]: df = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 0])
In [17]: df
Out[17]:
A 0
0 1 2
1 3 4
In [18]: df[[0]] # ambiguous
Out[18]:
A
0 1
1 3
As Andy Hayden recommends, utilizing .iloc/.loc to index out (single-columned) dataframe is the way to go; another point to note is how to express the index positions. Use a listed Index labels/positions whilst specifying the argument values to index out as Dataframe; failure to do so will return a 'pandas.core.series.Series'
Input:
A_1 = train_data.loc[:,'Fraudster']
print('A_1 is of type', type(A_1))
A_2 = train_data.loc[:, ['Fraudster']]
print('A_2 is of type', type(A_2))
A_3 = train_data.iloc[:,12]
print('A_3 is of type', type(A_3))
A_4 = train_data.iloc[:,[12]]
print('A_4 is of type', type(A_4))
Output:
A_1 is of type <class 'pandas.core.series.Series'>
A_2 is of type <class 'pandas.core.frame.DataFrame'>
A_3 is of type <class 'pandas.core.series.Series'>
A_4 is of type <class 'pandas.core.frame.DataFrame'>
These three approaches have been mentioned:
pd.DataFrame(df.loc[:, 'A']) # Approach of the original post
df.loc[:,[['A']] # Approach 2 (note: use iloc for positional indexing)
df[['A']] # Approach 3
pd.Series.to_frame() is another approach.
Because it is a method, it can be used in situations where the second and third approaches above do not apply. In particular, it is useful when applying some method to a column in your dataframe and you want to convert the output into a dataframe instead of a series. For instance, in a Jupyter Notebook a series will not have pretty output, but a dataframe will.
# Basic use case:
df['A'].to_frame()
# Use case 2 (this will give you pretty output in a Jupyter Notebook):
df['A'].describe().to_frame()
# Use case 3:
df['A'].str.strip().to_frame()
# Use case 4:
def some_function(num):
...
df['A'].apply(some_function).to_frame()
You can use df.iloc[:, 0:1]
, in this case the resulting vector will be a DataFrame
and not series.
As you can see:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With