
pandas DataFrame: transform int64 columns to boolean

A column in my DataFrame df, df.column, is stored with dtype int64.

The values are all 1s or 0s.

Is there a way to replace these values with boolean values?

user1893148 asked Sep 11 '13 18:09




1 Answer

df['column_name'] = df['column_name'].astype('bool') 

For example:

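import numpy as np
import pandas as pd

# np.random.random_integers is deprecated; randint's upper bound is exclusive
df = pd.DataFrame(np.random.randint(0, 2, size=5), columns=['foo'])
print(df)
# Example output (values are random):
#    foo
# 0    0
# 1    1
# 2    0
# 3    1
# 4    1

df['foo'] = df['foo'].astype('bool')
print(df)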

yields

     foo
0  False
1   True
2  False
3   True
4   True
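One thing to keep in mind: astype('bool') maps every nonzero value to True, not just 1. If there is any doubt that a column really holds only 0s and 1s, a check before casting can catch surprises. The following is a minimal sketch; the sample values and the error message are made up for illustration.

```python
import pandas as pd

# Illustrative data, not from the question
df = pd.DataFrame({"foo": [0, 1, 0, 1, 1]})

# Any nonzero integer becomes True when cast to bool
mixed = pd.Series([0, 1, 2, -1]).astype(bool)
print(mixed.tolist())

# Guard: confirm the column holds only 0s and 1s before casting
if not df["foo"].isin([0, 1]).all():
    raise ValueError("column 'foo' contains values other than 0 and 1")

df["foo"] = df["foo"].astype(bool)
print(df["foo"].dtype)
```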

Given a list of column_names, you could convert multiple columns to bool dtype using:

df[column_names] = df[column_names].astype(bool) 
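As an alternative, DataFrame.astype also accepts a dict mapping column labels to dtypes, which converts the listed columns and leaves the rest untouched. A small sketch, assuming a hypothetical frame with columns a, b and c:

```python
import pandas as pd

# Hypothetical frame: 'a' and 'b' are 0/1 flags, 'c' is ordinary integer data
df = pd.DataFrame({"a": [0, 1, 1], "b": [1, 0, 0], "c": [10, 20, 30]})
column_names = ["a", "b"]

# Build a {column: dtype} mapping and cast only those columns
df = df.astype({name: bool for name in column_names})
print(df.dtypes)
```

Because astype returns a new DataFrame, the result has to be assigned back to df.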

If you don't have a list of column names but wish to convert, say, all numeric columns, then you could use:

column_names = df.select_dtypes(include=[np.number]).columns
df[column_names] = df[column_names].astype(bool)
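Selecting every numeric column also picks up counts, prices and other columns that are not flags. One possible refinement, sketched here with made-up column names, is to keep only the numeric columns whose values are all 0 or 1:

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing a 0/1 flag with other numeric and text columns
df = pd.DataFrame({
    "flag": [0, 1, 1],
    "count": [3, 0, 7],
    "price": [1.5, 0.0, 2.0],
    "name": ["x", "y", "z"],
})

numeric = df.select_dtypes(include=[np.number])

# Keep only the numeric columns that contain nothing but 0 and 1
binary_cols = [c for c in numeric.columns if numeric[c].isin([0, 1]).all()]

df[binary_cols] = df[binary_cols].astype(bool)
print(df.dtypes)
```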
unutbu answered Sep 18 '22 23:09