 

How to convert all columns from numeric to categorical using Python

There are 51 columns in my .csv file, and I need to convert all int64 columns to categorical in one go. How can I do that? Do I need to list every column name in data[]?

 data[].astype('categorical')
Asked Oct 08 '16 03:10 by 1111



1 Answer

You can get the column names into a list, then loop to change the type of each column.

import pandas as pd
import numpy as np

# create example dataframe
cats = ['A', 'B', 'C', 'D', 'E']

int_matrix = np.random.randint(10, size=(7,5))

df = pd.DataFrame(data = int_matrix, columns=cats)

print("Original example data\n")
print(df)
print(df.dtypes)

# get column names of data frame in a list
col_names = list(df)
print("\nNames of dataframe columns")
print(col_names)

# loop to change each column to category type
for col in col_names:
    df[col] = df[col].astype('category')

print("\nExample data changed to category type")
print(df)
print(df.dtypes)

The output of this little program is:

Original example data

   A  B  C  D  E
0  0  4  9  2  9
1  2  5  2  4  1
2  1  1  0  5  7
3  1  2  5  4  0
4  9  2  6  5  3
5  3  3  2  1  7
6  6  0  8  7  3
A    int32
B    int32
C    int32
D    int32
E    int32
dtype: object

Names of dataframe columns
['A', 'B', 'C', 'D', 'E']

Example data changed to category type
   A  B  C  D  E
0  0  4  9  2  9
1  2  5  2  4  1
2  1  1  0  5  7
3  1  2  5  4  0
4  9  2  6  5  3
5  3  3  2  1  7
6  6  0  8  7  3
A    category
B    category
C    category
D    category
E    category
dtype: object
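
The loop above converts every column. Since the question asks only about the int64 columns, a loop-free variant may be closer to what is needed. The following is a minimal sketch, not the answerer's code: the example frame and its column names (`price`, `name`) are made up for illustration, and it assumes the integer columns really are stored as int64.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame: three int64 columns plus a float and a string column
df = pd.DataFrame({
    "A": np.array([0, 2, 1, 1], dtype="int64"),
    "B": np.array([4, 5, 1, 2], dtype="int64"),
    "C": np.array([9, 2, 0, 5], dtype="int64"),
    "price": [1.5, 2.0, 3.25, 4.0],
    "name": ["w", "x", "y", "z"],
})

# Select only the int64 columns, so no column names are typed by hand
int_cols = df.select_dtypes(include="int64").columns

# Build a {column: dtype} mapping and convert those columns in one call;
# columns not in the mapping keep their original dtypes
df = df.astype({col: "category" for col in int_cols})

print(df.dtypes)
```

If every column should become categorical regardless of its current type, `df = df.astype('category')` does the same job as the loop in a single call.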
Answered Oct 20 '22 17:10 by blackeneth