
Pandas - Get dummies for only certain values

I have a pandas Series of 10,000 rows, each holding a single letter from A to Z. However, I want to create dummy columns for only A, B, and C using pandas get_dummies. How do I go about doing that?

I don't want to get dummies for every value in the column and then select the specific columns, because the column contains other redundant data, which eventually causes a MemoryError.

ExtremistEnigma asked Nov 03 '15 16:11


1 Answer

Try this:

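import pandas as pd

# create mock dataframe
df = pd.DataFrame({'alpha': ['a', 'a', 'b', 'b', 'c', 'e', 'f', 'g']})

# use replace with a regex to set every character outside a-c to None;
# dtype=int keeps the 1/0 output on pandas >= 2.0, where the default is bool
pd.get_dummies(df.replace({'[^a-c]': None}, regex=True), dtype=int)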

Output:

   alpha_a  alpha_b  alpha_c
0        1        0        0
1        1        0        0
2        0        1        0
3        0        1        0
4        0        0        1
5        0        0        0
6        0        0        0
7        0        0        0
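
A possible alternative, shown only as a sketch: instead of a regex replace, the column can be cast to a categorical dtype that declares just the wanted letters. Values outside the declared categories become NaN, and get_dummies then emits one column per declared category. The names `s`, `wanted`, `restricted`, and `dummies` are illustrative, and the sample data is made up to mirror the question's uppercase letters.

```python
import pandas as pd

# Mock Series of single uppercase letters (illustrative data, not from the question)
s = pd.Series(list("AABBCEFG"), name="alpha")

# Assumption: only these letters need dummy columns
wanted = ["A", "B", "C"]

# Casting to a categorical with fixed categories turns every other letter into NaN
restricted = s.astype(pd.CategoricalDtype(categories=wanted))

# One column per declared category; rows holding NaN get zeros everywhere
dummies = pd.get_dummies(restricted, prefix="alpha", dtype=int)
print(dummies)
```

Because the set of categories is fixed up front, the result never has more than three columns, which may help with the memory problem described in the question. If memory is still tight, passing sparse=True to get_dummies is another option worth trying.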
JAB answered Oct 27 '22 01:10