Good morning chaps,
Is there a pythonic way to explode a dataframe column into multiple columns of boolean flags, based on some condition (str.contains in this case)?
Let's say I have this:
Position Letter
1 a
2 b
3 c
4 b
5 b
And I'd like to achieve this:
Position Letter is_a is_b is_c
1 a TRUE FALSE FALSE
2 b FALSE TRUE FALSE
3 c FALSE FALSE TRUE
4 b FALSE TRUE FALSE
5 b FALSE TRUE FALSE
I can do this with a loop through 'abc', explicitly creating the new df columns, but I'm wondering whether a built-in method already exists in pandas. The number of possible values, and hence the number of new columns, is variable.
Thanks and regards.
In pandas, the apply() method can also be used to split the values of one column into multiple columns. DataFrame.apply() calls a function on each column or each row, and Series.apply() calls it on each value of a single column. Inside that function, we can split the string value into multiple values.
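As a rough illustration of the apply() approach described above, here is a minimal sketch. The `name` column, its values, and the `split_name` helper are made up for the example and are not part of the question's data.

```python
import pandas as pd

# Hypothetical data: one string column that holds two pieces of information.
people = pd.DataFrame({"name": ["Ada Lovelace", "Alan Turing"]})

def split_name(value):
    # Split on the first space only and return the parts as a labelled Series.
    first, last = value.split(" ", 1)
    return pd.Series({"first": first, "last": last})

# Series.apply with a function that returns a Series yields a DataFrame,
# one new column per label, which can then be joined back on the index.
parts = people["name"].apply(split_name)
result = people.join(parts)
print(result)
```

For plain delimiter splitting, `Series.str.split(..., expand=True)` is usually simpler; apply() is mainly useful when the splitting logic is more involved.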
You can replace values in all or selected columns of a pandas DataFrame based on a condition by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used both to read and to modify the values of a DataFrame.
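A small sketch of conditional replacement with loc[], using the sample frame from the question. Upper-casing the letter 'b' is an arbitrary choice made only for illustration.

```python
import pandas as pd

# The sample frame from the question.
df = pd.DataFrame({"Position": [1, 2, 3, 4, 5],
                   "Letter": ["a", "b", "c", "b", "b"]})

# Work on a copy so the original frame stays unchanged.
replaced = df.copy()

# The boolean mask selects the rows; the column label selects where to write.
replaced.loc[replaced["Letter"] == "b", "Letter"] = "B"
print(replaced)
```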
Pandas DataFrame: explode() function. The explode() function transforms each element of a list-like into a row, replicating the index values. It returns the exploded lists as rows of the subset columns; the index is duplicated for these rows. Raises: ValueError if the columns of the frame are not unique.
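To make the behaviour concrete, here is a minimal sketch of explode() on a made-up frame with a list-valued `Letters` column (not the question's data). Note that it produces extra rows rather than extra columns, so it addresses a different problem from the one asked about.

```python
import pandas as pd

# Hypothetical frame where one column holds lists.
df_lists = pd.DataFrame({"Position": [1, 2],
                         "Letters": [["a", "b"], ["c"]]})

# Each list element becomes its own row; the original index label is repeated.
exploded = df_lists.explode("Letters")
print(exploded)
```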
Use Series.str.get_dummies():
In [31]: df.join(df.Letter.str.get_dummies())
Out[31]:
Position Letter a b c
0 1 a 1 0 0
1 2 b 0 1 0
2 3 c 0 0 1
3 4 b 0 1 0
4 5 b 0 1 0
or
In [32]: df.join(df.Letter.str.get_dummies().astype(bool))
Out[32]:
Position Letter a b c
0 1 a True False False
1 2 b False True False
2 3 c False False True
3 4 b False True False
4 5 b False True False
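If the is_a / is_b / is_c column names from the question are wanted, one possible extension of the answer above is to add a prefix to the dummy columns. This is a sketch rather than the only way to do it; the `flags` and `alt` names are just for illustration.

```python
import pandas as pd

# The sample frame from the question.
df = pd.DataFrame({"Position": [1, 2, 3, 4, 5],
                   "Letter": ["a", "b", "c", "b", "b"]})

# One indicator column per distinct value, cast to bool, renamed to is_<value>.
flags = df["Letter"].str.get_dummies().astype(bool).add_prefix("is_")
result = df.join(flags)
print(result)

# The top-level pd.get_dummies can produce the same columns; its default
# dtype differs between pandas versions, so the cast to bool is kept explicit.
alt = pd.get_dummies(df["Letter"], prefix="is").astype(bool)
```

Because the columns are derived from the values actually present in Letter, the number of new columns follows the data and no explicit loop over the possible letters is needed.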