Apologies if this has been asked before, but I looked extensively without results.
import pandas as pd
import numpy as np
df = pd.DataFrame(data = np.random.randint(1,10,10),columns=['a'])
a
0 7
1 8
2 8
3 3
4 1
5 1
6 2
7 8
8 6
9 6
I'd like to create a new column b that maps several values of a according to some rule, say a in [1,2,3] maps to 1, a in [4,5,6,7] maps to 2, and a in [8,9,10] maps to 3. A one-to-one mapping is clear to me, but what if I want to map by a list of values or a range?
I thought along these lines...
df['b'] = df['a'].map({[1,2,3]:1,range(4,7):2,[8,9,10]:3})
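Series.map() replaces each value of a Series by looking it up in a dict, in another Series (matched against that Series' index), or by calling a function. With a dict, the keys must be hashable and each value is looked up as a single key, so a list cannot serve as a key: the dict literal above raises TypeError (unhashable type: 'list') before map() is even called.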
There are a few alternatives.
pd.cut / NumPy via np.digitize
You can construct a list of boundaries, then use specialist library functions. This is described in @EdChum's solution, and also in this answer.
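As a rough sketch of the boundary approach, the snippet below bins the values with both pd.cut and np.digitize. The fixed sample values, the edge list [0, 3, 7, 10], and the column names b_cut and b_digitize are assumptions chosen for illustration, not taken from the linked answers.

```python
import numpy as np
import pandas as pd

# Fixed sample data so the result is reproducible (the question uses random values).
df = pd.DataFrame({'a': [7, 8, 8, 3, 1, 1, 2, 8, 6, 6]})

# Bin edges: with right-inclusive bins these give (0, 3], (3, 7], (7, 10].
edges = [0, 3, 7, 10]

# pd.cut returns a categorical column; labels names each bin,
# and astype(int) turns the labels back into plain integers.
df['b_cut'] = pd.cut(df['a'], bins=edges, labels=[1, 2, 3]).astype(int)

# np.digitize returns the index of the bin each value falls into;
# right=True makes the upper edge inclusive, matching pd.cut's default.
df['b_digitize'] = np.digitize(df['a'].to_numpy(), bins=edges, right=True)

print(df)
```

Both columns should agree here. Values outside the outermost edges would become NaN with pd.cut (so astype(int) would fail), while np.digitize would return 0 or len(edges), so the edges need to cover the full range of a.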
np.select
df = pd.DataFrame(data=np.random.randint(1,10,10), columns=['a'])
criteria = [df['a'].between(1, 3), df['a'].between(4, 7), df['a'].between(8, 10)]
values = [1, 2, 3]
df['b'] = np.select(criteria, values, 0)
The elements of criteria are Boolean series, so for lists of values you can use df['a'].isin([1, 3]), etc.
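For explicit lists of values rather than ranges, the same np.select pattern could look roughly like this. The fixed sample data is an assumption made so the result is reproducible.

```python
import numpy as np
import pandas as pd

# Fixed sample data (an assumption; the question uses random values).
df = pd.DataFrame({'a': [7, 8, 8, 3, 1, 1, 2, 8, 6, 6]})

# One Boolean Series per group of values.
criteria = [
    df['a'].isin([1, 2, 3]),
    df['a'].isin([4, 5, 6, 7]),
    df['a'].isin([8, 9, 10]),
]
values = [1, 2, 3]

# Rows matching no criterion get the default, 0.
df['b'] = np.select(criteria, values, default=0)

print(df)
```

If a value appeared in more than one list, np.select would use the first matching criterion.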
range
d = {range(1, 4): 1, range(4, 8): 2, range(8, 11): 3}
df['c'] = df['a'].apply(lambda x: next((v for k, v in d.items() if x in k), 0))
print(df)
a b c
0 1 1 1
1 7 2 2
2 5 2 2
3 1 1 1
4 3 1 1
5 5 2 2
6 4 2 2
7 4 2 2
8 9 3 3
9 3 1 1
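If the groups are small, another option is to expand the range keys into an ordinary one-to-one dict and then use map() directly. This is only a sketch: the names groups and flat, the column d, and the fixed sample data are made up for illustration.

```python
import pandas as pd

# Fixed sample data (an assumption; the question uses random values).
df = pd.DataFrame({'a': [7, 8, 8, 3, 1, 1, 2, 8, 6, 6]})

# Same grouping as the range-keyed dict above.
groups = {range(1, 4): 1, range(4, 8): 2, range(8, 11): 3}

# Expand every range into individual keys: {1: 1, 2: 1, 3: 1, 4: 2, ...}.
flat = {value: label for rng, label in groups.items() for value in rng}

# An ordinary one-to-one lookup; values missing from flat would become NaN.
df['d'] = df['a'].map(flat)

print(df)
```

This avoids calling a Python lambda per row, at the cost of building a dict with one entry per possible value.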