
How to apply a Pandas lookup table to a numpy array?

I have a pandas Series like this:

      measure
0    0.3
6    0.6
9    0.2
11   0.3
14   0.0
17   0.1
23   0.9

and a numpy array like this:

array([[ 0,  0,  9, 11],
       [ 6, 14,  6, 17]])

How can I look up each value of the numpy array in the index of the Series to get this:

array([[ 0.3,  0.3,  0.2, 0.3],
       [ 0.6,  0.0,  0.6, 0.1]])

ajwood asked Feb 01 '18 02:02



4 Answers

Via np.vectorize, with series s and array a:

np.vectorize(s.get)(a)
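
A minimal, self-contained sketch of this approach, using the data from the question. The names lookup and result are illustrative, and the NaN default for keys missing from the index is an assumption added here, not part of the answer above. Note that np.vectorize is a convenience wrapper around a Python-level loop, so it is not expected to be fast on large arrays.

```python
import numpy as np
import pandas as pd

# Lookup table: index labels are the keys, values are the measures.
s = pd.Series([0.3, 0.6, 0.2, 0.3, 0.0, 0.1, 0.9],
              index=[0, 6, 9, 11, 14, 17, 23])
a = np.array([[0, 0, 9, 11],
              [6, 14, 6, 17]])

# s.get(key, default) returns the default instead of raising for a missing key.
# otypes=[float] fixes the output dtype up front (assumption: NaN for misses).
lookup = np.vectorize(lambda key: s.get(key, np.nan), otypes=[float])

result = lookup(a)                 # same shape as a
missing = lookup(np.array([5]))    # 5 is not in the index, so this is NaN
```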

jpp answered Oct 24 '22 12:10


Using replace, where df is a DataFrame that holds the lookup values in its measure column:

a=np.array([[ 0,  0,  9, 11],
       [ 6, 14,  6, 17]])
pd.DataFrame(a).replace(df.measure.to_dict()).values
Out[214]: 
array([[0.3, 0.3, 0.2, 0.3],
       [0.6, 0. , 0.6, 0.1]])

BENY answered Oct 24 '22 12:10


An interesting way, using np.bincount (this requires the index labels to be non-negative integers):

np.bincount(s.index.values, s.values)[a]

array([[ 0.3,  0.3,  0.2,  0.3],
       [ 0.6,  0. ,  0.6,  0.1]])

Setup

s = pd.Series(
    [.3, .6, .2, .3, .0, .1, .9],
    [0, 6, 9, 11, 14, 17, 23]
)

a = np.array([
    [0, 0, 9, 11],
    [6, 14, 6, 17]
])
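
To see why this works, here is a rough sketch that builds the intermediate array explicitly. The name table is illustrative. np.bincount with weights sums the weights per integer bin, so it produces a dense array whose position i holds the value for label i. The caveats in the comments follow from that behaviour.

```python
import numpy as np
import pandas as pd

s = pd.Series([0.3, 0.6, 0.2, 0.3, 0.0, 0.1, 0.9],
              index=[0, 6, 9, 11, 14, 17, 23])
a = np.array([[0, 0, 9, 11],
              [6, 14, 6, 17]])

# Dense lookup table of length max(label) + 1.
# Caveats: labels must be non-negative integers; labels absent from the
# index silently map to 0.0; duplicate labels would have their values summed.
table = np.bincount(s.index.to_numpy(), weights=s.to_numpy())

# Plain integer-array indexing then performs the whole lookup at once.
result = table[a]
```

Because the table has one slot per integer up to the largest label, this is best suited to small, dense label ranges.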

piRSquared answered Oct 24 '22 12:10


You can use loc and reshape:

s = pd.Series({0: 0.3, 6: 0.6, 9: 0.2, 11: 0.3, 14: 0.0, 17: 0.1, 23: 0.9})

a = np.array([[ 0,  0,  9, 11],
             [ 6, 14,  6, 17]])

s.loc[a.flatten()].values.reshape(a.shape)
Out[192]: 
array([[ 0.3,  0.3,  0.2,  0.3],
       [ 0.6,  0. ,  0.6,  0.1]])
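
The same idea written out step by step, as a sketch. The names flat, result and tolerant are illustrative. The reindex variant is an assumption added here for the case where some array values may be missing from the index: s.loc raises a KeyError for them, while reindex yields NaN.

```python
import numpy as np
import pandas as pd

s = pd.Series({0: 0.3, 6: 0.6, 9: 0.2, 11: 0.3, 14: 0.0, 17: 0.1, 23: 0.9})
a = np.array([[0, 0, 9, 11],
              [6, 14, 6, 17]])

# 1. Flatten the array so it can be used as a list of labels.
# 2. Select by label (repeated labels are allowed).
# 3. Restore the original shape.
flat = s.loc[a.ravel()]
result = flat.to_numpy().reshape(a.shape)

# Variant that tolerates labels missing from the index (gives NaN for them).
tolerant = s.reindex(a.ravel()).to_numpy().reshape(a.shape)
```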

Or:

pd.DataFrame(a).applymap(lambda x: s.loc[x]).values
Out[200]: 
array([[ 0.3,  0.3,  0.2,  0.3],
       [ 0.6,  0. ,  0.6,  0.1]])
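
In recent pandas releases (2.1 and later), DataFrame.applymap is deprecated in favour of DataFrame.map. A hedged sketch that works with either, assuming only that one of the two methods exists; the name elementwise is illustrative:

```python
import numpy as np
import pandas as pd

s = pd.Series({0: 0.3, 6: 0.6, 9: 0.2, 11: 0.3, 14: 0.0, 17: 0.1, 23: 0.9})
a = np.array([[0, 0, 9, 11],
              [6, 14, 6, 17]])

df = pd.DataFrame(a)

# Prefer DataFrame.map when available, fall back to applymap on older pandas.
elementwise = df.map if hasattr(df, "map") else df.applymap

# Apply the label lookup to every cell, then convert back to a numpy array.
result = elementwise(lambda x: s.loc[x]).to_numpy()
```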

Allen answered Oct 24 '22 11:10