numpy convert categorical string arrays to an integer array

I'm trying to convert a string array of categorical variables to an integer array of categorical variables.

Example:

    import numpy as np
    a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])
    print a.dtype
    >>> |S1

    b = np.unique(a)
    print b
    >>> ['a' 'b' 'c']

    c = a.desired_function(b)
    print c, c.dtype
    >>> [1,2,3,1,2,3] int32

I realize this can be done with a loop but I imagine there is an easier way. Thanks.

asked Jul 03 '10 18:07

wroscoe

1 Answer

np.unique has some optional return values.

return_inverse gives the integer encoding, which I use very often:

    >>> b, c = np.unique(a, return_inverse=True)
    >>> b
    array(['a', 'b', 'c'],
          dtype='|S1')
    >>> c
    array([0, 1, 2, 0, 1, 2])
    >>> c+1
    array([1, 2, 3, 1, 2, 3])

It can also be used to recreate the original array from the unique values:

    >>> b[c]
    array(['a', 'b', 'c', 'a', 'b', 'c'],
          dtype='|S1')
    >>> (b[c] == a).all()
    True
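For reference, here is a minimal sketch of the same idea written for Python 3 and wrapped in a small helper. The name `encode_categories` is made up for illustration (it is not a NumPy function), and the cast to `int32` is only an assumption to match the dtype asked for in the question; `np.unique` returns its own default integer type for the inverse.

```python
import numpy as np


def encode_categories(values):
    """Hypothetical helper: map each entry of a 1-D array to an integer code.

    Returns (categories, codes), where categories holds the sorted unique
    values and codes[i] is the 0-based position of values[i] in categories.
    """
    categories, codes = np.unique(values, return_inverse=True)
    # Cast only to match the int32 requested in the question (assumption).
    return categories, codes.astype(np.int32)


a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])
categories, codes = encode_categories(a)

print(categories)         # ['a' 'b' 'c']
print(codes + 1)          # [1 2 3 1 2 3], 1-based as in the question
print(codes.dtype)        # int32
print(categories[codes])  # rebuilds the original array
```

The codes follow the sorted order of the unique values, so the integers reflect sort order rather than order of first appearance.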

answered Sep 21 '22 17:09

Josef
