I am working on determining the correlation for a set of data containing boolean values. Ideally, I would replace all booleans with 1s and 0s. How can I most efficiently go through my NumPy array and replace these values? Below is what I have to work with, followed by the output:
import numpy as np
import pandas as pd

def findCorrelation(csvFileName):
    data = pd.read_csv(csvFileName)
    data = data.values
    df = pd.DataFrame(data=data)
    npList = np.asarray(df)
    print(npList)
    print(df.corr())
Output:
[[320 True]
[400 False]
[350 True]
[360 True]
[340 True]
[340 True]
[425 False]
[380 False]
[365 True]]
Empty DataFrame
Columns: []
Index: []
Success
Process finished with exit code 0
The method you're looking for is astype (see the NumPy documentation for ndarray.astype). Casting to int turns True into 1 and False into 0.
Example:
import numpy as np

a = np.asarray([[320, True], [400, False], [350, True],
                [360, True], [340, True], [340, True],
                [425, False], [380, False], [365, True]]).astype(int)
print(a)
Output:
[[320 1]
[400 0]
[350 1]
[360 1]
[340 1]
[340 1]
[425 0]
[380 0]
[365 1]]