
pandas convert_to_r_dataframe does not work with numpy.bool_

I have a pandas data frame that I would like to convert to an R data frame for use via rpy2. The columns of the pandas data frame are booleans, specifically numpy.bool_. I get a KeyError when I call convert_to_r_dataframe. I am using pandas 0.13.1.

Am I doing something I should not be doing? Should I not be using numpy booleans?

Here is an example:

import pandas
import pandas.rpy.common as common
import numpy as np

# This works fine.
test_df_float = pandas.DataFrame(np.random.rand(10, 3), columns=list('xyz'))
r_test_df_float = common.convert_to_r_dataframe(test_df_float)

# This is a problem.
test_df_bool = pandas.DataFrame(np.random.rand(10, 3) > 0.5, columns=list('xyz'))
r_test_df_bool = common.convert_to_r_dataframe(test_df_bool)

KeyError                                  Traceback (most recent call last)
<ipython-input-11-323084399e95> in <module>()
----> 1 r_test_df_bool = common.convert_to_r_dataframe(test_df_bool)

/usr/lib/python2.7/site-packages/pandas/rpy/common.pyc in convert_to_r_dataframe(df, strings_as_factors)
311                      for item in value]
312 
--> 313             value = VECTOR_TYPES[value_type](value)
314 
315             if not strings_as_factors:

KeyError: <type 'numpy.bool_'>
mjandrews asked Dec 14 '25 01:12


1 Answer

I think this may be a bug. What used to be np.bool is now called np.bool_, and that key is missing from two dictionaries in the source file. Modifying the source (line 261 in .../site-packages/pandas/rpy/common.py) as follows should do the trick:

VECTOR_TYPES = {np.float64: robj.FloatVector,
                np.float32: robj.FloatVector,
                np.float: robj.FloatVector,
                np.int: robj.IntVector,
                np.int32: robj.IntVector,
                np.int64: robj.IntVector,
                np.object_: robj.StrVector,
                np.str: robj.StrVector,
                np.bool: robj.BoolVector,
                np.bool_: robj.BoolVector}  # new key: numpy.bool_ was missing

NA_TYPES = {np.float64: robj.NA_Real,
            np.float32: robj.NA_Real,
            np.float: robj.NA_Real,
            np.int: robj.NA_Integer,
            np.int32: robj.NA_Integer,
            np.int64: robj.NA_Integer,
            np.object_: robj.NA_Character,
            np.str: robj.NA_Character,
            np.bool: robj.NA_Logical,
            np.bool_: robj.NA_Logical}  # new key: numpy.bool_ was missing

Basically, you just need to add the last key (np.bool_) to both dictionaries.
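If you would rather not edit the installed file, the same two keys could probably be added at runtime before calling convert_to_r_dataframe. This is only a sketch: add_missing_type_key is a hypothetical helper name, and it assumes VECTOR_TYPES and NA_TYPES are ordinary module-level dictionaries in pandas.rpy.common (as the listing above suggests) and that robj refers to rpy2.robjects.

```python
def add_missing_type_key(vector_types, na_types, new_type, vector_cls, na_value):
    """Register new_type in both lookup tables.

    setdefault leaves any existing entry untouched, so calling this
    more than once (or on an already-fixed install) is harmless.
    """
    vector_types.setdefault(new_type, vector_cls)
    na_types.setdefault(new_type, na_value)
    return vector_types, na_types


# Assumed usage against pandas 0.13.1 (not verified here):
#   import numpy as np
#   import rpy2.robjects as robj
#   import pandas.rpy.common as common
#   add_missing_type_key(common.VECTOR_TYPES, common.NA_TYPES,
#                        np.bool_, robj.BoolVector, robj.NA_Logical)
#   r_test_df_bool = common.convert_to_r_dataframe(test_df_bool)
```

The patch lives only in the running Python process, so it has to be applied in every session, but it survives a reinstall of pandas that would overwrite a hand-edited common.py.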

CT Zhu answered Dec 15 '25 14:12


