Does .loc[:, ['A', 'B']] assignment allow changing the dtype of the columns?

I was baffled because I could not modify two columns at once using .loc[:, ['A', 'B']], which I guess is because it's returning a copy instead of a view. I cannot find in Indexing and Selecting Data a definitive guide on when it returns a view and when it returns a copy.

I'm using pandas 0.18. I can see that an older version of the documentation (pandas 0.13) used to say "Whenever an array of labels or a boolean vector are involved in the indexing operation, the result will be a copy", but I cannot find that in the current documentation.

import pandas as pd

pd.__version__
# u'0.18.0'
df = pd.DataFrame({'A': ['1', '2', '3', '4',
                         '5', '6', '7', '8'],
                   'B': ['1', '2', '3', '4',
                         '5', '6', '7', '8'],
                   'C': ['1', '2', '3', '4',
                         '5', '6', '7', '8']})

df.dtypes
    #A    object
    #B    object
    #C    object
    #dtype: object

df2 = df.copy()
df2[['A', 'B']] = df2.loc[:,['A' , 'B']].astype(float) # Works
df2.dtypes
    #A    float64
    #B    float64
    #C     object
    #dtype: object
df2 = df.copy()
df2.loc[:,['A', 'B']] = df2.loc[:,['A' , 'B']].astype(float) # Does NOT work
df2.dtypes
    #A    object
    #B    object
    #C    object
    #dtype: object

Neither of those raises a SettingWithCopy warning, so I'm a little confused as to why the df2.loc[:, ['A', 'B']] assignment has no effect.

On closer inspection I do see that it's not a copy: in another test I assigned a DataFrame with different values and they were "saved" in df2, but the dtypes of df2 cannot be "set" via the .loc[:, ['A', 'B']] assignment.

Is there any reason why .loc[:, ['A', 'B']] = assignment does not change dtypes and [['A', 'B']] = does?

RubenLaguna asked Jun 03 '16 16:06


1 Answer

There was actually just an issue and a doc note added about this. Basically, .loc tries to cast back to the original dtype on assignment, whereas [] does not. It's the expected behavior, but a bit subtle.
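
If the goal is simply to change the dtype of a few columns, it may be safer not to rely on .loc assignment at all, since how it handles dtypes has varied between pandas releases. Below is a minimal sketch of two alternatives, using a shortened version of the DataFrame from the question. It assumes a reasonably recent pandas; in particular, the dict form of astype is, as far as I know, not available in 0.18 and arrived in a later release.

```python
import pandas as pd

# Shortened version of the DataFrame from the question: all columns hold strings.
df = pd.DataFrame({'A': ['1', '2', '3'],
                   'B': ['4', '5', '6'],
                   'C': ['7', '8', '9']})

# Option 1: assign through [] so the selected columns are replaced outright
# and take the dtype of the right-hand side.
df2 = df.copy()
df2[['A', 'B']] = df2[['A', 'B']].astype(float)

# Option 2: astype with a column -> dtype mapping. This returns a new
# DataFrame and leaves the columns not named in the mapping untouched.
df3 = df.astype({'A': float, 'B': float})

print(df2.dtypes)
print(df3.dtypes)
```

Both options leave column C as it was and convert only A and B to float64.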

chrisb answered Sep 28 '22 02:09