
Can't execute Python Pandas set_value

Tags: python, pandas, csv

I have a problem with pandas in Python 3.5.

I read a local CSV file with pandas. The file contains only data, with no header row, so I assign the column names afterwards:

df= pd.read_csv(filePath, header=None)
df.columns=['XXX', 'XXX'] #for short, totally 11 cols

The CSV has 11 columns: one holds strings and the others hold integers.

Then I tried to replace the values in the string column with integers, cell by cell, in a loop:

for i, row in df.iterrows():
    print(i, row['Name'])
    df.set_value(i, 'Name', 123)

The integer 123 is only an example; not every cell in this column gets 123. The print call works fine if I remove set_value, but with

df.set_value(i, 'Name', 123)

I get this error:

Traceback (most recent call last):
  File "D:/xxx/test.py", line 20, in <module>
    df.set_value(i, 'Name', 233)
  File "E:\Users\XXX\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1862, in set_value
    series = self._get_item_cache(col)
  File "E:\Users\XXX\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1351, in _get_item_cache
    res = self._box_item_values(item, values)
  File "E:\Users\XXX\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2334, in _box_item_values
    return self._constructor(values.T, columns=items, index=self.index)
AttributeError: 'BlockManager' object has no attribute 'T'

But if I create a DataFrame manually in code:

df = pd.DataFrame(index=[0, 1, 2], columns=['x', 'y'])
df['x'] = 2
df['y'] = 'BBB'
print(df)
for i, row in df.iterrows():
    df.set_value(i, 'y', 233)


print('\n')
print(df)

it works. Is there something I am missing?

Thanks!

Windtalker asked May 30 '16 20:05


1 Answer

The cause of the original error:

The pandas DataFrame.set_value(index, col, value) method raises the obscure AttributeError: 'BlockManager' object has no attribute 'T' shown in the question when the DataFrame being modified has duplicate column names.

The error can be reproduced with @Windtalker's code above, changing only the column names so that both are 'x' rather than 'x' and 'y'.

import pandas as pd
df = pd.DataFrame(index=[0, 1, 2], columns=['x', 'x'])
df['x'] = 2
df['y'] = 'BBB'
print(df)
for i, row in df.iterrows():
    df.set_value(i, 'y', 233)

print('\n')
print(df)
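
To check whether this diagnosis applies to your own data, you can inspect the column labels before touching any cells. The following is a minimal sketch, not the asker's actual file: the in-memory CSV and the column names 'Id' and 'Name' are made up for illustration, and 'Name' is repeated on purpose to mimic a mistake in a long hand-typed list of names.

```python
import io

import pandas as pd

# Hypothetical two-row CSV without a header, standing in for the real file.
raw = io.StringIO("1,alice,3\n2,bob,4\n")
df = pd.read_csv(raw, header=None)

# Assumed mistake: the same label is typed twice in the list of names.
df.columns = ['Id', 'Name', 'Name']

# Index.is_unique is False as soon as any label repeats.
columns_are_unique = df.columns.is_unique

# Index.duplicated() flags the second and later occurrences of each label.
duplicate_names = df.columns[df.columns.duplicated()].tolist()

# Selecting a duplicated label yields a DataFrame, not a Series.
selection_is_frame = isinstance(df['Name'], pd.DataFrame)

print(columns_are_unique)
print(duplicate_names)
print(selection_is_frame)
```

If the list of duplicates is not empty, renaming the repeated columns so that every label is unique should be the first thing to try.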

Hopefully this helps someone else diagnose the same issue.
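
A note for readers on newer pandas: set_value was deprecated in pandas 0.21 and removed in 1.0, and the .at accessor is the usual way to set a single cell by label. The sketch below reworks the question's small example with .at. The replacement values in new_values are hypothetical, and the 'y' column is created with an explicit object dtype as an assumption so that it can hold both strings and integers.

```python
import pandas as pd

# Same shape as the question's toy frame; 'y' is object dtype (an assumption)
# so that integers can be written into a column that starts out as strings.
df = pd.DataFrame({
    'x': [2, 2, 2],
    'y': pd.Series(['BBB', 'BBB', 'BBB'], dtype=object),
})

# Hypothetical per-row replacement values, keyed by index label.
new_values = {0: 233, 1: 234, 2: 235}

# Column labels must be unique for label-based scalar assignment to be
# unambiguous, so fail early if they are not.
assert df.columns.is_unique

# .at sets one cell at a time by (row label, column label).
for i in df.index:
    df.at[i, 'y'] = new_values[i]

print(df)
```

When every new value can be computed from the old one, assigning the whole column at once (for example with Series.map) avoids the explicit loop, but the cell-by-cell form above is the closest match to the original code.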

TheRoman answered Sep 21 '22 00:09