Pandas Dataframe ValueError: Shape of passed values is (X, ), indices imply (X, Y)

I am getting an error and I'm not sure how to fix it.

The following seems to work:

import numpy as np
import pandas

def random(row):
   return [1,2,3,4]

df = pandas.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))

df.apply(func = random, axis = 1)

and my output is:

[1,2,3,4]
[1,2,3,4]
[1,2,3,4]
[1,2,3,4]

However, when I set one of the columns to a constant value such as 1 or None:

def random(row):
   return [1,2,3,4]

df = pandas.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))
df['E'] = 1

df.apply(func = random, axis = 1)

I get the error:

ValueError: Shape of passed values is (5,), indices imply (5, 5)

I've been wrestling with this for a few days now and nothing seems to work. What is interesting is that when I change

def random(row):
   return [1,2,3,4]

to

def random(row):
   print [1,2,3,4]

everything seems to work normally.

This question is a clearer way of asking an earlier question of mine, which I feel may have been confusing.

My goal is to compute a list for each row and then create a column out of that.

EDIT: I originally start with a dataframe that has one column. I add 4 columns in 4 different apply steps, and then when I try to add another column I get this error.

952

asked Oct 29 '13 18:10

user1367204

1 Answer

If your goal is to add a new column to the DataFrame, just write your function so that it returns a scalar value (not a list), something like this:

>>> def random(row):
...     return row.mean()

and then use apply:

>>> df['new'] = df.apply(func = random, axis = 1)
>>> df
          A         B         C         D       new
0  0.201143 -2.345828 -2.186106 -0.784721 -1.278878
1 -0.198460  0.544879  0.554407 -0.161357  0.184867
2  0.269807  1.132344  0.120303 -0.116843  0.351403
3 -1.131396  1.278477  1.567599  0.483912  0.549648
4  0.288147  0.382764 -0.840972  0.838950  0.167222
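
If each row should produce several scalars, the same pattern can fill several new columns in one apply step instead of one step per column. Below is a minimal sketch, assuming pandas 0.23 or later, where `apply` accepts `result_type="expand"`. The helper `row_stats` and the column names `row_min` and `row_max` are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 4), columns=list("ABCD"))
df["E"] = 1  # the constant column from the question

def row_stats(row):
    # Hypothetical helper: returns two scalars per row as a list.
    return [row.min(), row.max()]

# result_type="expand" (an assumption: pandas >= 0.23) spreads each
# returned list across columns, giving a DataFrame with columns 0 and 1.
stats = df.apply(row_stats, axis=1, result_type="expand")
stats.columns = ["row_min", "row_max"]

# Attach the new columns to the original frame by index.
df = df.join(stats)
print(df)
```

Because the result is built as a separate frame and then joined, the length of the returned list does not have to match the number of existing columns.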

I don't know if it is possible for your new column to contain lists, but it is definitely possible for it to contain tuples ((...) instead of [...]):

>>> def random(row):
...    return (1,2,3,4,5)
...
>>> df['new'] = df.apply(func = random, axis = 1)
>>> df
          A         B         C         D              new
0  0.201143 -2.345828 -2.186106 -0.784721  (1, 2, 3, 4, 5)
1 -0.198460  0.544879  0.554407 -0.161357  (1, 2, 3, 4, 5)
2  0.269807  1.132344  0.120303 -0.116843  (1, 2, 3, 4, 5)
3 -1.131396  1.278477  1.567599  0.483912  (1, 2, 3, 4, 5)
4  0.288147  0.382764 -0.840972  0.838950  (1, 2, 3, 4, 5)
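
If the column really has to hold lists, one way that should work is to compute the per-row results in plain Python and assign them as a Series, so `apply` never has to guess a shape from the return value. This is only a sketch under that assumption; `compute_list` is a hypothetical stand-in for the real per-row function.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 4), columns=list("ABCD"))
df["E"] = 1  # the constant column that triggered the error in the question

def compute_list(row):
    # Hypothetical stand-in for the real per-row computation.
    return [1, 2, 3, 4]

# Build one list per row explicitly instead of going through apply.
values = [compute_list(row) for _, row in df.iterrows()]

# A Series built from a list of lists has object dtype, one list per cell.
df["new"] = pd.Series(values, index=df.index)
print(df)
```

Each cell of `new` is then an ordinary Python list, at the cost of losing vectorised operations on that column.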

152

answered Sep 27 '22 09:09

Roman Pekar

Tags: pandas, dataframe, ipython, python-2.7
