Pandas how to place an array in a single dataframe cell?

Tags: python, pandas, dataframe, statistics, data-science

So I currently have a dataframe that looks like:

[image: Current Dataframe]

And I want to add a completely new column called "Predictors" with only one cell that contains an array.

So [0, 'Predictors'] should contain an array and everything below that cell in the same column should be empty.

Here's my attempt: I created a separate dataframe that contained just the "Predictors" column and tried appending it to the current dataframe, but I get: 'Length mismatch: Expected axis has 3 elements, new values have 4 elements.'

How do I append a single cell containing an array to my dataframe?

# create a list and dataframe to hold the names of predictors
dataframe=dataframe.drop(['price','Date'],axis=1)  
predictorsList = dataframe.columns.tolist()  # Index.get_values() was removed in pandas 1.0
predictorsList = np.array(predictorsList, dtype=object)

# Combine actual and forecasted lists to one dataframe
combinedResults = pd.DataFrame({'Actual': actual, 'Forecasted': forecasted})

predictorsDF = pd.DataFrame({'Predictors': [predictorsList]})

# Add Predictors to dataframe
#combinedResults.at[0, 'Predictors'] = predictorsList
pd.concat([combinedResults,predictorsDF], ignore_index=True, axis=1)


asked Jul 06 '18 22:07

amadzebra

2 Answers

You could fill the rest of the cells in the desired column with NaN, but they will not be "empty". To do that, use pd.merge on both indexes:

Setup

import pandas as pd
import numpy as np

df = pd.DataFrame({
     'Actual': [18.442, 15.4233, 20.6217, 16.7, 18.185], 
     'Forecasted': [19.6377, 13.1665, 19.3992, 17.4557, 14.0053]
})

arr = np.zeros(3)
df_arr = pd.DataFrame({'Predictors': [arr]})

Merging df and df_arr

result = pd.merge(
    df,
    df_arr,
    how='left',
    left_index=True, # Merge on both indexes, since right only has 0...
    right_index=True # all the other rows will be NaN
)

Results

>>> print(result)
    Actual  Forecasted       Predictors
0  18.4420     19.6377  [0.0, 0.0, 0.0]
1  15.4233     13.1665              NaN
2  20.6217     19.3992              NaN
3  16.7000     17.4557              NaN
4  18.1850     14.0053              NaN

>>> result.loc[0, 'Predictors']
array([0., 0., 0.])

>>> result.loc[1, 'Predictors'] # actually contains a NaN value
nan
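
For what it's worth, the same left join on the index can probably be written more compactly. The sketch below is not part of the answer above; it assumes the same `df` and `arr` as in the setup and shows two variants: `DataFrame.join`, which joins on the index with `how='left'` by default, and assigning a one-element Series, which pandas aligns on the index. The names `joined` and `assigned` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Actual': [18.442, 15.4233, 20.6217, 16.7, 18.185],
    'Forecasted': [19.6377, 13.1665, 19.3992, 17.4557, 14.0053],
})
arr = np.zeros(3)

# Variant A: DataFrame.join does a left join on the index by default,
# so only row 0 finds a match and the other rows get NaN.
joined = df.join(pd.DataFrame({'Predictors': [arr]}))

# Variant B: assign a one-element Series; pandas aligns it on the index
# and fills the rows that have no matching label with NaN.
assigned = df.copy()
assigned['Predictors'] = pd.Series([arr], index=[0])

print(joined)
print(assigned)
```

Either way the array has to be wrapped in a list (`[arr]`), otherwise pandas treats its three elements as three separate rows.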


answered Oct 05 '22 23:10

Tomas Farias

You need to change the dtype of the column (in your case Predictors) to object first:

import pandas as pd
import numpy as np


df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list('abcd'))
df = df.astype(object)  # this line allows the assignment of the array
df.iloc[1,2] = np.array([99,99,99])
print(df)

gives

    a   b             c   d
0   0   1             2   3
1   4   5  [99, 99, 99]   7
2   8   9            10  11
3  12  13            14  15
4  16  17            18  19
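
Applied to the question's case, the same idea might look like the sketch below: create the new column with object dtype up front, then set the single cell with `.at` (the line that was commented out in the question). The predictor names and the shortened `combined` frame are made up for illustration; in the question they would come from `dataframe.columns` and `combinedResults`.

```python
import numpy as np
import pandas as pd

combined = pd.DataFrame({
    'Actual': [18.442, 15.4233, 20.6217],
    'Forecasted': [19.6377, 13.1665, 19.3992],
})

# Hypothetical predictor names standing in for dataframe.columns
predictors = np.array(['volume', 'open', 'close'], dtype=object)

# Create the column with object dtype first, filled with NaN,
# so a single cell can hold an arbitrary Python object.
combined['Predictors'] = pd.Series(np.nan, index=combined.index, dtype=object)

# .at addresses exactly one cell, so the whole array is stored as one value.
combined.at[0, 'Predictors'] = predictors

print(combined)
```

Without the object-dtype step the column would be float64, and pandas would try to fit the array's elements into a single float cell instead of storing the array itself.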

answered Oct 05 '22 22:10

Markus Dutschke
