How can I assign iterables to columns in pandas dataframes?

I have a dataframe containing rows which describe financial stocks. The following is a simplified version:

import pandas as pd

df = pd.DataFrame(
    {
        "stockprice": [100, 103, 240],
        "Characteristic1": [1, 3, 3],
        "Characteristic2": [5, 7, 1],
        "Characteristic3": [1, 4, 6],
    },
    index=["Company A", "Company B", "Company C"],
)

#            stockprice  Characteristic1  Characteristic2  Characteristic3
# Company A  100         1                5                1              
# Company B  103         3                7                4              
# Company C  240         3                1                6            

I would like to add a column in which each cell holds a long dictionary, a series of cashflows, generated from some of these characteristics. Later I will want to run calculations on these dictionaries.

Here is a sample function that generates the dictionary, followed by the assign call I use to put it into my dataframe:

def cashflow_series(ch1=1, ch2=2):
    return {0: ch1, 0.5: ch2, 1: 7, 2: 8, 3: 9}


df.assign(
    cashflows=lambda x: cashflow_series(
        ch1=x["Characteristic1"], ch2=x["Characteristic3"]
    )
)

This returns

           stockprice  Characteristic1  Characteristic2  Characteristic3 cashflows
Company A  100         1                5                1                NaN     
Company B  103         3                7                4                NaN     
Company C  240         3                1                6                NaN

How can I fix this?

I want the new column 'cashflows' to contain a dictionary for each row, not a NaN.

I want something like this:

           stockprice  Characteristic1  Characteristic2  Characteristic3 cashflows
Company A  100         1                5                1                {0:1,..3:9}
Company B  103         3                7                4                {0:3,..3:9}
Company C  240         3                1                6                {0:3,..3:9}

Asked by Tim, Oct 14 '25 03:10


2 Answers

You'll need to use a list comprehension so that the function is called once per row:

def cashflow_series(ch1=1,ch2=2):
    return {0:ch1,0.5:ch2,1:7,2:8,3:9}

df.assign(cashflows=lambda x: [cashflow_series(ch1=x.loc[i, 'Characteristic1'],
                                               ch2=x.loc[i, 'Characteristic3'])
                               for i in x.index])

Or with parameter unpacking:

df.assign(cashflows=lambda x: [cashflow_series(*params)
                               for params in
                               x[['Characteristic1', 'Characteristic3']].to_numpy()])

Output:

           stockprice  Characteristic1  Characteristic2  Characteristic3                         cashflows
Company A         100                1                5                1  {0: 1, 0.5: 1, 1: 7, 2: 8, 3: 9}
Company B         103                3                7                4  {0: 3, 0.5: 4, 1: 7, 2: 8, 3: 9}
Company C         240                3                1                6  {0: 3, 0.5: 6, 1: 7, 2: 8, 3: 9}
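
As for why the original attempt produces NaN, here is a minimal sketch of a likely explanation (an interpretation, not something stated in the answers). The lambda passed to assign is called once with the whole DataFrame, so cashflow_series receives two Series and returns a single dictionary keyed by 0, 0.5, 1, 2, 3. pandas then appears to align that dictionary's keys against the index labels, and since none of them match "Company A" etc., every cell ends up NaN. The names single and per_row are introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Characteristic1": [1, 3, 3], "Characteristic3": [1, 4, 6]},
    index=["Company A", "Company B", "Company C"],
)


def cashflow_series(ch1=1, ch2=2):
    return {0: ch1, 0.5: ch2, 1: 7, 2: 8, 3: 9}


# What the original lambda effectively computes: ONE dict for the whole
# frame, whose first two values are entire Series rather than scalars.
single = cashflow_series(ch1=df["Characteristic1"], ch2=df["Characteristic3"])
print(type(single).__name__, list(single))  # dict [0, 0.5, 1, 2, 3]

# Building one dict per row instead (zip walks both columns in parallel).
per_row = [
    cashflow_series(ch1=c1, ch2=c3)
    for c1, c3 in zip(df["Characteristic1"], df["Characteristic3"])
]
result = df.assign(cashflows=per_row)
print(result.to_string())
```

The zip form is only a variant of the list comprehension above; it avoids a label lookup per row but gives the same dictionaries.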

Answered by mozway, Oct 16 '25 16:10


Your code is almost there; it just needs a small modification. Inside assign, use apply with axis=1 so that cashflow_series is called once per row:

def cashflow_series(ch1=1, ch2=2):
    return {0: ch1, 0.5: ch2, 1: 7, 2: 8, 3: 9}


df = df.assign(
    cashflows=lambda x: x.apply(
        lambda row: cashflow_series(
            ch1=row["Characteristic1"], ch2=row["Characteristic3"]
        ),
        axis=1,
    )
)

print(df.to_string())
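
Since the question mentions running calculations on the generated dictionaries later, here is a rough sketch of how that might look once the column holds one dictionary per row. It assumes the dictionary keys are times in years and the values are cashflow amounts; the present_value helper, its rate parameter, and the total and pv column names are hypothetical and not part of the original question.

```python
import pandas as pd

# A small frame that already has one cashflow dict per row,
# matching the shape produced by the answers above.
df = pd.DataFrame(
    {
        "cashflows": [
            {0: 1, 0.5: 1, 1: 7, 2: 8, 3: 9},
            {0: 3, 0.5: 4, 1: 7, 2: 8, 3: 9},
        ]
    },
    index=["Company A", "Company B"],
)


def present_value(cashflows, rate=0.05):
    """Discount each amount by its key, assumed to be a time in years."""
    return sum(amount / (1 + rate) ** t for t, amount in cashflows.items())


# Series.map applies the function to each dict in the column.
df["total"] = df["cashflows"].map(lambda d: sum(d.values()))
df["pv"] = df["cashflows"].map(present_value)

print(df[["total", "pv"]])
```

Columns of dictionaries are stored with object dtype, so operations like these run as plain Python loops. If the calculations become heavy, a long-format table with one row per company and time point may be worth considering.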

Answered by Muideen, Oct 16 '25 17:10