Let's say I have a dataframe df and I would like to create a new column filled with 0. I use:
df['new_col'] = 0
So far, no problem. But if the value I want to use is a list, it doesn't work:
df['new_col'] = my_list
ValueError: Length of values does not match length of index
I understand why this doesn't work (pandas is trying to assign one value of the list per cell of the column), but how can we avoid this behavior? (In case it isn't clear: I would like every cell of my new column to contain the same predefined list.)
Note: I also tried df.assign(new_col=my_list), same problem.
You'd have to do:
df['new_col'] = [my_list] * len(df)
Example:
In [13]: df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
         df
Out[13]:
          a         b         c
0 -0.010414  1.859791  0.184692
1 -0.818050 -0.287306 -1.390080
2 -0.054434  0.106212  1.542137
3 -0.226433  0.390355  0.437592
4 -0.204653 -2.388690  0.106218

In [17]: df['b'] = [[234]] * len(df)
         df
Out[17]:
          a      b         c
0 -0.010414  [234]  0.184692
1 -0.818050  [234] -1.390080
2 -0.054434  [234]  1.542137
3 -0.226433  [234]  0.437592
4 -0.204653  [234]  0.106218
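One caveat worth knowing: [my_list] * len(df) repeats a reference to the same list object, so every row ends up pointing at one shared list. If you ever mutate a cell in place, that matters. Here is a minimal sketch of the difference, using a small made-up frame and hypothetical column names "shared" and "copies":

```python
import pandas as pd

# Small illustrative frame; the column names below are made up for this sketch.
df = pd.DataFrame({"a": [1, 2, 3]})
my_list = [234]

# Every row holds a reference to the *same* list object.
df["shared"] = [my_list] * len(df)

# Every row holds its own independent (shallow) copy.
df["copies"] = [list(my_list) for _ in range(len(df))]

# Mutate the list stored in the first row of each column, in place.
df["shared"].iloc[0].append(99)
df["copies"].iloc[0].append(99)

print(df["shared"].tolist())  # all three rows show the appended value
print(df["copies"].tolist())  # only the first row changed
```

If you only ever read the lists, the shared version is fine and slightly cheaper; if rows may be modified independently, the list comprehension is the safer choice.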
Note that DataFrames are optimised for scalar values. Storing non-scalar values defeats the point, in my opinion: filtering, looking up, getting and setting all become problematic, to the point that it becomes a pain.
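To illustrate the kind of friction I mean, here is a rough sketch (the frame and the "tags" column are invented for the example). Filtering on a list-valued column generally means falling back to a Python-level check per row rather than a vectorised comparison:

```python
import pandas as pd

# Invented data: one list per cell in the "tags" column.
df = pd.DataFrame({"a": [1, 2, 3]})
df["tags"] = [[234], [234, 5], [7]]

# Row-by-row membership test; this runs a Python function on each cell
# instead of using a vectorised pandas/NumPy operation.
mask = df["tags"].apply(lambda values: 234 in values)
filtered = df[mask]

print(filtered)
```

With a plain scalar column the same filter would be a simple comparison such as df[df["a"] == 1], which is both shorter and faster.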