How to add an empty column to a dataframe?
This is partially covered already.
The dtype of df["D"] = np.nan in the accepted answer is numpy.float64.
Is there a way to initialize an empty list into each cell?
I tried df["D"] = [[]] * len(df), but every cell then points to the same list object, so modifying one cell modifies them all.
df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
df
A B
0 1 2
1 2 3
2 3 4
df["D"] = [[]] * len(df)
df
A B D
0 1 2 []
1 2 3 []
2 3 4 []
df['D'][1].append(['a','b','c','d'])
df
A B D
0 1 2 [[a, b, c, d]]
1 2 3 [[a, b, c, d]]
2 3 4 [[a, b, c, d]]
Wanted:
A B D
0 1 2 []
1 2 3 [[a, b, c, d]]
2 3 4 []
Use
df["D"] = [[] for _ in range(len(df))]
instead of
df["D"] = [[]] * len(df)
This way you'll create a different [] for each row.
Basically, [[] for _ in range(len(df))] is a list comprehension. It creates a new [] for each value in range(len(df)).
This code has the same functionality as
l = []
for _ in range(len(df)):
    l.append([])
The list comprehension is typically faster, simpler to write, and more readable than the explicit loop.
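As a rough check of the suggestion above, here is a minimal sketch that reuses the DataFrame from the question and appends to a single row. It assumes a reasonably recent pandas and uses `.at` for scalar access, which is my choice here rather than something from the original post.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})

# One fresh list per row, so the cells do not share an object
df["D"] = [[] for _ in range(len(df))]

# Append to the list stored in row 1 only
df.at[1, "D"].append(["a", "b", "c", "d"])

print(df["D"].tolist())
```

Only the list in row 1 should change, which matches the "wanted" output in the question.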
If you want to know more about list comprehensions, I'd recommend the answers to this question.
If you want to know more about why that behavior happens with [[]] * len(df), I'd recommend the answers to this question.
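The difference can also be seen without pandas. The sketch below is plain Python; the variable names `shared` and `independent` are just illustrative. Multiplying a list repeats references to the same inner list, while the comprehension evaluates `[]` once per iteration.

```python
# [[]] * 3 repeats a reference to ONE inner list three times
shared = [[]] * 3

# The comprehension builds a new inner list on every iteration
independent = [[] for _ in range(3)]

shared[1].append("x")
independent[1].append("x")

print(shared)                            # [['x'], ['x'], ['x']]
print(independent)                       # [[], ['x'], []]
print(shared[0] is shared[2])            # True
print(independent[0] is independent[2])  # False
```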
Could you not just pass in a list of lists when creating the column? Then assign the list value to a temporary variable, and assign that list to one cell in the data frame using loc.
import pandas as pd
df = pd.DataFrame()
df['col A'] = [1,12,312,352]
df['col B'] = [[],[],[],[]]
ser = [1,4,5,6]
df.loc[2,'col B'] = ser
df
Does this help? Is this what you are looking for?
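One caveat, offered as a hedged note rather than a correction: depending on the pandas version, assigning a list through `.loc` can be treated as several values to align and may raise a ValueError. A possible alternative, assuming the column has object dtype, is the scalar accessor `.at`, sketched below with the same column names as above.

```python
import pandas as pd

df = pd.DataFrame({"col A": [1, 12, 312, 352]})

# Independent empty lists, one per row (object-dtype column)
df["col B"] = [[] for _ in range(len(df))]

ser = [1, 4, 5, 6]

# .at sets exactly one cell, so the whole list is stored as that cell's value
df.at[2, "col B"] = ser

print(df)
```

Note that the cell then holds a reference to `ser` itself, so later changes to `ser` would also show up in the DataFrame. Pass `list(ser)` if a copy is wanted.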