To add anew column with constant value, use the square bracket i.e. the index operator and set that value.
Code 1: Find duplicate columns in a DataFrame. To find duplicate columns we need to iterate through all columns of a DataFrame and for each and every column it will search if any other column exists in DataFrame with the same contents already. If yes then that column name will be stored in the duplicate column set.
df['new'] = 0
For in-place modification, perform direct assignment. This assignment is broadcasted by pandas for each row.
df = pd.DataFrame('x', index=range(4), columns=list('ABC'))
df
A B C
0 x x x
1 x x x
2 x x x
3 x x x
df['new'] = 'y'
# Same as,
# df.loc[:, 'new'] = 'y'
df
A B C new
0 x x x y
1 x x x y
2 x x x y
3 x x x y
If you want to add an column of empty lists, here is my advice:
object
columns are bad news in terms of performance. Rethink how your data is structured. If you must store a column of lists, ensure not to copy the same reference multiple times.
# Wrong
df['new'] = [[]] * len(df)
# Right
df['new'] = [[] for _ in range(len(df))]
df.assign(new=0)
If you need a copy instead, use DataFrame.assign
:
df.assign(new='y')
A B C new
0 x x x y
1 x x x y
2 x x x y
3 x x x y
And, if you need to assign multiple such columns with the same value, this is as simple as,
c = ['new1', 'new2', ...]
df.assign(**dict.fromkeys(c, 'y'))
A B C new1 new2
0 x x x y y
1 x x x y y
2 x x x y y
3 x x x y y
Finally, if you need to assign multiple columns with different values, you can use assign
with a dictionary.
c = {'new1': 'w', 'new2': 'y', 'new3': 'z'}
df.assign(**c)
A B C new1 new2 new3
0 x x x w y z
1 x x x w y z
2 x x x w y z
3 x x x w y z
With modern pandas you can just do:
df['new'] = 0
The reason this puts NaN
into a column is because df.index
and the Index
of your right-hand-side object are different. @zach shows the proper way to assign a new column of zeros. In general, pandas
tries to do as much alignment of indices as possible. One downside is that when indices are not aligned you get NaN
wherever they aren't aligned. Play around with the reindex
and align
methods to gain some intuition for alignment works with objects that have partially, totally, and not-aligned-all aligned indices. For example here's how DataFrame.align()
works with partially aligned indices:
In [7]: from pandas import DataFrame
In [8]: from numpy.random import randint
In [9]: df = DataFrame({'a': randint(3, size=10)})
In [10]:
In [10]: df
Out[10]:
a
0 0
1 2
2 0
3 1
4 0
5 0
6 0
7 0
8 0
9 0
In [11]: s = df.a[:5]
In [12]: dfa, sa = df.align(s, axis=0)
In [13]: dfa
Out[13]:
a
0 0
1 2
2 0
3 1
4 0
5 0
6 0
7 0
8 0
9 0
In [14]: sa
Out[14]:
0 0
1 2
2 0
3 1
4 0
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
Name: a, dtype: float64
Here is another one liner using lambdas (create column with constant value = 10)
df['newCol'] = df.apply(lambda x: 10, axis=1)
before
df
A B C
1 1.764052 0.400157 0.978738
2 2.240893 1.867558 -0.977278
3 0.950088 -0.151357 -0.103219
after
df
A B C newCol
1 1.764052 0.400157 0.978738 10
2 2.240893 1.867558 -0.977278 10
3 0.950088 -0.151357 -0.103219 10
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With