I recently discovered the pandas "assign" method, which I find very elegant. My issue is that the name of the new column is passed as a keyword argument, so it cannot contain spaces or dashes.
import numpy as np
from pandas import DataFrame
df = DataFrame({'A': range(1, 11), 'B': np.random.randn(10)})
df.assign(ln_A = lambda x: np.log(x.A))
A B ln_A
0 1 0.426905 0.000000
1 2 -0.780949 0.693147
2 3 -0.418711 1.098612
3 4 -0.269708 1.386294
4 5 -0.274002 1.609438
5 6 -0.500792 1.791759
6 7 1.649697 1.945910
7 8 -1.495604 2.079442
8 9 0.549296 2.197225
9 10 -0.758542 2.302585
But what if I want to name the new column "ln(A)", for example? Neither of these works:
df.assign(ln(A) = lambda x: np.log(x.A))
df.assign("ln(A)" = lambda x: np.log(x.A))
File "<ipython-input-7-de0da86dce68>", line 1
df.assign(ln(A) = lambda x: np.log(x.A))
SyntaxError: keyword can't be an expression
I know I could rename the column right after the .assign call, but I want to understand more about this method and its syntax.
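For reference, the rename workaround mentioned above might look like the following sketch. The intermediate name ln_A is only a placeholder chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': range(1, 11), 'B': np.random.randn(10)})

# Assign under a valid Python identifier first, then rename to the desired label.
result = (
    df.assign(ln_A=lambda x: np.log(x.A))
      .rename(columns={'ln_A': 'ln(A)'})
)
print(result.columns.tolist())
```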
To set the column names of a pandas DataFrame, use the DataFrame.columns attribute: assign the required column names to it as a list.
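A minimal sketch of setting the labels through the columns attribute, using a small made-up frame. Note that labels set this way are ordinary strings, so spaces are allowed.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4.0, 5.0, 6.0]})

# Replace all column labels at once; the list length must match the number of columns.
df.columns = ['first', 'second value']
print(df.columns.tolist())
```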
DataFrame.assign() adds new columns to a DataFrame. It returns a new object with all the original columns in addition to the new ones. Existing columns that are re-assigned are overwritten. The column names are the keywords.
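As a small illustration of that behaviour (the frame and the column C are made up for the example), re-assigning A overwrites it in the returned copy while the original frame stays as it was:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# 'A' already exists, so it is overwritten in the result; 'C' is new, so it is appended.
out = df.assign(A=lambda x: x.A * 10, C=[7, 8, 9])

print(out)
print(df)  # the original frame is not modified
```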
The syntax for the assign method is fairly simple. You type the name of your dataframe, then a “dot”, and then assign(). Remember, assign is a Python method associated with dataframe objects, so we can use so-called “dot syntax” to call it.
You can pass the keyword arguments to assign as a dictionary, like so:
kwargs = {"ln(A)" : lambda x: np.log(x.A)}
df.assign(**kwargs)
A B ln(A)
0 1 0.500033 0.000000
1 2 -0.392229 0.693147
2 3 0.385512 1.098612
3 4 -0.029816 1.386294
4 5 -2.386748 1.609438
5 6 -1.828487 1.791759
6 7 0.096117 1.945910
7 8 -2.867469 2.079442
8 9 -0.731787 2.197225
9 10 -0.686110 2.302585
assign expects a bunch of keyword arguments. It will, in turn, assign columns with the names of the keywords. That's handy, but you can't pass an expression as the keyword. This is spelled out by @EdChum in the comments.
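The restriction comes from Python's syntax rather than from pandas. A rough sketch with a hypothetical helper, show_keys, shows that a keyword written in source code must be an identifier, while keys unpacked from a dict with ** can be any string:

```python
def show_keys(**kwargs):
    # Hypothetical helper: returns the keyword names it received.
    return list(kwargs)

# show_keys(ln(A)=1) would be a SyntaxError: a literal keyword must be a plain identifier.
# Unpacking a dict bypasses that check, so any string key gets through.
keys = show_keys(**{"ln(A)": 1, "my-col": 2, "two words": 3})
print(keys)
```

This is why unpacking a dict into assign works for names such as "ln(A)".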
Use insert instead for an in-place transformation:
df.insert(2, 'ln(A)', np.log(df.A))
df
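Two details of insert are worth keeping in mind, sketched here on the same kind of frame as in the question: it returns None because it mutates df, and by default it refuses to insert a label that already exists.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': range(1, 11), 'B': np.random.randn(10)})

# insert() modifies df in place and returns None.
returned = df.insert(2, 'ln(A)', np.log(df.A))

# Inserting the same label again raises ValueError unless allow_duplicates=True.
try:
    df.insert(2, 'ln(A)', np.log(df.A))
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(df.columns.tolist(), duplicate_rejected)
```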
Use concat if you don't want to modify df in place:
pd.concat([df, np.log(df.A).rename('ln(A)')], axis=1)
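A self-contained sketch of the concat approach, using the label ln(A) from the question. It checks that the original frame is left untouched and that a new frame comes back.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': range(1, 11), 'B': np.random.randn(10)})

# Build the new column as a named Series, then glue it on column-wise.
new_col = np.log(df.A).rename('ln(A)')
result = pd.concat([df, new_col], axis=1)

print(result.columns.tolist())
print(df.columns.tolist())  # df itself still has only A and B
```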