Pandas - select column using other column value as column name

Tags:

python

pandas

I have a dataframe with a column, let's call it "names", whose values are the names of other columns. I would like to add a new column that holds, for each row, the value from the column named in that row's "names" entry.

Example:

Input dataframe: pd.DataFrame.from_dict({"a": [1, 2, 3,4], "b": [-1,-2,-3,-4], "names":['a','b','a','b']})

  a  |  b  | names |
 --- | --- | ----  |
  1  |  -1 | 'a'   |
  2  |  -2 | 'b'   |
  3  |  -3 | 'a'   |
  4  |  -4 | 'b'   |

Output dataframe: pd.DataFrame.from_dict({"a": [1, 2, 3,4], "b": [-1,-2,-3,-4], "names":['a','b','a','b'], "new_col":[1,-2,3,-4]})

  a  |  b  | names | new_col | 
 --- | --- | ----  | ------  |
  1  |  -1 | 'a'   |    1    |
  2  |  -2 | 'b'   |   -2    |
  3  |  -3 | 'a'   |    3    |
  4  |  -4 | 'b'   |   -4    |

272

asked Aug 03 '17 14:08

ab3

1 Answer

You can use lookup:

df['new_col'] = df.lookup(df.index, df.names)
df
#   a    b  names   new_col
#0  1   -1      a   1
#1  2   -2      b   -2
#2  3   -3      a   3
#3  4   -4      b   -4

EDIT

lookup was deprecated in pandas 1.2.0 and removed in 2.0.0. Here's the currently recommended solution:

import numpy as np
idx, cols = pd.factorize(df['names'])
df['new_col'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
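
For anyone who wants to run the factorize-based approach end to end, here is a minimal sketch on the example frame from the question. The intermediate name `values` is only for illustration and is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Example frame from the question
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [-1, -2, -3, -4],
    "names": ["a", "b", "a", "b"],
})

# factorize gives one integer code per row plus the unique labels,
# in order of first appearance: idx -> [0, 1, 0, 1], cols -> ['a', 'b']
idx, cols = pd.factorize(df["names"])

# Put the referenced columns in the same order as the codes,
# then pick one element per row with integer (fancy) indexing
values = df.reindex(cols, axis=1).to_numpy()
df["new_col"] = values[np.arange(len(df)), idx]

print(df)
```

Note that a name in "names" that does not match any column would make reindex produce an all-NaN column for it, so the result would contain NaN for those rows rather than raising an error.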

184

answered Oct 20 '22 05:10

Psidom
