Edit pandas dataframe row-by-row

pandas for Python is neat. I'm trying to replace a list of dictionaries with a pandas DataFrame. However, I'm wondering if there's a way to change values row by row in a for loop just as easily.

Here's the non-pandas dict-version:

trialList = [
    {'no':1, 'condition':2, 'response':''},
    {'no':2, 'condition':1, 'response':''},
    {'no':3, 'condition':1, 'response':''}
]  # ... and so on

for trial in trialList:
    # Do something and collect response
    trial['response'] = 'the answer!'

... and now trialList contains the updated values because trial refers back to the dict stored in the list. Very handy! But the list of dicts is unwieldy, especially because I'd like to compute things column-wise, which pandas excels at.

So given trialList from above, I thought I could make it even better by doing something pandas-like:

import pandas as pd
dfTrials = pd.DataFrame(trialList)  # makes a nice 3-column dataframe with 3 rows

for trial in dfTrials.iterrows():
    # do something and collect response
    # iterrows() yields (index, row) tuples, so trial[1] is the row
    trial[1]['response'] = 'the answer!'

... but dfTrials remains unchanged here. Is there an easy way to update values row by row, perhaps equivalent to the dict version? It is important that it's row by row, as this is for an experiment where participants are presented with a lot of trials and various data are collected on each single trial.

asked Dec 19 '13 21:12

Jonas Lindeløv

1 Answer

If you really want row-by-row ops, you could use iterrows and loc:

>>> for i, trial in dfTrials.iterrows():
...     dfTrials.loc[i, "response"] = "answer {}".format(trial["no"])
...     
>>> dfTrials
   condition  no  response
0          2   1  answer 1
1          1   2  answer 2
2          1   3  answer 3

[3 rows x 3 columns]
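
As a rough sketch of why the loop in the question had no effect: with mixed column types like these, the row that iterrows hands back is a separate object, so assigning to it does not reach the frame. Writing through the frame by index label does. The use of .at here (a scalar-only sibling of .loc) and the names unchanged and updated are choices made for this illustration, not part of the original answer.

```python
import pandas as pd

trialList = [
    {'no': 1, 'condition': 2, 'response': ''},
    {'no': 2, 'condition': 1, 'response': ''},
    {'no': 3, 'condition': 1, 'response': ''},
]
dfTrials = pd.DataFrame(trialList)

# Assigning to the row yielded by iterrows() changes only that row object.
for i, trial in dfTrials.iterrows():
    trial['response'] = 'lost'
unchanged = dfTrials['response'].tolist()

# Assigning through the frame, by index label and column name, persists.
for i in dfTrials.index:
    dfTrials.at[i, 'response'] = 'answer {}'.format(dfTrials.at[i, 'no'])
updated = dfTrials['response'].tolist()
```

After the first loop every response is still the empty string; after the second loop the frame holds the new values.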

Better though is when you can vectorize:

>>> dfTrials["response 2"] = dfTrials["condition"] + dfTrials["no"]
>>> dfTrials
   condition  no  response  response 2
0          2   1  answer 1           3
1          1   2  answer 2           3
2          1   3  answer 3           4

[3 rows x 4 columns]
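
For comparison, the "answer N" column from the loop can probably be built the same vectorized way. This is only a sketch: it assumes the response can be derived from existing columns, which will not hold when each response comes from a live participant. condition_counts is a name introduced here for illustration.

```python
import pandas as pd

dfTrials = pd.DataFrame({'no': [1, 2, 3], 'condition': [2, 1, 1]})

# Build the whole column at once: cast 'no' to strings and prepend a prefix.
dfTrials['response'] = 'answer ' + dfTrials['no'].astype(str)

# Column-wise summaries need no explicit loop either.
condition_counts = dfTrials['condition'].value_counts()
```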

And there's always apply:

>>> def f(row):
...     return "c{}n{}".format(row["condition"], row["no"])
... 
>>> dfTrials["r3"] = dfTrials.apply(f, axis=1)
>>> dfTrials
   condition  no  response  response 2    r3
0          2   1  answer 1           3  c2n1
1          1   2  answer 2           3  c1n2
2          1   3  answer 3           4  c1n3

[3 rows x 5 columns]
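
Since the question describes responses arriving one trial at a time, another option that may fit is to keep the list of dicts while the experiment runs and build the DataFrame once at the end for the column-wise work. This is a sketch under that assumption; collect_response is a hypothetical stand-in for whatever presents a trial and records the answer.

```python
import pandas as pd

trialList = [
    {'no': 1, 'condition': 2, 'response': ''},
    {'no': 2, 'condition': 1, 'response': ''},
    {'no': 3, 'condition': 1, 'response': ''},
]


def collect_response(trial):
    # Hypothetical placeholder: a real experiment would present the trial
    # and return what the participant answered.
    return 'answer {}'.format(trial['no'])


# Row-by-row phase: plain dicts, updated in place as in the question.
for trial in trialList:
    trial['response'] = collect_response(trial)

# Analysis phase: convert once, then work column-wise.
dfTrials = pd.DataFrame(trialList)
mean_condition = dfTrials['condition'].mean()
```

The trade-off is that the frame does not exist until the loop has finished, so anything that needs column-wise results during the experiment would still need one of the approaches above.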

answered Sep 24 '22 08:09

DSM
