Combining rows in pandas [duplicate]

Tags: python, pandas

I have a DataFrame whose index, called city_id, holds cities in the format [city],[state] (e.g., new york,ny), with integer counts in the columns. The problem is that I have multiple rows for the same city, and I want to collapse the rows sharing a city_id by adding their column values. I looked at groupby(), but it wasn't immediately obvious how to apply it to this problem.

Edit:

An example: I'd like to change this:

    city_id    val1 val2 val3
    houston,tx    1    2    0
    houston,tx    0    0    1
    houston,tx    2    1    1

into this:

    city_id    val1 val2 val3
    houston,tx    3    3    2

The real DataFrame has roughly 10-20k rows.

773

asked Jul 03 '13 02:07

lightlike

2 Answers

Starting from

    >>> df
                  val1  val2  val3
    city_id
    houston,tx       1     2     0
    houston,tx       0     0     1
    houston,tx       2     1     1
    somewhere,ew     4     3     7

I might do

    >>> df.groupby(df.index).sum()
                  val1  val2  val3
    city_id
    houston,tx       3     3     2
    somewhere,ew     4     3     7

or

    >>> df.reset_index().groupby("city_id").sum()
                  val1  val2  val3
    city_id
    houston,tx       3     3     2
    somewhere,ew     4     3     7

The first approach passes the index values (in this case, the city_id values) to groupby and tells it to use those as the group keys. The second resets the index, which turns city_id into a regular column, and then groups by that column. See the groupby section of the pandas docs for more examples. Note that DataFrameGroupBy objects have lots of other methods, too:

    >>> df.groupby(df.index)
    <pandas.core.groupby.DataFrameGroupBy object at 0x1045a1790>
    >>> df.groupby(df.index).max()
                  val1  val2  val3
    city_id
    houston,tx       2     2     1
    somewhere,ew     4     3     7
    >>> df.groupby(df.index).mean()
                  val1  val2      val3
    city_id
    houston,tx       1     1  0.666667
    somewhere,ew     4     3  7.000000
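For anyone who wants to try this directly, here is a self-contained sketch that rebuilds the example frame and runs the aggregation. It assumes a reasonably recent pandas; `groupby(level=0)` is an alternative spelling for grouping on the index that is not used in the answer above, and the names `summed` and `mixed` are just illustrative.

```python
import pandas as pd

# Rebuild the example frame; city_id is the (non-unique) index.
df = pd.DataFrame(
    {"val1": [1, 0, 2, 4], "val2": [2, 0, 1, 3], "val3": [0, 1, 1, 7]},
    index=pd.Index(
        ["houston,tx", "houston,tx", "houston,tx", "somewhere,ew"],
        name="city_id",
    ),
)

# Group on the index level instead of passing df.index explicitly.
summed = df.groupby(level=0).sum()
print(summed)

# agg() can apply a different reduction to each column in one call.
mixed = df.groupby(level=0).agg({"val1": "sum", "val2": "max", "val3": "mean"})
print(mixed)
```

With 10-20k rows, any of these spellings should be effectively instantaneous, so the choice is mostly a matter of readability.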

162

answered Oct 02 '22 11:10

DSM

Something along the same lines. Sorry, it's not an exact replica of your data.

    import pandas

    # Two of the three rows share the subid 'B14-112'
    mydata = [{'subid': 'B14-111', 'age': 75, 'fdg': 1.78},
              {'subid': 'B14-112', 'age': 22, 'fdg': 1.56},
              {'subid': 'B14-112', 'age': 40, 'fdg': 2.00}]
    df = pandas.DataFrame(mydata)

    # Sum the numeric columns within each subid group
    gg = df.groupby("subid", sort=True).sum()
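To make the result concrete, here is a small sketch using the same data. It differs from the snippet above in one respect, which is my assumption about what is convenient rather than part of the answer: `as_index=False` keeps subid as an ordinary column instead of moving it into the index.

```python
import pandas as pd

mydata = [
    {"subid": "B14-111", "age": 75, "fdg": 1.78},
    {"subid": "B14-112", "age": 22, "fdg": 1.56},
    {"subid": "B14-112", "age": 40, "fdg": 2.00},
]
df = pd.DataFrame(mydata)

# The two 'B14-112' rows collapse into one row holding their column sums;
# as_index=False keeps 'subid' as a regular column in the result.
gg = df.groupby("subid", as_index=False).sum()
print(gg)
```

The output has one row per distinct subid, with age and fdg summed across the rows that shared it.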

answered Oct 02 '22 10:10

LonelySoul
