
pandas: how to use groupby to group columns by the date in their labels?

I have a DataFrame with 10730 rows × 249 columns. The columns are:

Index(['RegionID', 'Metro', 'CountyName', 'SizeRank', '1996-04', '1996-05',
   '1996-06', '1996-07', '1996-08', '1996-09',
   ...
   '2015-11', '2015-12', '2016-01', '2016-02', '2016-03', '2016-04',
   '2016-05', '2016-06', '2016-07', '2016-08'],
  dtype='object', length=249)

What I need to do is group the monthly columns by quarter (January to March is Q1, and so on through Q4), taking the mean of the values in each quarter. I know how to group, say, three columns by hand, but how do I group all of them when I cannot list the column names one by one? Here is the head of the DataFrame as CSV, for testing:

'State,RegionName,RegionID,Metro,CountyName,SizeRank,1996-04,1996-05,1996-06,1996-07,1996-08,1996-09,1996-10,1996-11,1996-12,1997-01,1997-02,1997-03,1997-04,1997-05,1997-06,1997-07,1997-08,1997-09,1997-10,1997-11,1997-12,1998-01,1998-02,1998-03,1998-04,1998-05,1998-06,1998-07,1998-08,1998-09,1998-10,1998-11,1998-12,1999-01,1999-02,1999-03,1999-04,1999-05,1999-06,1999-07,1999-08,1999-09,1999-10,1999-11,1999-12,2000-01,2000-02,2000-03,2000-04,2000-05,2000-06,2000-07,2000-08,2000-09,2000-10,2000-11,2000-12,2001-01,2001-02,2001-03,2001-04,2001-05,2001-06,2001-07,2001-08,2001-09,2001-10,2001-11,2001-12,2002-01,2002-02,2002-03,2002-04,2002-05,2002-06,2002-07,2002-08,2002-09,2002-10,2002-11,2002-12,2003-01,2003-02,2003-03,2003-04,2003-05,2003-06,2003-07,2003-08,2003-09,2003-10,2003-11,2003-12,2004-01,2004-02,2004-03,2004-04,2004-05,2004-06,2004-07,2004-08,2004-09,2004-10,2004-11,2004-12,2005-01,2005-02,2005-03,2005-04,2005-05,2005-06,2005-07,2005-08,2005-09,2005-10,2005-11,2005-12,2006-01,2006-02,2006-03,2006-04,2006-05,2006-06,2006-07,2006-08,2006-09,2006-10,2006-11,2006-12,2007-01,2007-02,2007-03,2007-04,2007-05,2007-06,2007-07,2007-08,2007-09,2007-10,2007-11,2007-12,2008-01,2008-02,2008-03,2008-04,2008-05,2008-06,2008-07,2008-08,2008-09,2008-10,2008-11,2008-12,2009-01,2009-02,2009-03,2009-04,2009-05,2009-06,2009-07,2009-08,2009-09,2009-10,2009-11,2009-12,2010-01,2010-02,2010-03,2010-04,2010-05,2010-06,2010-07,2010-08,2010-09,2010-10,2010-11,2010-12,2011-01,2011-02,2011-03,2011-04,2011-05,2011-06,2011-07,2011-08,2011-09,2011-10,2011-11,2011-12,2012-01,2012-02,2012-03,2012-04,2012-05,2012-06,2012-07,2012-08,2012-09,2012-10,2012-11,2012-12,2013-01,2013-02,2013-03,2013-04,2013-05,2013-06,2013-07,2013-08,2013-09,2013-10,2013-11,2013-12,2014-01,2014-02,2014-03,2014-04,2014-05,2014-06,2014-07,2014-08,2014-09,2014-10,2014-11,2014-12,2015-01,2015-02,2015-03,2015-04,2015-05,2015-06,2015-07,2015-08,2015-09,2015-10,2015-11,2015-12,2016-01,2016-02,2016-03,2016-04,2016-05,2016-06,201
6-07,2016-08\nNY,New York,6181,New York,Queens,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,432600.0,438700.0,440500.0,433900.0,422000.0,415700.0,421200.0,431100.0,435100.0,431900.0,428400.0,430700.0,438800.0,446800.0,455400.0,465500.0,472600.0,478200.0,487600.0,498600.0,508800.0,515300.0,517000.0,517800.0,520800.0,521500.0,523000.0,526300.0,524800.0,519100.0,516200.0,516400.0,516300.0,515500.0,512200.0,509200.0,509800.0,511600.0,512700.0,514000.0,513400.0,510700.0,508100.0,506700.0,505200.0,503700.0,502900.0,502400.0,500500.0,496400.0,491900.0,487500.0,484400.0,481700.0,477900.0,473600.0,469700.0,466100.0,461700.0,457700.0,455300.0,454800.0,456000.0,457800.0,461300.0,466100.0,470200.0,472800.0,475300.0,477100.0,478400.0,479100.0,478900.0,477700.0,476700.0,477100.0,478000.0,478000.0,476800.0,475300.0,473800.0,472000.0,470600.0,469900.0,469500.0,468200.0,465800.0,463500.0,461800.0,460100.0,459700.0,460800.0,461700.0,462500.0,463900.0,466000.0,467500.0,468200.0,468700.0,469400.0,469400,469100.0,468700,469300,470300,472100,474300,477600,481400,485100,488800,492600,495900,499500,503500,506400,509900,515700,520800,522200,522400,523800,526200,528400,529600,530800,532200,533800,536200,540600,545600,551400,557200,563000,568700,573600,576200,578400,582200,588000,592200,592500,590200,588000,586400\nCA,Los Angeles,12447,Los Angeles-Long Beach-Anaheim,Los 
Angeles,2,155000.0,154600.0,154400.0,154200.0,154100.0,154300.0,154300.0,154200.0,154800.0,155900.0,157000.0,157700.0,158200.0,158600.0,158800.0,158900.0,159100.0,159800.0,160700.0,161900.0,163400.0,165400.0,167000.0,168500.0,169900.0,171400.0,172900.0,174300.0,175800.0,177800.0,180100.0,182600.0,184400.0,185600.0,186900.0,188200.0,189600.0,191300.0,193100.0,194700.0,196300.0,197700.0,199100.0,200700.0,202300.0,204400.0,207000.0,209800.0,212300.0,214500.0,216600.0,219000.0,221100.0,222800.0,224300.0,226100.0,228100.0,230600.0,233000.0,235400.0,237300.0,239100.0,240900.0,242900.0,245000.0,247300.0,250100.0,253100.0,255900.0,258800.0,261900.0,265200.0,268600.0,272600.0,276900.0,281800.0,287000.0,292200.0,297000.0,302100.0,307600.0,313400.0,319000.0,324300.0,329600.0,334600.0,339300.0,344500.0,350600.0,356800.0,363400.0,370700.0,378400.0,386500.0,394900.0,404300.0,414600.0,425500.0,436600.0,447400.0,456700.0,464400.0,471200.0,477400.0,483500.0,489100.0,494700.0,501400.0,509700.0,518300.0,527200.0,536100.0,545400.0,555200.0,564500.0,571900.0,576800.0,579700.0,581800.0,583800.0,585300.0,587300.0,589900.0,592200.0,593300.0,593400.0,593100.0,592900.0,591600.0,590900.0,591800.0,592600.0,592100.0,590200.0,586200.0,581600.0,577500.0,572800.0,567600.0,562100.0,554400.0,545000.0,535500.0,525400.0,513600.0,502000.0,491200.0,480200.0,469000.0,459300.0,451200.0,443900.0,436800.0,430900.0,426100.0,421800.0,417800.0,413700.0,410200.0,407900.0,406300.0,404900.0,404200.0,402900.0,405900.0,412000.0,415000.0,413100.0,412100.0,411300.0,410100.0,408400.0,406800.0,405100.0,403300.0,401900.0,401000.0,399200.0,397100.0,395000.0,392700.0,390200.0,387400.0,384700.0,382100.0,379500.0,377200.0,375700.0,373800.0,371500.0,370000.0,370300.0,372100.0,375300.0,378600.0,382100.0,385600.0,389000.0,391800.0,396400.0,401500,405700.0,410700,418200,425500,432700,440400,448100,455200,461900,467800,472300,475700,479400,484000,489400,494200,498100,501800,505600,509000,512600,516000,518900,521700,525100,528900
,532400,535300,538200,541000,544000,547200,550600,554200,558200,560800,562800,565600,569700,574000,577800,580600,583000,585100\nIL,Chicago,17426,Chicago,Cook,3,109700.0,109400.0,109300.0,109300.0,109100.0,109000.0,109000.0,109600.0,110200.0,110800.0,111300.0,111700.0,112200.0,112300.0,112100.0,112200.0,113000.0,113700.0,114200.0,114800.0,115500.0,116200.0,117100.0,117600.0,117800.0,118300.0,119200.0,120000.0,120600.0,121500.0,122300.0,122700.0,122900.0,123300.0,123700.0,124500.0,125700.0,127300.0,128800.0,130200.0,131400.0,132600.0,133700.0,134600.0,135500.0,136800.0,138300.0,140100.0,141900.0,143700.0,145300.0,146700.0,147900.0,149000.0,150400.0,152000.0,154000.0,155600.0,157000.0,158200.0,159900.0,161800.0,163700.0,165300.0,166400.0,167500.0,168800.0,170400.0,172100.0,173900.0,175600.0,177000.0,177800.0,177600.0,177300.0,177700.0,178800.0,180400.0,182300.0,183800.0,185000.0,185600.0,186800.0,188900.0,191300.0,194100.0,197500.0,200200.0,202300.0,203700.0,204000.0,204000.0,204400.0,205300.0,206300.0,207000.0,207600.0,208600.0,209600.0,210900.0,212800.0,214600.0,216400.0,218300.0,220300.0,222300.0,224000.0,225400.0,226900.0,228600.0,230100.0,231800.0,233200.0,234500.0,236000.0,237500.0,239000.0,240800.0,242500.0,243900.0,244900.0,245300.0,245400.0,245800.0,245800.0,245500.0,245900.0,246900.0,247300.0,247400.0,247300.0,247000.0,246700.0,246400.0,246100.0,246100.0,246300.0,246400.0,246700.0,247100.0,246700.0,245300.0,243900.0,242000.0,239800.0,237900.0,236000.0,233500.0,231800.0,230700.0,229200.0,226700.0,225200.0,224500.0,223800.0,223000.0,221900.0,219700.0,217500.0,215600.0,213800.0,212900.0,212300.0,211900.0,210800.0,209300.0,207300.0,205300.0,204200.0,204100.0,203100.0,201100.0,199000.0,196700.0,193800.0,191100.0,189200.0,188100.0,187600.0,186500.0,184400.0,181700.0,178700.0,175900.0,174100.0,172800.0,171400.0,170100.0,169100.0,167900.0,166700.0,166200.0,166400.0,166800.0,167900.0,168900.0,168400.0,167100.0,166900.0,167300.0,167500,167700.0,168300,169100,170400,172
400,175100,178200,181000,183200,184600,185800,187200,189100,191100,192500,192600,192400,192900,193900,195600,197800,200100,201700,202000,201200,200500,201500,204000,206500,207600,207700,208100,209100,209000,207800,206900,206200,205800,206200,207300,208200,209100,211000,213000\nPA,Philadelphia,13271,Philadelphia,Philadelphia,4,50000.0,49900.0,49600.0,49400.0,49400.0,49300.0,49300.0,49400.0,49700.0,49600.0,49500.0,49700.0,49800.0,49700.0,49700.0,49800.0,49700.0,49700.0,49800.0,49900.0,49900.0,50000.0,50300.0,50600.0,50800.0,50800.0,50800.0,50800.0,50700.0,50500.0,50500.0,50700.0,50700.0,50800.0,50900.0,51100.0,51200.0,51400.0,51500.0,51400.0,51500.0,51800.0,52100.0,52100.0,52300.0,52700.0,53100.0,53200.0,53400.0,53700.0,53800.0,53800.0,54100.0,54500.0,54700.0,54600.0,54800.0,55100.0,55400.0,55500.0,55400.0,55500.0,55700.0,55900.0,56300.0,56600.0,57000.0,57500.0,58100.0,58600.0,59100.0,59700.0,60300.0,60700.0,61200.0,61800.0,62200.0,62500.0,63000.0,63600.0,63900.0,64200.0,64700.0,65300.0,65700.0,66100.0,66800.0,67700.0,68500.0,69200.0,69800.0,70700.0,71700.0,72800.0,73700.0,74700.0,75700.0,76700.0,77800.0,79100.0,80500.0,82100.0,84000.0,85600.0,87000.0,88200.0,89600.0,91300.0,93000.0,94900.0,96700.0,98400.0,100200.0,101900.0,103400.0,104900.0,106400.0,107500.0,108200.0,109300.0,110800.0,112500.0,113800.0,114800.0,115600.0,116000.0,116400.0,116700.0,116800.0,116900.0,117300.0,117800.0,118200.0,118600.0,119300.0,120200.0,120900.0,121400.0,121300.0,120900.0,120200.0,119600.0,119600.0,119500.0,118800.0,118100.0,117500.0,117100.0,117000.0,116700.0,116300.0,115800.0,115500.0,115900.0,116300.0,116400.0,116400.0,116100.0,116000.0,116200.0,116700.0,117300.0,118000.0,118200.0,119500.0,120900.0,121300.0,121300.0,122100.0,123000.0,123300.0,122300.0,120000.0,118200.0,117600.0,117900.0,117800.0,117400.0,117000.0,116900.0,116700.0,116500.0,115700.0,115300.0,115500.0,115600.0,115200.0,114800.0,114100.0,113500.0,112900.0,111800.0,110800.0,110400.0,110400.0,110200.0,109900.0,109700.0,11
0000.0,110700.0,111800,112100.0,111900,112000,112200,111800,111200,111000,110900,111100,111800,112700,112900,113100,113900,114200,113600,113500,114100,114900,115500,115500,115400,115600,116000,116100,116100,116400,117000,117900,119000,120100,121300,122300,122700,122300,121600,121800,123300,125200,126400,127000,127400,128300,129100\nAZ,Phoenix,40326,Phoenix,Maricopa,5,87200.0,87700.0,88200.0,88400.0,88500.0,88900.0,89400.0,89700.0,90100.0,90700.0,91400.0,91700.0,91800.0,92000.0,92300.0,92600.0,93000.0,93400.0,94000.0,94600.0,95300.0,96100.0,96800.0,97300.0,97700.0,98400.0,99200.0,100100.0,100500.0,100700.0,100900.0,101700.0,102600.0,103400.0,103900.0,104400.0,105100.0,105900.0,106200.0,106600.0,107400.0,108300.0,109000.0,109700.0,110400.0,111000.0,111700.0,112800.0,113700.0,114300.0,115100.0,115600.0,115900.0,116500.0,117200.0,117400.0,117600.0,118400.0,119700.0,120700.0,121200.0,121500.0,122000.0,122400.0,122700.0,123000.0,123600.0,124300.0,125000.0,125800.0,126600.0,127200.0,127900.0,128400.0,128800.0,129500.0,130500.0,131600.0,132500.0,133200.0,134000.0,134900.0,135700.0,136500.0,137200.0,138000.0,138600.0,138900.0,139200.0,139400.0,139600.0,140300.0,141400.0,142500.0,143700.0,144900.0,145900.0,147100.0,148400.0,150300.0,153100.0,156200.0,159400.0,162900.0,166500.0,170000.0,173900.0,178800.0,185000.0,192300.0,200700.0,209400.0,217000.0,223600.0,229800.0,234900.0,238600.0,241300.0,243000.0,244100.0,244800.0,245400.0,245600.0,245600.0,245300.0,244600.0,243800.0,243400.0,243400.0,243600.0,243200.0,242200.0,241300.0,240200.0,238400.0,236400.0,234700.0,233300.0,231600.0,229100.0,226100.0,222800.0,218800.0,214300.0,209500.0,205200.0,201100.0,197300.0,193700.0,190300.0,186700.0,182800.0,180500.0,179600.0,178000.0,175100.0,172100.0,168400.0,164200.0,160000.0,156000.0,151800.0,147600.0,143900.0,138900.0,133400.0,130200.0,129200.0,127700.0,126200.0,124800.0,123100.0,120700.0,118500.0,117000.0,115800.0,114800.0,114100.0,113200.0,111800.0,110100.0,108000.0,105900.0,104100.0,1
02900.0,102300.0,102400.0,103000.0,104100.0,105800.0,107600.0,109100.0,111200.0,114000.0,117200.0,120400.0,123300.0,125800.0,128300.0,130500.0,132500,134400.0,136200,138400,141600,144700,147400,150500,153600,156100,158100,160000,161600,162700,163300,163700,164100,164200,164500,164700,165200,166200,167200,168400,169900,171000,171500,172100,172900,174100,175500,177100,179100,181000,182400,183800,185300,186600,188000,189100,190200,191300,192800,194500,195900\n'

I first dropped the non-date columns from the df: quarter = df.drop(['RegionID','Metro','CountyName','SizeRank'],axis=1). Then I converted the remaining column labels to dates: quarter.columns = pd.to_datetime(quarter.columns). Next I wanted to do something like quarter = quarter.groupby(pd.TimeGrouper(freq='3M'),axis=1), but it's not working. After that I would merge the result back with the non-date columns. With this approach I also don't know how to give the new columns the right labels, like [2015Q4,2016Q1,2016Q2,2016Q3,2016Q4].

lucarlig asked Nov 05 '16 00:11


1 Answer

Here is a vectorized solution which uses pd.PeriodIndex and groupby(..., axis=1):

Data:

In [69]: x
Out[69]:
   2016-01  2016-02  2016-03  2016-04  2016-05  2016-06
0        1        0        1        0        0        0
1        2        0        1        0        0        0
2        1        1        2        0        1        0

Solution:

In [70]: x.groupby(pd.PeriodIndex(x.columns, freq='Q'), axis=1).mean()
Out[70]:
     2016Q1    2016Q2
0  0.666667  0.000000
1  1.000000  0.000000
2  1.333333  0.333333
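
A note for newer pandas: as far as I know, groupby(..., axis=1) is deprecated from pandas 2.1 on. A minimal sketch of the same idea without axis=1 is to transpose, group the rows by quarter, and transpose back. The frame x below just rebuilds the toy data shown above; the names quarters and result are only for illustration.

```python
import pandas as pd

# Rebuild the small example frame from the answer.
x = pd.DataFrame(
    [[1, 0, 1, 0, 0, 0],
     [2, 0, 1, 0, 0, 0],
     [1, 1, 2, 0, 1, 0]],
    columns=["2016-01", "2016-02", "2016-03", "2016-04", "2016-05", "2016-06"],
)

# Map every monthly label to its quarter (2016-01 -> 2016Q1, ...).
quarters = pd.PeriodIndex(x.columns, freq="Q")

# Transpose so the months become rows, group those rows by quarter,
# take the mean, and transpose back to the original orientation.
result = x.T.groupby(quarters).mean().T

# Optional: turn the Period labels into plain strings such as "2016Q1".
result.columns = result.columns.astype(str)

print(result)
```

This should give the same two quarterly columns as the axis=1 version; the transposes cost an extra copy, which is usually acceptable for a frame of this size.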

Explanation:

In [71]: pd.PeriodIndex(x.columns, freq='Q')
Out[71]: PeriodIndex(['2016Q1', '2016Q1', '2016Q1', '2016Q2', '2016Q2', '2016Q2'], dtype='period[Q-DEC]', freq='Q-DEC')
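
To tie this back to the frame in the question, here is a rough sketch that keeps the non-date columns instead of dropping and merging them: move them into the index first, so that only the monthly columns are left to group. The column names come from the question; the numbers are made-up toy values, and id_cols, months, quarterly and out are names chosen here for illustration. It uses the transpose form so it does not rely on axis=1.

```python
import pandas as pd

# Toy frame shaped like the one in the question (values are invented).
df = pd.DataFrame({
    "RegionID": [6181, 12447],
    "Metro": ["New York", "Los Angeles-Long Beach-Anaheim"],
    "CountyName": ["Queens", "Los Angeles"],
    "SizeRank": [1, 2],
    "2016-01": [100.0, 10.0],
    "2016-02": [200.0, 20.0],
    "2016-03": [300.0, 30.0],
    "2016-04": [400.0, 40.0],
    "2016-05": [500.0, 50.0],
    "2016-06": [600.0, 60.0],
})

# Park the non-date columns in the index so only month columns remain.
id_cols = ["RegionID", "Metro", "CountyName", "SizeRank"]
months = df.set_index(id_cols)

# One quarter label per monthly column.
quarters = pd.PeriodIndex(months.columns, freq="Q")

# Average the months within each quarter (rows after transposing).
quarterly = months.T.groupby(quarters).mean().T

# Plain string labels ("2016Q1", ...) and the id columns back as columns.
quarterly.columns = quarterly.columns.astype(str)
out = quarterly.reset_index()

print(out)
```

The mean skips missing values by default, so months that are empty in the real data (as in the New York row) do not turn a whole quarter into NaN unless every month in that quarter is missing.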

MaxU - stop WAR against UA answered Sep 22 '22 11:09