Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pivot Table-like Output in R?

I am writing a report that requires the generation of a number of pivot tables in Excel. I would like to think there is a way to do this in R so that I can avoid Excel. I would like output like the screenshot below (teacher names redacted). As far as I can tell, I could use the reshape package to calculate the aggregate values, but I'd need to do that a number of times and somehow get all of the data in the correct order. At that point, I should just be doing it in Excel. Does anyone have any suggestions or package recommendations? Thank you!

(EDIT) The data starts as a list of students, their teacher, school, and growth. This data is then aggregated to get a list of teachers with their average class growth. Please note the teachers are then grouped by school. The largest problem I foresee doing this with R as of now is how do you get the subtotal and total rows (BSA1 Total, Grand Total, etc) in there since they are not the same type of observation as the others? Do you just manually have to calculate them and try to get them in the correct order so they appear at the bottom of that group?

example
(source: imgh.us)

like image 945
Jeff Erickson Avatar asked Jul 12 '11 15:07

Jeff Erickson


People also ask

Can you make pivot tables in R?

Pivot tables are constructed natively in R, either via a short one line command to build a basic pivot table or via series of R commands that gradually build a more bespoke pivot table to meet your needs.

How do I export a pivot table in R?

Basic Export (no styling) Exporting a pivot table to an Excel table is reasonably straightforward: Create a pivot table in R using pivottabler , Using the openxlsx package, create a new Excel file and add a worksheet (or open an existing worksheet), Call the writeToExcelWorksheet method on the pivot table.

Can I convert existing pivot table to data model?

Add existing, unrelated data to a Data Model It can be any range of data, but data formatted as an Excel table is best. Use one of these approaches to add your data: Click Power Pivot > Add to Data Model. Click Insert > PivotTable, and then check Add this data to the Data Model in the Create PivotTable dialog box.

Can you use pivot table as data source?

Pivot tables are traditionally made based on a range of data (Range) or a structured table (Table). If we first create a pivot table based on the master data table it can serve as a data source.


1 Answers

Here's a swag at the calculation bits:

set.seed(1)
school  <- sample(c("BSA1", "BSA2", "HSA1"), 100, replace=T)
teacher <- sample(c("Tom", "Dick", "Harry"), 100, replace=T)
growth <- rnorm(100, 5, 3)

myDf <- data.frame(school, teacher, growth)

require(reshape2)

aggregate(growth ~ school + teacher, data =myDf, FUN=mean)

myDf.melt <- melt(myDf, measured="growth")
dcast(myDf.melt, school + teacher ~ ., fun.aggregate=mean, margins=c("school", "teacher"))

I've not addressed output formatting, only calculation. The resulting data frame should look like this:

   school teacher       NA
1    BSA1    Dick 4.663140
2    BSA1   Harry 4.310802
3    BSA1     Tom 5.505247
4    BSA1   (all) 4.670451
5    BSA2    Dick 6.110988
6    BSA2   Harry 5.007221
7    BSA2     Tom 4.337063
8    BSA2   (all) 5.196018
9    HSA1    Dick 4.508610
10   HSA1   Harry 4.890741
11   HSA1     Tom 4.721124
12   HSA1   (all) 4.717335
13  (all)   (all) 4.886576

That example uses the reshape2 package to handle the subtotals.

I think R is the right tool for the job here. I can totally understand not being sure how to get started on this analysis. I came to R from Excel a few years ago and it can be tough to grok at first. Let me point out four pro tips to help you get better answers in Stack Overflow:

1) provide data, even if simulated: you can see I simulated some data at the beginning of my answer. If you had provided that simulation it would have a) saved me time b) gotten you an answer that used your own data structure, not one I dreamed up and c) other people would have answered. I often skip questions with no data because I've grown tired of guessing about the data them being told my answer sucked because I guessed wrong.

2) Ask one clear question. "How do I do my work" is not a single clear question. "How do I take this example data and create subtotals in the aggregation like this example output" is a single specific question.

3) keep asking! We all get better with practice. You're trying to do more in R and less in Excel so you're clearly of above average intelligence. Keep using R and keep asking questions. It will all get easier in time.

4) Be careful with your words when you describe things. You say in your edited question you have a "list" of things. A list in R is a specific data structure. I'm suspicious you actually have a data frame and are using the term "list" in a generic sense. This can make for some confusion. It also illustrates why you want to provide your own data.

like image 58
JD Long Avatar answered Oct 13 '22 01:10

JD Long