Reshaping a data frame --- changing rows to columns

Tags: r, reshape

Suppose that we have a data frame that looks like this:

set.seed(7302012)

county         <- rep(letters[1:4], each=2)
state          <- rep(LETTERS[1], times=8)
industry       <- rep(c("construction", "manufacturing"), 4)
employment     <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)

data <- data.frame(state, county, industry, employment, establishments)

  state county      industry employment establishments
1     A      a  construction        146             19
2     A      a manufacturing        110             20
3     A      b  construction        121             10
4     A      b manufacturing         90             27
5     A      c  construction        197             18
6     A      c manufacturing         73             29
7     A      d  construction         98             30
8     A      d manufacturing        102             19

We'd like to reshape this so that each row represents a (state and) county, rather than a county-industry, with columns construction.employment, construction.establishments, and analogous versions for manufacturing. What is an efficient way to do this?

One way is to subset the data for one industry and rename its value columns:

construction <- data[data$industry == "construction", ]
names(construction)[4:5] <- c("construction.employment", "construction.establishments")

And similarly for manufacturing, then do a merge. This isn't so bad if there are only two industries, but imagine that there are 14; this process would become tedious (though made less so by using a for loop over the levels of industry).
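For reference, the subset-and-merge idea can be written without a hand-coded step per industry. This is only a sketch in base R: the names `id_cols`, `value_cols`, `pieces` and `wide` are made up for illustration, and it assumes each (state, county, industry) combination appears exactly once.

```r
# Rebuild the example data from above.
set.seed(7302012)
county         <- rep(letters[1:4], each = 2)
state          <- rep(LETTERS[1], times = 8)
industry       <- rep(c("construction", "manufacturing"), 4)
employment     <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)
data <- data.frame(state, county, industry, employment, establishments)

id_cols    <- c("state", "county")              # hypothetical helper names
value_cols <- c("employment", "establishments")

# One data frame per industry, with the value columns prefixed by the industry.
pieces <- lapply(unique(as.character(data$industry)), function(ind) {
  piece <- data[data$industry == ind, c(id_cols, value_cols)]
  names(piece)[names(piece) %in% value_cols] <- paste(ind, value_cols, sep = ".")
  piece
})

# Merge the per-industry pieces pairwise on the id columns.
wide <- Reduce(function(x, y) merge(x, y, by = id_cols), pieces)
wide
```

This works for 14 industries as well as for 2, but it still does one merge per industry, which is why a dedicated reshaping function would be nicer.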

Any other ideas?

Charlie, asked Jul 30 '12 16:07

2 Answers

This can be done with base R's reshape(), if I understand your question correctly:

reshape(data, direction="wide", idvar=c("state", "county"), timevar="industry")
#   state county employment.construction establishments.construction
# 1     A      a                     146                          19
# 3     A      b                     121                          10
# 5     A      c                     197                          18
# 7     A      d                      98                          30
#   employment.manufacturing establishments.manufacturing
# 1                      110                           20
# 3                       90                           27
# 5                       73                           29
# 7                      102                           19 
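Note that reshape() names the new columns measure.industry (employment.construction), while the question asks for industry.measure (construction.employment). A possible follow-up step, as a sketch only: it assumes that none of the original column names contains a dot, and the name `is_value` is just illustrative.

```r
# Rebuild the example data from the question.
set.seed(7302012)
county         <- rep(letters[1:4], each = 2)
state          <- rep(LETTERS[1], times = 8)
industry       <- rep(c("construction", "manufacturing"), 4)
employment     <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)
data <- data.frame(state, county, industry, employment, establishments)

wide <- reshape(data, direction = "wide",
                idvar = c("state", "county"), timevar = "industry")

# Swap "<measure>.<industry>" into "<industry>.<measure>" for the value columns.
# Assumption: the original column names contain no "." of their own.
is_value <- !(names(wide) %in% c("state", "county"))
names(wide)[is_value] <- sub("^([^.]+)\\.(.+)$", "\\2.\\1", names(wide)[is_value])

# reshape() keeps the row names of the first row of each group (1, 3, 5, 7);
# resetting them is optional.
rownames(wide) <- NULL
wide
```

The `sep` argument of reshape() changes the separator used in the new names, but not the order of the two parts, so the renaming is done afterwards here.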
A5C1D2H2I1M1N2O1R2T1, answered Nov 15 '22 07:11


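This can also be done with the reshape package:

library(reshape)
# melt() stacks the measure columns (employment, establishments) into
# variable/value pairs; naming the id columns avoids relying on melt()'s guess
m <- reshape::melt(data, id.vars = c("state", "county", "industry"))
# "..." stands for all remaining variables (industry and variable)
cast(m, state + county ~ ...)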

Yielding:

> cast(m, state + county~...) 
  state county construction_employment construction_establishments manufacturing_employment manufacturing_establishments
1     A      a                     146                          19                      110                           20
2     A      b                     121                          10                       90                           27
3     A      c                     197                          18                       73                           29
4     A      d                      98                          30                      102                           19

I personally use base reshape(), so I forgot that there is a newer reshape2 package (Wickham); I probably should have shown that instead. It is slightly different:

library(reshape2) 
m <- reshape2::melt(data) 
dcast(m, state + county~...) 
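
In the formula, `...` stands for every variable that is not named elsewhere, which here means industry and variable. A sketch that spells this out, assuming reshape2 is installed; the id columns and `value.var = "value"` are given explicitly only to avoid the messages about guessed defaults, and `wide` is just an illustrative name.

```r
library(reshape2)

# Rebuild the example data from the question.
set.seed(7302012)
county         <- rep(letters[1:4], each = 2)
state          <- rep(LETTERS[1], times = 8)
industry       <- rep(c("construction", "manufacturing"), 4)
employment     <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)
data <- data.frame(state, county, industry, employment, establishments)

# Long format: one row per state/county/industry/measure.
m <- reshape2::melt(data, id.vars = c("state", "county", "industry"))

# Same shape as `state + county ~ ...`, with the column variables written out.
# Column names are built as <industry>_<variable>.
wide <- dcast(m, state + county ~ industry + variable, value.var = "value")
wide
```

Putting industry before variable on the right-hand side gives names that start with the industry, which is the grouping the question asked for, with "_" as the separator instead of ".".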
Tyler Rinker, answered Nov 15 '22 07:11