Reshaping a data frame --- changing rows to columns

Tags:

reshape

Suppose that we have a data frame that looks like

set.seed(7302012)

county         <- rep(letters[1:4], each=2)
state          <- rep(LETTERS[1], times=8)
industry       <- rep(c("construction", "manufacturing"), 4)
employment     <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)

data <- data.frame(state, county, industry, employment, establishments)

  state county      industry employment establishments
1     A      a  construction        146             19
2     A      a manufacturing        110             20
3     A      b  construction        121             10
4     A      b manufacturing         90             27
5     A      c  construction        197             18
6     A      c manufacturing         73             29
7     A      d  construction         98             30
8     A      d manufacturing        102             19

We'd like to reshape this so that each row represents a (state and) county, rather than a county-industry, with columns construction.employment, construction.establishments, and analogous versions for manufacturing. What is an efficient way to do this?

One way is to subset

construction <- data[data$industry == "construction", ]
names(construction)[4:5] <- c("construction.employment", "construction.establishments")

And similarly for manufacturing, then do a merge. This isn't so bad if there are only two industries, but imagine that there are 14; this process would become tedious (though made less so by using a for loop over the levels of industry).
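The loop idea mentioned above could be sketched in base R roughly as follows. This is only a sketch under some assumptions: the data frame is typed in from the table printed earlier rather than regenerated from the seed, and the names `id_cols`, `value_cols`, `pieces`, and `wide` are made up for illustration.

```r
# Data copied by hand from the table printed in the question
data <- data.frame(
  state          = "A",
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

id_cols    <- c("state", "county")              # hypothetical helper names
value_cols <- c("employment", "establishments")

# Build one data frame per industry, prefixing the value columns
# with the industry name (e.g. "construction.employment")
pieces <- lapply(as.character(unique(data$industry)), function(ind) {
  piece <- data[data$industry == ind, c(id_cols, value_cols)]
  names(piece)[match(value_cols, names(piece))] <- paste(ind, value_cols, sep = ".")
  piece
})

# Merge the per-industry pieces one after another on the id columns
wide <- Reduce(function(x, y) merge(x, y, by = id_cols), pieces)
```

This works for any number of industries, but it still does by hand what a dedicated reshaping function does in one call.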

Any other ideas?


asked Jul 30 '12 16:07

Charlie

2 Answers

This can be done with base R's reshape(), if I understand your question correctly:

reshape(data, direction="wide", idvar=c("state", "county"), timevar="industry")
#   state county employment.construction establishments.construction
# 1     A      a                     146                          19
# 3     A      b                     121                          10
# 5     A      c                     197                          18
# 7     A      d                      98                          30
#   employment.manufacturing establishments.manufacturing
# 1                      110                           20
# 3                       90                           27
# 5                       73                           29
# 7                      102                           19
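Note that the columns come out named `employment.construction` rather than `construction.employment` as asked in the question. If the industry-first order matters, one possible fix is to swap the two parts of each name with a regular expression. This is a sketch only: the data frame is typed in from the table in the question, and `wide` is a name chosen here for illustration.

```r
# Data copied by hand from the table printed in the question
data <- data.frame(
  state          = "A",
  county         = rep(letters[1:4], each = 2),
  industry       = rep(c("construction", "manufacturing"), 4),
  employment     = c(146, 110, 121, 90, 197, 73, 98, 102),
  establishments = c(19, 20, 10, 27, 18, 29, 30, 19)
)

wide <- reshape(data, direction = "wide",
                idvar = c("state", "county"), timevar = "industry")

# Turn "<measure>.<industry>" into "<industry>.<measure>";
# names that do not match (state, county) are left unchanged
names(wide) <- sub("^(employment|establishments)\\.(.+)$", "\\2.\\1", names(wide))

# Optional: drop the leftover row names 1, 3, 5, 7
rownames(wide) <- NULL
```

The pattern lists the two measure names explicitly, so it would need adjusting for a data frame with other value columns.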


answered Nov 15 '22 07:11

A5C1D2H2I1M1N2O1R2T1

Also using the reshape package:

library(reshape)
m <- reshape::melt(data)
cast(m, state + county ~ ...)

Yielding:

> cast(m, state + county~...) 
  state county construction_employment construction_establishments manufacturing_employment manufacturing_establishments
1     A      a                     146                          19                      110                           20
2     A      b                     121                          10                       90                           27
3     A      c                     197                          18                       73                           29
4     A      d                      98                          30                      102                           19

I personally use base reshape(), so I forgot that there is a reshape2 package (Wickham); I probably should have shown this with reshape2 instead. It is only slightly different:

library(reshape2)
m <- reshape2::melt(data)
dcast(m, state + county ~ ...)

answered Nov 15 '22 07:11

Tyler Rinker
