Transpose / reshape dataframe without "timevar" from long to wide format

I have a data frame in the following long format:

     Name          MedName
    Name1    atenolol 25mg
    Name1     aspirin 81mg
    Name1 sildenafil 100mg
    Name2    atenolol 50mg
    Name2   enalapril 20mg

I would like to get the following (I do not care whether the columns are named this way; I just want the data in this format):

     Name   medication1    medication2      medication3
    Name1 atenolol 25mg   aspirin 81mg sildenafil 100mg
    Name2 atenolol 50mg enalapril 20mg               NA

Through this very site I have become somewhat familiar with the reshape/reshape2 packages and have made several attempts to get this to work, but have so far failed.

When I try dcast(dataframe, Name ~ MedName, value.var='MedName') I just get a bunch of columns that are flags of the medication names (values that get transposed are 1 or 0) example:

     Name  atenolol 25mg  aspirin 81mg
    Name1              1             1
    Name2              0             0

I also tried a dcast(dataset, Name ~ variable) after I melted the dataset, however this just spits out the following (just counts how many meds each person has):

     Name  MedName
    Name1        3
    name2        2

Finally, I tried to melt the data and then reshape it using idvar="Name" and timevar="variable" (whose values are all just "MedName"). However, this does not seem built for my problem: when there are multiple matches for the idvar, reshape just takes the first MedName and ignores the rest.

Does anyone know how to do this using reshape or another R function? I realize there is probably a messier way to do this with some for loops and conditionals that split and re-paste the data, but I was hoping for a simpler solution. Thank you so much!

asked Jul 04 '12 05:07

Hotamd6

2 Answers

With the data.table package, this could easily be solved with the new rowid function:

    library(data.table)
    dcast(setDT(d1),
          Name ~ rowid(Name, prefix = "medication"),
          value.var = "MedName")

which gives:

       Name    medication1     medication2       medication3
    1 Name1  atenolol 25mg    aspirin 81mg  sildenafil 100mg
    2 Name2  atenolol 50mg  enalapril 20mg              <NA>

Another method (commonly used before version 1.9.7):

    dcast(setDT(d1)[, rn := 1:.N, by = Name],
          Name ~ paste0("medication", rn),
          value.var = "MedName")

giving the same result.
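The two data.table variants above can be compared on the question's sample data. This is a minimal sketch, assuming a data.table version that provides rowid() and that the data frame is called d1 as in the answer. The object names wide_rowid, wide_rn and dt are only illustrative.

```r
library(data.table)

# Sample data reconstructed from the question; the name d1 follows the answer
d1 <- data.frame(
  Name    = c("Name1", "Name1", "Name1", "Name2", "Name2"),
  MedName = c("atenolol 25mg", "aspirin 81mg", "sildenafil 100mg",
              "atenolol 50mg", "enalapril 20mg"),
  stringsAsFactors = FALSE
)

# Variant 1: rowid() numbers the rows within each Name inside the formula
wide_rowid <- dcast(as.data.table(d1),
                    Name ~ rowid(Name, prefix = "medication"),
                    value.var = "MedName")

# Variant 2: create the within-group counter explicitly, then cast on it
dt <- as.data.table(d1)
dt[, rn := seq_len(.N), by = Name]
wide_rn <- dcast(dt,
                 Name ~ paste0("medication", rn),
                 value.var = "MedName")

print(wide_rowid)
print(wide_rn)
```

as.data.table() is used here instead of setDT() so that d1 itself is left unmodified. setDT() converts the data frame by reference, which is faster on large data.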

A similar approach, but now using the dplyr and tidyr packages:

    library(dplyr)
    library(tidyr)
    d1 %>%
      group_by(Name) %>%
      mutate(rn = paste0("medication", row_number())) %>%
      spread(rn, MedName)

which gives:

    Source: local data frame [2 x 4]
    Groups: Name [2]

        Name   medication1    medication2      medication3
      (fctr)         (chr)          (chr)            (chr)
    1  Name1 atenolol 25mg   aspirin 81mg sildenafil 100mg
    2  Name2 atenolol 50mg enalapril 20mg               NA
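In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The same pipeline could be sketched as follows, assuming the sample data from the question is stored in d1. The object name wide is illustrative, and the ungroup() call is an addition so that the result is an ungrouped tibble.

```r
library(dplyr)
library(tidyr)

# Sample data reconstructed from the question; the name d1 follows the answer
d1 <- data.frame(
  Name    = c("Name1", "Name1", "Name1", "Name2", "Name2"),
  MedName = c("atenolol 25mg", "aspirin 81mg", "sildenafil 100mg",
              "atenolol 50mg", "enalapril 20mg"),
  stringsAsFactors = FALSE
)

wide <- d1 %>%
  group_by(Name) %>%
  # Number the medications within each Name to get the future column names
  mutate(rn = paste0("medication", row_number())) %>%
  ungroup() %>%
  # One column per value of rn, filled with the matching MedName
  pivot_wider(names_from = rn, values_from = MedName)

print(wide)
```

Names with fewer medications than the maximum get NA in the trailing columns, as in the spread() output above.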

answered Sep 20 '22 07:09

Jaap

Assuming your data is in the object dataset:

    library(plyr)
    library(reshape2)  # provides dcast()

    ## Add a medication index within each Name
    data_with_index <- ddply(dataset, .(Name), mutate,
                             index = paste0('medication', 1:length(Name)))
    dcast(data_with_index, Name ~ index, value.var = 'MedName')

    ##    Name   medication1    medication2      medication3
    ## 1 Name1 atenolol 25mg   aspirin 81mg sildenafil 100mg
    ## 2 Name2 atenolol 50mg enalapril 20mg             <NA>
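The same index trick can also be sketched in base R without extra packages: number the rows within each Name with ave(), then pass that counter to reshape() as the timevar. This is a minimal sketch on the question's sample data. The column name index and the object name wide are illustrative, and renaming the columns to medication1, medication2, ... is optional.

```r
# Sample data reconstructed from the question; the name dataset follows the answer
dataset <- data.frame(
  Name    = c("Name1", "Name1", "Name1", "Name2", "Name2"),
  MedName = c("atenolol 25mg", "aspirin 81mg", "sildenafil 100mg",
              "atenolol 50mg", "enalapril 20mg"),
  stringsAsFactors = FALSE
)

# Within-group counter: 1, 2, 3 for Name1 and 1, 2 for Name2.
# This supplies the "timevar" that the long data lacks.
dataset$index <- ave(seq_along(dataset$Name), dataset$Name, FUN = seq_along)

# Cast to wide; with sep = "" the new columns are MedName1, MedName2, MedName3
wide <- reshape(dataset, idvar = "Name", timevar = "index",
                direction = "wide", sep = "")

# Cosmetic: rename the columns and reset the row names
names(wide) <- sub("^MedName", "medication", names(wide))
rownames(wide) <- NULL

print(wide)
```

Because every Name now has a distinct index value per row, reshape() no longer drops the repeated matches for the idvar.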

answered Sep 24 '22 07:09

mnel

Tags: r, r-faq, reshape, transpose
