How to use the spread function properly in tidyr

Tags: r, dplyr, tidyr, spread

How do I change the following table from:

Type    Name    Answer     n
TypeA   Apple   Yes        5
TypeA   Apple   No        10
TypeA   Apple   DK         8
TypeA   Apple   NA        20
TypeA   Orange  Yes        6
TypeA   Orange  No        11
TypeA   Orange  DK         8
TypeA   Orange  NA        23

to:

Type    Name    Yes   No   DK   NA  
TypeA   Apple   5     10   8    20
TypeA   Orange  6     11   8    23

I used the following code to get the first table:

df_1 <- 
  df %>% 
  group_by(Type, Name, Answer) %>% 
  tally()

Then I tried to use the spread command to get to the 2nd table, but I got the following error message:

"Error: All columns must be named"

df_2 <- spread(df_1, Answer)

asked Jan 08 '16 19:01

ayk

2 Answers

Following up on the comment from ayk, here is an example. It looks to me like a data_frame whose key column is a factor or character vector containing NA values cannot be spread unless you either remove the NAs or convert the data to another class. This appears to be specific to a data_frame (note the underscore: this is the dplyr class), because the same data spreads fine in my example when it is a plain data.frame. Here is a slightly modified version of the example above.

Here is the data frame:

library(dplyr)
library(tidyr)
df_1 <- data_frame(Type = c("TypeA", "TypeA", "TypeB", "TypeB"),
                   Answer = c("Yes", "No", NA, "No"),
                   n = 1:4)
df_1

This gives a data_frame that looks like this:

Source: local data frame [4 x 3]

   Type Answer     n
  (chr)  (chr) (int)
1 TypeA    Yes     1
2 TypeA     No     2
3 TypeB     NA     3
4 TypeB     No     4

Then, when we try to tidy it, we get an error message:

df_1 %>% spread(key=Answer, value=n)
Error: All columns must be named

But if we remove the NAs, then it 'works':

df_1 %>%
    filter(!is.na(Answer)) %>%
    spread(key=Answer, value=n)
Source: local data frame [2 x 3]

   Type    No   Yes
  (chr) (int) (int)
1 TypeA     2     1
2 TypeB     4    NA

However, removing the NAs may not give you the desired result, i.e. you might want them included in your tidied table. You could modify the data directly to change the NAs to a more descriptive value. Alternatively, you could convert your data to a data.frame, and then it spreads just fine:

as.data.frame(df_1) %>% spread(key=Answer, value=n)
   Type No Yes NA
1 TypeA  2   1 NA
2 TypeB  4  NA  3
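
The other option mentioned above, relabelling the NAs before spreading, could be sketched as follows. This is only an illustration: the replacement label "Missing" and the name df_wide are arbitrary choices of mine, not part of the original data, and a plain data.frame is used so the snippet does not depend on a particular dplyr version.

```r
library(tidyr)

# Same toy data as above, built as a plain data.frame
df_1 <- data.frame(Type = c("TypeA", "TypeA", "TypeB", "TypeB"),
                   Answer = c("Yes", "No", NA, "No"),
                   n = 1:4,
                   stringsAsFactors = FALSE)

# Give the missing answers an explicit label ("Missing" is an arbitrary choice)
df_1$Answer[is.na(df_1$Answer)] <- "Missing"

# Every value of the key column is now a usable column name
df_wide <- spread(df_1, key = Answer, value = n)
df_wide
```

The result should have one row per Type and one column per answer, including a "Missing" column that holds the count that was previously keyed by NA.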

answered Sep 28 '22 08:09

Nicholas G Reich

I think only tidyr is needed to get from df_1 to df_2.

library(magrittr)  # provides the %>% pipe
df_1 <- read.csv(text="Type,Name,Answer,n\nTypeA,Apple,Yes,5\nTypeA,Apple,No,10\nTypeA,Apple,DK,8\nTypeA,Apple,NA,20\nTypeA,Orange,Yes,6\nTypeA,Orange,No,11\nTypeA,Orange,DK,8\nTypeA,Orange,NA,23", stringsAsFactors=FALSE)

df_2 <- df_1 %>% 
  tidyr::spread(key=Answer, value=n)

Output:

   Type   Name DK No Yes NA
1 TypeA  Apple  8 10   5 20
2 TypeA Orange  8 11   6 23
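
As a side note, in tidyr 1.0.0 and later spread() is superseded by pivot_wider(). Below is a minimal sketch of the same reshaping, assuming such a version is installed and assuming that "NA" in the Answer column is meant as a literal answer category rather than a missing value. The na.strings = "" argument and the name df_wide are my additions, not part of the answer above.

```r
library(tidyr)

# na.strings = "" keeps the text "NA" as an ordinary string
# instead of converting it to a missing value
df_1 <- read.csv(text = "Type,Name,Answer,n
TypeA,Apple,Yes,5
TypeA,Apple,No,10
TypeA,Apple,DK,8
TypeA,Apple,NA,20
TypeA,Orange,Yes,6
TypeA,Orange,No,11
TypeA,Orange,DK,8
TypeA,Orange,NA,23", stringsAsFactors = FALSE, na.strings = "")

# names_from supplies the new column names, values_from the cell values
df_wide <- pivot_wider(df_1, names_from = Answer, values_from = n)
df_wide
```

By default pivot_wider() creates the new columns in order of first appearance, so this should give Yes, No, DK, NA, the order requested in the question, rather than the sorted order that spread() produces.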

answered Sep 28 '22 07:09

wibeasley
