
How to complete missing factor levels in data frame?

Tags: dataframe, r, tidyr

Let's say I have a data frame like this:

df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT  = c("Apple", "Peach", "Apple", "Apple", "Peach"),
  A      = c(100, 200, 100, 200, 300),
  B      = c(1, 2, 3, 4, 5)
)
df$PERSON <- as.factor(df$PERSON)
# "Coconut" is declared as a level even though no row uses it
df$FRUIT <- factor(df$FRUIT, levels = c("Apple", "Peach", "Coconut"))

Which results in:

> str(df)
'data.frame':   5 obs. of  4 variables:
 $ PERSON: Factor w/ 3 levels "Lisa","Marcel",..: 3 3 2 1 1
 $ FRUIT : Factor w/ 3 levels "Apple","Peach",..: 1 2 1 1 2
 $ A     : num  100 200 100 200 300
 $ B     : num  1 2 3 4 5

I want to expand this data frame so that every level of FRUIT is present for every PERSON, like this:

  PERSON   FRUIT   A B
1  Peter   Apple 100 1
2  Peter   Peach 200 2
3  Peter Coconut   0 0
4 Marcel   Apple 100 3
5 Marcel   Peach   0 0
6 Marcel Coconut   0 0
7   Lisa   Apple 200 4
8   Lisa   Peach 300 5
9   Lisa Coconut   0 0

Missing values for A and B should be filled with 0.

I tried tidyr::complete(df$FRUIT, 0), but it seems I used the function incorrectly.


asked Oct 10 '16 15:10

barracuda317


1 Answer

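complete() takes the data frame as its first argument, followed by the columns to expand, so pass df itself rather than the single column df$FRUIT. By default, the newly created rows are filled with NA, but we can change that to 0 by passing a named list to the fill argument.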

library(tidyr)
complete(df, PERSON, FRUIT, fill = list(A = 0, B = 0))
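
For reference, here is a self-contained sketch of the same call on the question's data. It assumes the typo in the question (df$Person instead of df$PERSON) is fixed. One step is my own assumption and not part of the answer above: the PERSON levels are set in order of first appearance, because complete() orders its output by factor level and the default alphabetical levels would put Lisa first. Unused factor levels such as Coconut are included in the expansion.

```r
library(tidyr)

# Rebuild the example data from the question (PERSON typo fixed).
df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT  = c("Apple", "Peach", "Apple", "Apple", "Peach"),
  A      = c(100, 200, 100, 200, 300),
  B      = c(1, 2, 3, 4, 5)
)

# Assumption (not in the original answer): keep PERSON in order of first
# appearance so the rows come out as Peter, Marcel, Lisa.
df$PERSON <- factor(df$PERSON, levels = unique(df$PERSON))
df$FRUIT  <- factor(df$FRUIT, levels = c("Apple", "Peach", "Coconut"))

# Build every PERSON x FRUIT combination, including the unused level
# "Coconut"; newly created rows get 0 in A and B instead of NA.
out <- complete(df, PERSON, FRUIT, fill = list(A = 0, B = 0))
print(out)
```

The result is a tibble with 3 x 3 = 9 rows. Columns not named in fill would keep NA in the added rows.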


answered Oct 16 '22 13:10

akrun
