Inserting missing years to complete a data.frame

Tags:

dataframe

r

I'm creating a dataframe containing the number of incidents of a certain kind in each state in each year from 2000 to 2010 (pretend that they are gun incidents):

states <- c('Texas', 'Texas', 'Arizona', 'California', 'California')
incidents <- c(1, 1, 2, 1, 4)
years <- c(2000, 2008, 2004, 2002, 2007)

DF <- data.frame(states, incidents, years)

> DF
      states incidents years
1      Texas         1  2000
2      Texas         1  2008
3    Arizona         2  2004
4 California         1  2002
5 California         4  2007

I want to insert rows to complete the dataset, e.g. zeros for Texas for 2001, 2002, 2003, ... 2007, and for 2009 and 2010. And likewise, zeros for Arizona for all years except 2004. Same thing for California.

How can I do this?


asked Mar 22 '18 15:03

wwl

1 Answer

You can use tidyr::complete to add the missing years (2000:2010) for every state and fill the missing incidents values with 0.

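library(tidyr)  # also provides the %>% pipe
DFfilled <- DF %>%
    # one row per state for every year 2000 to 2010; new rows get incidents = 0
    complete(states, years = 2000:2010,
             fill = list(incidents = 0)) %>%
    as.data.frame()  # convert the resulting tibble back to a data.frame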
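
For readers who cannot install tidyr, roughly the same result can be sketched in base R with expand.grid and merge. This is an alternative approach, not the method used in this answer; the names grid and DFbase are illustrative and do not come from the original post.

```r
# Sample data from the question
DF <- data.frame(
  states    = c("Texas", "Texas", "Arizona", "California", "California"),
  incidents = c(1, 1, 2, 1, 4),
  years     = c(2000, 2008, 2004, 2002, 2007)
)

# Every state/year combination that should exist (3 states x 11 years)
grid <- expand.grid(states = unique(DF$states), years = 2000:2010,
                    stringsAsFactors = FALSE)

# Left join: combinations that are absent from DF get NA in incidents
DFbase <- merge(grid, DF, by = c("states", "years"), all.x = TRUE)

# Replace the NAs introduced by the join with 0
DFbase$incidents[is.na(DFbase$incidents)] <- 0
```

The column order and row order may differ from the tidyr result, but each state should end up with one row per year from 2000 to 2010.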

PS:
full_seq(years, 1) can replace 2000:2010, but it only spans the smallest to the largest year present in the data. The sample data stops at 2008, so this works only if your real data also contains entries for 2010.
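
A small check of that caveat, assuming tidyr is installed and using only the years from the question's sample data:

```r
library(tidyr)

# Years present in the question's sample data
years <- c(2000, 2008, 2004, 2002, 2007)

# full_seq() builds a sequence from min(years) to max(years) in steps of 1
filled_years <- full_seq(years, 1)

# The sequence ends at 2008, so 2009 and 2010 would not be added
range(filled_years)
```

With this sample the explicit 2000:2010 is the safer choice, since it does not depend on which years happen to appear in the data.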

answered Oct 07 '22 02:10

pogibas
