Replace all NA values for variable with one row equal to 0

Tags: r, na, dplyr

This is slightly difficult to phrase, and as far as I could see none of the similar questions answered my problem.

I have a data.frame such as:

df1 <- data.frame(id = rep(c("a", "b", "c"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3, NA, 2, NA, 3))

df1

   id val
1   a  NA
2   a  NA
3   a  NA
4   a  NA
5   b   1
6   b   2
7   b   2
8   b   3
9   c  NA
10  c   2
11  c  NA
12  c   3

and I want to get rid of all the NA values (easy enough using e.g. filter()), but make sure that if this removes every row for one id (in this case it removes every instance of "a"), one extra row is inserted for that id with val = 0

so that:

  id val
1  a   0
2  b   1
3  b   2
4  b   2
5  b   3
6  c   2
7  c   3

Obviously it is easy enough to do this in a roundabout way, but I was wondering if there's a tidy/elegant way to do it. I thought tidyr::complete() might help, but I'm not entirely sure how to apply it to a case like this.

I don't care about the order of the rows.

Cheers!

Edit: updated with clearer desired output. This might make answers submitted before that a bit less clear.

389

asked Jan 03 '19 12:01

Robert Hickman

1 Answer

Another idea using dplyr,

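library(dplyr)

df1 %>% 
 group_by(id) %>% 
 # if every val in the group is NA, set the group's first row to 0
 mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>% 
 # then drop all remaining NA rows
 na.omit()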

which gives, for the df1 above (id shows as <chr> rather than <fct> on R 4.0 and later),

# A tibble: 7 x 2
# Groups:   id [3]
  id      val
  <fct> <dbl>
1 a         0
2 b         1
3 b         2
4 b         2
5 b         3
6 c         2
7 c         3
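
For the tidyr::complete() route mentioned in the question, here is a minimal sketch rather than a definitive solution. It assumes id is first converted to a factor, so that its levels still include "a" after the NA rows are filtered out; complete() then expands to every level and fills the new row's val with 0. The name res is only for illustration.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(id = rep(c("a", "b", "c"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3, NA, 2, NA, 3))

res <- df1 %>%
  # Assumption: make id a factor so its levels survive the filter
  # (on R >= 4.0 data.frame() keeps strings as character by default)
  mutate(id = factor(id)) %>%
  # Drop the NA rows; this removes every row of id "a"
  filter(!is.na(val)) %>%
  # Expand to all factor levels of id; rows created here get val = 0
  complete(id, fill = list(val = 0))

res
```

On the example data this should return a single row for a with val = 0, plus the non-NA rows of b and c. Unlike the grouped mutate() above, the result is not grouped, and it depends on id being a factor: with a character id, complete() has no way to know that "a" ever existed.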

100

answered Oct 02 '22 15:10

Sotos
