 

dplyr::first() to choose first non NA value


I am looking for a way to extract the first and last non-NA value from each group. I am using dplyr::first() and dplyr::last(), but I can't work out how to make them skip NA values.

library(dplyr)
set.seed(123)
d <- data.frame(
  group = rep(1:3, each = 3),
  year = rep(seq(2000,2002,1),3),
  value = sample(1:9, replace = TRUE))

#Introduce NA values in first row of group 2 and last row of group 3
d %>%
  mutate(
    value = case_when(
      group == 2 & year == 2000 ~ NA_integer_,
      group == 3 & year == 2002 ~ NA_integer_,
      TRUE ~ value)) %>%
  group_by(group) %>% 
  mutate(
    first = dplyr::first(value),
    last = dplyr::last(value))

RESULT (with issue)

# A tibble: 9 x 5
# Groups:   group [3]
  group  year value first  last
  <int> <dbl> <int> <int> <int>
1     1  2000     3     3     4
2     1  2001     8     3     4
3     1  2002     4     3     4
4     2  2000    NA    NA     1
5     2  2001     9    NA     1
6     2  2002     1    NA     1
7     3  2000     5     5    NA
8     3  2001     9     5    NA
9     3  2002    NA     5    NA

How can I make the "first" column equal 9 for group 2, and the "last" column equal 9 for group 3?

I would very much prefer a tidyverse solution, if one exists.

Steen Harsted asked Sep 07 '18 10:09





1 Answer

Use na.omit() to drop the NA values before calling first() or last(). Compare:

first(c(NA, 11, 22))
# [1] NA

first(na.omit(c(NA, 11, 22)))
# [1] 11
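
The same idea can also be written with plain logical subsetting instead of na.omit(). This is only a minimal sketch; x is an illustrative vector that is not part of the question's data:

```r
library(dplyr)

# Illustrative vector with NA at both ends
x <- c(NA, 11, 22, NA)

# Keep only the non-NA elements, then take the first / last of what remains
first(x[!is.na(x)])
# [1] 11

last(x[!is.na(x)])
# [1] 22
```

Unlike na.omit(), the subset x[!is.na(x)] is an ordinary vector with no extra attributes attached, which may be preferable if the result is passed on to other functions.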

Using example data:

d %>%
  mutate(
    value = case_when(
      group == 2 & year == 2000 ~ NA_integer_,
      group == 3 & year == 2002 ~ NA_integer_,
      TRUE ~ value)) %>%
  group_by(group) %>% 
  mutate(
    first = dplyr::first(na.omit(value)),
    last = dplyr::last(na.omit(value)))

# # A tibble: 9 x 5
# # Groups:   group [3]
#   group  year value first  last
#   <int> <dbl> <int> <int> <int>
# 1     1  2000     3     3     4
# 2     1  2001     8     3     4
# 3     1  2002     4     3     4
# 4     2  2000    NA     9     1
# 5     2  2001     9     9     1
# 6     2  2002     1     9     1
# 7     3  2000     5     5     9
# 8     3  2001     9     5     9
# 9     3  2002    NA     5     9
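
As an alternative to wrapping the column in na.omit(), dplyr 1.1.0 and later give first() and last() an na_rm argument. The sketch below assumes that version is installed. The data frame d2 is a hypothetical, hand-typed copy of the values shown in the question's output, so the result does not depend on what sample() returns in a given R version:

```r
library(dplyr)

# Hand-typed copy of the question's data (d2 is an illustrative name)
d2 <- data.frame(
  group = rep(1:3, each = 3),
  year  = rep(2000:2002, times = 3),
  value = c(3L, 8L, 4L, NA, 9L, 1L, 5L, 9L, NA)
)

res <- d2 %>%
  group_by(group) %>%
  mutate(
    first = first(value, na_rm = TRUE),  # needs dplyr >= 1.1.0
    last  = last(value, na_rm = TRUE)
  ) %>%
  ungroup()

res
```

Here group 2 gets first = 9 and group 3 gets last = 9, matching the na.omit() approach above.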
zx8754 answered Oct 14 '22 07:10
