Here is a smaple of data set:
df<-read.table (text="Id Name Surname Group A1 A2 A3 Type1 Gen1 B1 B2 B3 Type2 Gen2
116 Matt Dabi A 4 4 4 AB N 3 3 4 G N
116 Matt Dabi D 4 3 2 D N 4 3 2 G N
116 Matt Dabi Q NA NA NA NA NA NA NA NA NA NA
116 Matt Dabi B NA NA NA NA NA NA NA NA NA NA
", header=TRUE)
Id Name Surname Group A1 A2 A3 Type1 Gen1
116 Matt Dabi A 4 4 4 AB N
116 Matt Dabi D 4 3 2 D N
116 Matt Dabi Q 3 3 4 G N
116 Matt Dabi B 4 3 2 G N
I have tried df %>% na.omit()
With multiple patterns, one option is to create a spec with build_longer_spec and then use pivot_longer_spec
library(tidyr)
library(dplyr)
library(stringr)
spec <- df %>%
build_longer_spec(cols = matches("^[A-Za-z]+\\d+$"),
names_to = ".value", names_pattern = ".*(\\d+)") %>%
mutate(`.value` = case_when(str_detect(`.name`, "^[AB]\\d+$") ~
str_c('A', `.value`),
str_detect(`.name`, 'Type') ~ 'Type1',
str_detect(`.name`, 'Gen') ~ 'Gen1'))
pivot_longer_spec(df, spec, values_drop_na = TRUE)
# A tibble: 4 × 9
Id Name Surname Group A1 A2 A3 Type1 Gen1
<int> <chr> <chr> <chr> <int> <int> <int> <chr> <chr>
1 116 Matt Dabi A 4 4 4 AB N
2 116 Matt Dabi A 3 3 4 G N
3 116 Matt Dabi D 4 3 2 D N
4 116 Matt Dabi D 4 3 2 G N
Here another approach using pivot_wider:
df %>%
pivot_longer(starts_with(c("A", "B")), names_to="ID") %>%
na.omit() %>%
mutate(ID=sub("B", "A", ID)) %>%
pivot_wider(names_from=c(ID))
In case of more columns, one can use:
df %>%
pivot_longer(matches("[A-Z][1-9]"), names_to="ID") %>%
na.omit() %>%
mutate(ID = sub("[B-Z]", "A", ID)) %>%
pivot_wider(names_from = ID)
given that the rows are uniquely identified otherwise.
# A tibble: 4 x 7
Id Name Surname Group A1 A2 A3
<int> <chr> <chr> <chr> <int> <int> <int>
1 116 Matt Dabi A 4 4 4
2 116 Matt Dabi D 4 3 2
3 116 Matt Dabi Q 4 3 3
4 116 Matt Dabi B 4 2 4
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With