I currently have a data.frame (x) with the following structure:
Number Observation
1 34
2 Example
3 Example34%
4 Example
5 34
My desired output is two data frames: one that contains only the numeric (double) observations (i.e. 34) and one that contains everything else (characters, and characters mixed with numbers and %).
I have been able to obtain the number observations using:
y <- x[str_detect(x$Observation,("([0-9])")),]
But it also includes observations that mix characters and numbers. When I negate it with !str_detect(...), I only get the pure character observations, leaving out Example34%. Is there a way to str_detect only number values and then negate that to obtain everything else?
Example of desired output:
Use anchors for the start (^) and end ($) of the regex, so that only values made up entirely of digits match:
library(tidyverse)
data_example <- tibble::tribble(
~Number, ~Observation,
1L, "34",
2L, "Example",
3L, "Example34%",
4L, "Example",
5L, "34"
)
tidy_solution <- data_example %>%
mutate(
just_numbers = Observation %>% str_extract("^[:digit:]+$"),
just_not_numbers = if_else(just_numbers %>% is.na(), Observation, NA_character_),
full_ans = coalesce(just_numbers, just_not_numbers)
)
tidy_solution
#> # A tibble: 5 x 5
#> Number Observation just_numbers just_not_numbers full_ans
#> <int> <chr> <chr> <chr> <chr>
#> 1 1 34 34 <NA> 34
#> 2 2 Example <NA> Example Example
#> 3 3 Example34% <NA> Example34% Example34%
#> 4 4 Example <NA> Example Example
#> 5 5 34 34 <NA> 34
a <- tidy_solution %>%
select(Number, just_numbers) %>%
na.omit()
a
#> # A tibble: 2 x 2
#> Number just_numbers
#> <int> <chr>
#> 1 1 34
#> 2 5 34
b <- tidy_solution %>%
select(Number, just_not_numbers) %>%
na.omit()
Created on 2020-06-10 by the reprex package (v0.3.0)
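The same anchored pattern also works directly with str_detect, which is what the question asks about. Below is a minimal sketch, assuming the data frame is called x as in the question and that Observation is a character column; the names is_number, numbers_only and everything_else are made up for illustration.

```r
library(stringr)

# Data from the question (assumed to be stored as character, not factor)
x <- data.frame(
  Number = 1:5,
  Observation = c("34", "Example", "Example34%", "Example", "34"),
  stringsAsFactors = FALSE
)

# TRUE only when the whole string, from start (^) to end ($), is digits
is_number <- str_detect(x$Observation, "^[0-9]+$")

numbers_only    <- x[is_number, ]   # rows 1 and 5
everything_else <- x[!is_number, ]  # rows 2, 3 and 4, including Example34%
```

Because the pattern is anchored, negating the logical vector gives exactly the complement, so Example34% ends up in the second data frame.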
One way is to build one of the outputs and then use anti_join to get the other:
library(dplyr)
library(stringr)
df1 <- df %>% filter(str_detect(Observation, '[A-Za-z]'))
df2 <- anti_join(df, df1)
df1
# Number Observation
#1 2 Example
#2 3 Example34%
#3 4 Example
df2
# Number Observation
#1 1 34
#2 5 34
df1 contains the rows that have at least one letter, and df2 is everything else.
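One thing to keep in mind with the letter-based test is that anything without a letter lands in the "numbers" group, even if it is not purely digits. The sketch below compares the two patterns on a few values; "34%" and "3.4" are hypothetical values that do not appear in the question's data, and the variable names are made up for illustration.

```r
library(stringr)

# First two values come from the question; the last two are hypothetical
obs <- c("34", "Example34%", "34%", "3.4")

has_letter <- str_detect(obs, "[A-Za-z]")  # test used for df1 above
all_digits <- str_detect(obs, "^[0-9]+$")  # anchored digits-only test

# Side-by-side comparison of the two classifications
comparison <- data.frame(obs, has_letter, all_digits)
comparison
```

For the data in the question both tests give the same split. They only differ for values such as "34%" or "3.4", which have no letters but are not made up of digits alone.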
data
df <- structure(list(Number = 1:5, Observation = c("34", "Example",
"Example34%", "Example", "34")), class = "data.frame", row.names=c(NA, -5L))