
Tidy data with variable in intermittent rows

I have a datalogger that inserts a row with a timestamp every time the logger is turned on. The timestamp string always has the same format, but the number of readings per timestamp varies.

How do I tidy the timestamp rows into a time variable?

This previous question is close, except I want the data in the extra rows to be a variable, not a header (it's also four years old, and I suspect there's a more elegant tidyverse solution these days): Tidy and Cast Data With Headers Stuck in Rows

library(tidyverse)

df_have <- tribble(
  ~site, ~n,  ~val,
  NA,  "Start 11:22:33", NA,
  "A", "N=1", .1,
  "A", "N=2", .3,
  NA,  "Start 12:33:44", NA,
  "B", "N=1", .2,
  "B", "N=2", .4,
  "B", "N=3", .6
)

df_want <- tribble(
  ~site, ~time, ~n,  ~val,
  "A", "11:22:33", "N=1", .1,
  "A", "11:22:33", "N=2", .3,
  "B", "12:33:44", "N=1", .2,
  "B", "12:33:44", "N=2", .4,
  "B", "12:33:44", "N=3", .6
)
JMDR asked Jun 26 '26 20:06



1 Answer

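We could use separate() to split the timestamp out of n, fill() to carry it down to the readings below it, and drop_na() to remove the original timestamp rows: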

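df_have %>%
  # "Start 11:22:33" becomes n = "Start", time = "11:22:33"; reading rows get time = NA
  # (fill = "right" suppresses the "missing pieces" warning for those rows)
  separate(n, c("n", "time"), sep = " ", fill = "right") %>%
  # carry each timestamp down to the readings that follow it
  fill(time) %>%
  # drop every row containing an NA (here, the timestamp rows: site and val are NA)
  drop_na() %>%
  select(site, time, n, val)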

Output:

  site  time     n       val
  <chr> <chr>    <chr> <dbl>
1 A     11:22:33 N=1     0.1
2 A     11:22:33 N=2     0.3
3 B     12:33:44 N=1     0.2
4 B     12:33:44 N=2     0.4
5 B     12:33:44 N=3     0.6
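
Note that drop_na() removes any row with an NA in any column, so a reading with a missing val would be lost as well. If that matters, a possible variant is to flag the timestamp rows explicitly and filter on the flag instead. This is only a sketch: it assumes every timestamp row starts with "Start " as in the sample data, and the names df_alt and is_stamp are made up for illustration.

```r
library(tidyverse)

# Sample data from the question
df_have <- tribble(
  ~site, ~n,  ~val,
  NA,  "Start 11:22:33", NA,
  "A", "N=1", .1,
  "A", "N=2", .3,
  NA,  "Start 12:33:44", NA,
  "B", "N=1", .2,
  "B", "N=2", .4,
  "B", "N=3", .6
)

df_alt <- df_have %>%
  mutate(
    # assumption: timestamp rows always begin with "Start "
    is_stamp = str_detect(n, "^Start "),
    # keep the time string on timestamp rows, NA on reading rows
    time = if_else(is_stamp, str_remove(n, "^Start "), NA_character_)
  ) %>%
  # carry each timestamp down to the readings that follow it
  fill(time) %>%
  # drop only the timestamp rows; readings with a missing val are kept
  filter(!is_stamp) %>%
  select(site, time, n, val)
```

On the sample data this should give the same five rows as the separate() version.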
TarJae answered Jun 29 '26 13:06
