Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Expand dataframe based on new column

Tags:

dataframe

r

I have the following dataframe:

df <- data.frame(
     month=c("July", "August", "August"),
     day=c(31, 1, 2),
     time=c(12, 12, 12))

   month day time
1   July  31   12
2 August   1   12
3 August   2   12

I have a text file of times (in decimal format) and I want to replace the "time" column with all of the times from the text file. There are multiple days in the text file, with over 300 records for each.

7-31-2016 #the days are all concatenated together, this line represents the beginning of one day (July 31)
13.12344
13.66445
13.76892
...
8-1-2016 #here is another day (August 1)
14.50333
14.52000
14.53639
...

However, the text file is much longer than the current dataframe-- it has 393 records. So I would like the resulting dataframe to look something like this:

    month   day       time
5    July    31   13.12344
6    July    31   13.66445
7    July    31   13.76892
.....
393 August    1   14.50333
394 August    1   14.52000
394 August    1   14.53639

Basically I just need to be able to expand my current dataframe to match the number of records from the new file, while keeping the same day. Hope that makes sense.

like image 397
ale19 Avatar asked May 21 '26 15:05

ale19


1 Answers

# Create txt data
txt <- data.frame(x = c('7-31-2016', '13.12344', '13.66445', '13.76892', '8-1-2016', '14.50333', '14.52000', '14.53639'))
# Load Your data 
df <- data.frame(
  month=c("July", "August", "August"),
  day=c(31, 1, 2),
  time=c(12, 12, 12))

# Need a year to join dates
df$year <- 2016

# Create date column
df$date <- as.Date(paste0(df$month, "/", df$day, "/", df$year), format = "%B/%d/%Y")

# Find values with dashes, then replaces with /
txt$dash <- grepl('-', txt$x)
txt$x <- gsub("-", "/", txt$x)

# Adds new columns
library(dplyr)
txt <- mutate(txt, date = ifelse(dash==TRUE, as.Date(x, format = "%m/%d/%Y"), NA))
txt <- mutate(txt, time = ifelse(dash==FALSE, as.numeric(x), NA))

# Fill down values
library(zoo)
txt$date <- na.locf(txt$date)

# Removes NA and keeps necessary columns
txt <- txt[!is.na(txt$time),]
txt <- txt[c("date", "time")]

# Merge
output <- merge(df, txt, by = "date")
like image 116
pyll Avatar answered May 24 '26 08:05

pyll