I am creating multiple data frames, and I want the columns in each of them to have the same types as those specified in a blank data frame template I have created.
For example, I have a blank template:
template <- data.frame(
  char = character(),
  int = integer(),
  fac1 = factor(levels = c('level1', 'level2', 'level3')),
  fac2 = factor(levels = c('level4', 'level5')),
  stringsAsFactors = FALSE
)
I then want to create a few data frames whose columns keep the format of the template (i.e. char should be a character, and fac2 should be a factor with the two levels 'level4' and 'level5'):
df1 <- data.frame(
  char = c('a', 'b'),
  int = c(1,2),
  fac1 = c('level2', 'level1'),
  fac2 = c('level4', 'level4')
)
df2 <- data.frame(
  char = c('c', 'd'),
  int = c(3,4),
  fac1 = c('level3', 'level4'),
  fac2 = c('level5', 'level4')
)
I can obviously specify the column types when I am creating df1 and df2, but I want to avoid typing out the same thing multiple times, and if, for example, the levels of a factor change, I only want to change them in one place.
If a value appears in one of the factors that is not a valid level (e.g. 'level4' in 'fac1' in 'df2' above), then it should be replaced by NA when converting to the correct format.
Maybe you can just post-process your data frame:
df_template <- function(...) {
  df <- data.frame(...)
  df$char <- as.character(df$char)
  df$int  <- as.integer(df$int)
  df$fac1 <- factor(df$fac1, levels = c('level1', 'level2', 'level3'))
  df$fac2 <- factor(df$fac2, levels = c('level4', 'level5'))
  df
}
We can create a function that checks the type of each column of the template and uses the matching as.* function to coerce the corresponding column of the given data.frame to that type.
We make an exception for factors (as their type is integer) and assign the template's levels to the converted column.
Map takes the columns of template and of the input pairwise, and the output (a list) is then converted to a data.frame.
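As a small illustration of the pairing step on its own, here is a minimal sketch using two plain lists. The names `tmpl`, `vals` and `res` are made up for this example and are not part of the question:

```r
# Map() walks both lists in parallel; the result takes its names
# from the first argument (here the hypothetical `tmpl`).
tmpl <- list(a = character(), b = integer())
vals <- list(a = 1:2, b = c("3", "4"))

res <- Map(function(x, y) {
  # typeof(x) is "character" or "integer", so this looks up
  # as.character() or as.integer() and applies it to y
  match.fun(paste0("as.", typeof(x)))(y)
}, tmpl, vals)

str(res)
```

Because the pairing is positional, this approach assumes the columns of the input are in the same order as in the template.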
format_df <- function(df,template) {
  as.data.frame(
    Map(function(x,y) {
      if(is.factor(x))
        factor(y,levels(x))
      else
        match.fun(paste0("as.",typeof(x)))(y)
        # or `class<-`(y,class(x)) , same effect for given example
    },template,df),
    stringsAsFactors = FALSE)
}
df1b <- format_df(df1,template)
#   char int   fac1   fac2
# 1    a   1 level2 level4
# 2    b   2 level1 level4
str(df1b)
# 'data.frame': 2 obs. of  4 variables:
# $ char: chr  "a" "b"
# $ int : int  1 2
# $ fac1: Factor w/ 3 levels "level1","level2",..: 2 1
# $ fac2: Factor w/ 2 levels "level4","level5": 1 1
Note that level5 is kept as a level of fac2 in the output, even though no row of df1 uses it.
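The question also asks that a value outside the template's levels become NA. The same function should handle that case, since factor() turns any value not listed in levels into NA. A self-contained sketch using df2 from the question (the name `df2b` is just chosen here for the result):

```r
template <- data.frame(
  char = character(),
  int = integer(),
  fac1 = factor(levels = c("level1", "level2", "level3")),
  fac2 = factor(levels = c("level4", "level5")),
  stringsAsFactors = FALSE
)

df2 <- data.frame(
  char = c("c", "d"),
  int = c(3, 4),
  fac1 = c("level3", "level4"),  # "level4" is not a level of fac1
  fac2 = c("level5", "level4")
)

# Same logic as format_df above, repeated so this block runs on its own
format_df <- function(df, template) {
  as.data.frame(
    Map(function(x, y) {
      if (is.factor(x)) factor(y, levels(x))
      else match.fun(paste0("as.", typeof(x)))(y)
    }, template, df),
    stringsAsFactors = FALSE)
}

df2b <- format_df(df2, template)
is.na(df2b$fac1)
# [1] FALSE  TRUE
```

The second entry of fac1 is NA because 'level4' is only a valid level for fac2.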