
How to return the column types of an R tibble as a compact string?

Tags: r, tibble

For example, I have a tibble like this:

test <- tibble(a = 10, b = "a")

With this input, I want a function that returns "dc", which represents double and character.

The reason I ask is that I want to read in lots of files, and I don't want to let the read_table function decide the type of each column. I can specify the string manually, but since the actual data I want to import has 50 columns, that is quite hard to do by hand.

Thanks.

wei asked Oct 30 '22 06:10


1 Answer

While the aforementioned test %>% summarise_all(class) will give you the class names of the columns, it does so in long form, whereas in this problem you want to convert them to the single-character codes that read_table's col_types argument understands. To map from class names to single-letter codes you can use a lookup table; here's an (incomplete) example, shown as dput output:

# Lookup table from class name to one-letter col_types code
types <- structure(list(col_type = c("character", "integer", "numeric", 
"double", "logical"), code = c("c", "i", "n", "d", "l")), .Names = c("col_type", 
"code"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L))

Using this table, which I'll call types, we can transform the column types into a single string:

library(dplyr)
library(tidyr)
library(stringr)

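test %>% 
  summarise_all(class) %>%                 # one row holding the class of each column
  gather(col_name, col_type) %>%           # reshape to one row per column
  left_join(types, by = "col_type") %>%    # attach the one-letter code
  summarise(col_types = str_c(code, collapse = "")) %>% 
  unlist(use.names = FALSE)                # pull the string out of the tibble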

This gets the class of each column (summarise_all), then gathers the result into a tibble that pairs each column name with its column type (gather). The left_join matches on the col_type column and adds the one-character code for each column. We don't need the column names after that, so it's fine to concatenate the codes with summarise and str_c. Finally, unlist pulls the string out of the tibble. Note that class() reports a double column as "numeric", so for the example tibble the result is "nc" rather than "dc".
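Since the question asks for a function, the same lookup idea can be sketched in base R without the tidyverse. This is only a minimal sketch: the name col_type_string is hypothetical, and mapping "numeric" to "d" (instead of the "n" used in the table above) is an assumption made so that the example returns the "dc" the question asks for.

```r
# Hypothetical helper (not part of the answer above): builds a compact
# col_types string from the classes of a data frame's columns.
col_type_string <- function(df) {
  # Named vector used as a lookup from class name to one-letter code.
  # Assumption: "numeric" is mapped to "d" (double) rather than "n".
  codes <- c(character = "c", integer = "i", numeric = "d", logical = "l")

  # Use only the first class of each column so every column yields one name.
  classes <- vapply(df, function(col) class(col)[1], character(1))

  # Fail loudly instead of silently pasting "NA" for classes not in the lookup.
  unknown <- setdiff(classes, names(codes))
  if (length(unknown) > 0) {
    stop("No code defined for class(es): ", paste(unknown, collapse = ", "))
  }

  paste(codes[classes], collapse = "")
}

# Same shape as the example in the question; works for tibbles too,
# since a tibble is also a data frame.
test <- data.frame(a = 10, b = "a", stringsAsFactors = FALSE)
col_type_string(test)
```

The resulting string could then be passed as the col_types argument when reading the remaining files. The lookup is as incomplete as the table above: factors, dates, and other classes would need their own entries before the function accepts them.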

beigel answered Nov 15 '22 05:11