Use dplyr to filter out columns containing characters

Tags: r, dplyr

I have a large data frame that I would like to process with the excellent dplyr package (Wickham), which I only recently discovered. I would like to drop the columns that contain characters. Is this possible?

For example, in the flights dataset from the nycflights13 package, how could I drop the columns of class character?

library(nycflights13)
data(flights)
str(flights)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   336776 obs. of  16 variables:
 $ year     : int  2013 2013 2013 2013 2013 2013 2013 2013 2013 2013 ...
 $ month    : int  1 1 1 1 1 1 1 1 1 1 ...
 $ day      : int  1 1 1 1 1 1 1 1 1 1 ...
 $ dep_time : int  517 533 542 544 554 554 555 557 557 558 ...
 $ dep_delay: num  2 4 2 -1 -6 -4 -5 -3 -3 -2 ...
 $ arr_time : int  830 850 923 1004 812 740 913 709 838 753 ...
 $ arr_delay: num  11 20 33 -18 -25 12 19 -14 -8 8 ...
 $ carrier  : chr  "UA" "UA" "AA" "B6" ...
 $ tailnum  : chr  "N14228" "N24211" "N619AA" "N804JB" ...
 $ flight   : int  1545 1714 1141 725 461 1696 507 5708 79 301 ...
 $ origin   : chr  "EWR" "LGA" "JFK" "JFK" ...
 $ dest     : chr  "IAH" "IAH" "MIA" "BQN" ...
 $ air_time : num  227 227 160 183 116 150 158 53 140 138 ...
 $ distance : num  1400 1416 1089 1576 762 ...
 $ hour     : num  5 5 5 5 5 5 5 5 5 5 ...
 $ minute   : num  17 33 42 44 54 54 55 57 57 58 ...

Any ideas?

jonas asked Dec 04 '14 08:12






2 Answers

Here is a dplyr/tidyverse option that uses select_if(), shown on dplyr's starwars sample data:

starwars %>% 
  select_if(~!is.character(.)) %>% 
  head(2)

# A tibble: 2 x 6
    height  mass birth_year films     vehicles  starships
     <int> <dbl>      <dbl> <list>    <list>    <list>   
  1    172    77         19 <chr [5]> <chr [2]> <chr [2]>
  2    167    75        112 <chr [6]> <chr [0]> <chr [0]>
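
In dplyr 1.0.0 and later, select_if() is superseded by select() combined with the where() selection helper. A minimal sketch of the same idea, assuming dplyr >= 1.0.0 is installed; the toy data frame is made up here and only stands in for flights:

```r
library(dplyr)

# Hypothetical toy data standing in for flights:
# two character columns and two numeric columns
toy <- tibble(
  carrier   = c("UA", "AA", "B6"),
  origin    = c("EWR", "JFK", "JFK"),
  dep_delay = c(2, 4, -1),
  distance  = c(1400, 1089, 1576)
)

# Keep every column for which is.character() returns FALSE
no_chr <- toy %>% select(!where(is.character))

names(no_chr)
```

Applied to the question's data, the same call would presumably read flights %>% select(!where(is.character)).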

sbha answered Sep 23 '22 13:09


You don't need dplyr for that; base R can do it:

flights[, !sapply(flights, is.character)]
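
One caveat with the [, j] form on a plain data.frame: if only one column survives, the result collapses to a bare vector unless drop = FALSE is given (tibbles such as flights do not drop by default). A small sketch showing two ways to sidestep this; the data frame df is made up for illustration:

```r
# Hypothetical data frame with a single non-character column
df <- data.frame(
  carrier  = c("UA", "AA"),
  origin   = c("EWR", "JFK"),
  distance = c(1400, 1089),
  stringsAsFactors = FALSE
)

# vapply() guarantees a logical vector with one entry per column
is_chr <- vapply(df, is.character, logical(1))

dropped <- df[, !is_chr]  # matrix-style indexing collapses to a numeric vector
kept    <- df[!is_chr]    # list-style indexing keeps a data.frame

# Filter() over the columns is another base R spelling of `kept`
kept2 <- Filter(Negate(is.character), df)
```

Using vapply() instead of sapply() is a defensive choice: it fails loudly if the predicate ever returns something other than a single logical value per column.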

Tim answered Sep 21 '22 13:09