I have a large data frame that I would like to work with using the excellent dplyr package (Wickham), which I only recently discovered. I would like to filter out the columns that contain characters. Is this possible?
For example, in the flights dataset from the nycflights13 package, how could I filter out the columns of class character?
library(nycflights13)
data(flights)
str(flights)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 336776 obs. of 16 variables:
$ year : int 2013 2013 2013 2013 2013 2013 2013 2013 2013 2013 ...
$ month : int 1 1 1 1 1 1 1 1 1 1 ...
$ day : int 1 1 1 1 1 1 1 1 1 1 ...
$ dep_time : int 517 533 542 544 554 554 555 557 557 558 ...
$ dep_delay: num 2 4 2 -1 -6 -4 -5 -3 -3 -2 ...
$ arr_time : int 830 850 923 1004 812 740 913 709 838 753 ...
$ arr_delay: num 11 20 33 -18 -25 12 19 -14 -8 8 ...
$ carrier : chr "UA" "UA" "AA" "B6" ...
$ tailnum : chr "N14228" "N24211" "N619AA" "N804JB" ...
$ flight : int 1545 1714 1141 725 461 1696 507 5708 79 301 ...
$ origin : chr "EWR" "LGA" "JFK" "JFK" ...
$ dest : chr "IAH" "IAH" "MIA" "BQN" ...
$ air_time : num 227 227 160 183 116 150 158 53 140 138 ...
$ distance : num 1400 1416 1089 1576 762 ...
$ hour : num 5 5 5 5 5 5 5 5 5 5 ...
$ minute : num 17 33 42 44 54 54 55 57 57 58 ...
Any ideas?
To filter rows on multiple values, pass your data frame to filter(), then write the condition as the column name, the %in% operator, and a vector of the string values you want to keep.
The filter() function in dplyr subsets the rows of a data frame based on a condition. A row is kept only if the condition evaluates to TRUE for it; rows where the condition is FALSE or NA are dropped. Note that filter() operates on rows, so on its own it does not remove columns.
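As a rough illustration of the %in% pattern described above, here is a minimal sketch. The data frame `df` and its columns are hypothetical stand-ins, not part of the flights data:

```r
library(dplyr)

# Hypothetical toy data standing in for a larger data frame
df <- data.frame(
  carrier = c("UA", "AA", "B6", NA),
  delay   = c(11, 33, -18, 5),
  stringsAsFactors = FALSE
)

# Keep only the rows whose carrier is one of the listed values.
# NA %in% c("UA", "B6") is FALSE, so the row with a missing carrier is dropped.
kept <- df %>% filter(carrier %in% c("UA", "B6"))
kept
```

All columns are still present in `kept`; only the set of rows has changed.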
Here is a dplyr/tidyverse option using select_if() (using the dplyr starwars sample data):
library(dplyr)

starwars %>%
  select_if(~ !is.character(.)) %>%
  head(2)
# A tibble: 2 x 6
height mass birth_year films vehicles starships
<int> <dbl> <dbl> <list> <list> <list>
1 172 77 19 <chr [5]> <chr [2]> <chr [2]>
2 167 75 112 <chr [6]> <chr [0]> <chr [0]>
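In dplyr 1.0.0 and later, select_if() is superseded by select() combined with where(). A minimal sketch of the same idea in that style, assuming a recent dplyr and using a small hypothetical tibble `df` rather than the starwars data:

```r
library(dplyr)

# Hypothetical toy data with one character column
df <- tibble(
  id      = 1:3,
  carrier = c("UA", "AA", "B6"),
  delay   = c(11, 33, -18)
)

# Select every column that is NOT of class character
non_chr <- df %>% select(!where(is.character))
names(non_chr)
```

The same call should work on flights, for example `flights %>% select(!where(is.character))`.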
You don't need dplyr for that; you can use base R:
flights[, !sapply(flights, is.character)]
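A slightly more defensive variant of the base R approach, sketched on a hypothetical plain data frame `df`: vapply() guarantees a logical vector, and drop = FALSE keeps the result a data frame even when only one column remains (a plain data.frame would otherwise collapse to a vector).

```r
# Hypothetical toy data: one integer column, one character column
df <- data.frame(
  id      = 1:3,
  carrier = c("UA", "AA", "B6"),
  stringsAsFactors = FALSE
)

# TRUE for each column of class character
is_chr <- vapply(df, is.character, logical(1))

# Drop the character columns; drop = FALSE preserves the data frame structure
non_chr <- df[, !is_chr, drop = FALSE]
str(non_chr)
```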