
How to extract specific rows in R?

Tags:

r

I would like to extract specific rows from a data frame into a new data frame using R. I have two columns: City and Household. To detect moves, I want a new data frame containing only the households that are not always in the same city.

For example, if a household appears 3 times and at least one of its cities differs from the others, I keep all of its rows. Otherwise, I delete the 3 rows of this household.

    City      Household
   Paris              A
   Paris              A
    Nice              A
  Limoge              B
  Limoge              B
Toulouse              C
   Paris              C

Here, I want to keep only Household A and Household C.

Marie asked Sep 27 '22 19:09


1 Answer

A dplyr solution: count the distinct cities for each household and keep only the households with more than one.

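library(dplyr)

# Example data matching the question
df <- data.frame(city=c("Paris","Paris","Nice","Limoge","Limoge","Toulouse","Paris"),
                 household =c(rep("A",3),rep("B",2),rep("C",2)))

# Keep every row of each household that has more than one distinct city
new_df <- df %>% group_by(household) %>%
  filter(n_distinct(city) > 1)
new_df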

Source: local data frame [5 x 2]
Groups: household

      city household
1    Paris         A
2    Paris         A
3     Nice         A
4 Toulouse         C
5    Paris         C
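
If dplyr is not available, the same idea can be sketched in base R. This is only a rough equivalent, assuming the same example data and no missing values in city; the names counts and moved are made up for illustration.

```r
# Same example data as in the dplyr version
df <- data.frame(
  city = c("Paris", "Paris", "Nice", "Limoge", "Limoge", "Toulouse", "Paris"),
  household = c(rep("A", 3), rep("B", 2), rep("C", 2))
)

# Number of distinct cities per household (one named entry per household)
counts <- tapply(df$city, df$household, function(x) length(unique(x)))

# Households seen in more than one city ("moved" is a hypothetical name)
moved <- names(counts)[counts > 1]

# Keep every row belonging to one of those households
new_df <- df[df$household %in% moved, ]
new_df
```

Unlike the grouped dplyr result, this returns a plain data frame and keeps the original row names.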

Edit: incorporated suggestions from @shadow and @davidarenburg in the comments.

scoa answered Oct 25 '22 04:10