I have a dataframe (say x) in R:
> x
Height  Weight Gender
5     60    m
5     70    m
6     80    m
4     90    m
4     60    m
5     70    f
5     80    f
6     60    f
4     90    f
4     60    f
I need R code that produces a new data frame, say y, containing only the first three rows (1:3) of x for each Gender, to give the following result:
> y
Height  Weight Gender
5       60      m
5       70      m
6       80      m
5       70      f
5       80      f
6       60      f
Using bracket notation on an R data frame, you can select rows by column value, by index, by name, or by condition. The base R function subset() gives the same results, and dplyr::filter() is another way to get rows from a data frame.
How do you subset a data frame by column value or column name in R? With base R df[] notation or subset(), you can subset a data frame (data.frame) by either one.
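A minimal sketch of both base R approaches, using the data frame x from the question. The condition (Gender == "m") and the chosen columns are only for illustration; the variable names other than x are made up here.

```r
# Rebuild the example data frame x from the question.
x <- data.frame(
  Height = c(5, 5, 6, 4, 4, 5, 5, 6, 4, 4),
  Weight = c(60, 70, 80, 90, 60, 70, 80, 60, 90, 60),
  Gender = rep(c("m", "f"), each = 5)
)

# Bracket notation: a logical condition before the comma selects rows.
males_bracket <- x[x$Gender == "m", ]

# subset() expresses the same condition without repeating x$.
males_subset <- subset(x, Gender == "m")

# Select by column name: keep only the Height and Weight columns.
hw <- x[, c("Height", "Weight")]
```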
Select rows by a list of column values: with the same notation you can use the %in% operator to select rows whose value appears in a vector. For example, the condition state %in% c('CA','AZ','PH') keeps every row whose state value is present in that vector.
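Something like the following shows the %in% selection. The data frame df, its id column, and the extra state values are hypothetical; only the state column name and the vector c('CA','AZ','PH') come from the text above.

```r
# Hypothetical data frame; contents are illustrative only.
df <- data.frame(
  id    = 1:5,
  state = c("CA", "NY", "AZ", "TX", "PH")
)

# The vector of values to match against.
values <- c("CA", "AZ", "PH")

# %in% returns TRUE for each row whose state appears in values.
matched <- df[df$state %in% values, ]
```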
The dplyr filter() function is used to subset a data frame, retaining all rows that satisfy your conditions.
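A short sketch of filter() on the data frame x from the question, assuming dplyr is installed. The two conditions are chosen only as an example; conditions separated by commas are combined with AND.

```r
library(dplyr)

# Rebuild the example data frame x from the question.
x <- data.frame(
  Height = c(5, 5, 6, 4, 4, 5, 5, 6, 4, 4),
  Weight = c(60, 70, 80, 90, 60, 70, 80, 60, 90, 60),
  Gender = rep(c("m", "f"), each = 5)
)

# Keep rows where Gender is "m" AND Height is at least 5.
tall_males <- x %>% filter(Gender == "m", Height >= 5)
```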
Try slice() from dplyr:
library(dplyr)
# Group by Gender, then keep the first three rows of each group
y <- x %>%
    group_by(Gender) %>%
    slice(1:3)
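Or using data.table:
library(data.table)
# setDT() converts x to a data.table by reference; head() avoids NA rows when a group has fewer than three rows
y <- setDT(x)[, head(.SD, 3), by = Gender]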
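If you would rather not load a package, here is a base R sketch of the same idea. This is an alternative, not what the answer above uses: ave() numbers the rows within each Gender group, and the filter keeps rows numbered 3 or lower. The helper name row_in_group is made up for this example.

```r
# Rebuild the example data frame x from the question.
x <- data.frame(
  Height = c(5, 5, 6, 4, 4, 5, 5, 6, 4, 4),
  Weight = c(60, 70, 80, 90, 60, 70, 80, 60, 90, 60),
  Gender = rep(c("m", "f"), each = 5)
)

# Number the rows within each Gender group (1, 2, 3, ... per group).
row_in_group <- ave(seq_len(nrow(x)), x$Gender, FUN = seq_along)

# Keep the first three rows of each group, in their original order.
y <- x[row_in_group <= 3, ]
```

This keeps the original row order (m rows first, then f), and groups with fewer than three rows are returned as they are.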