I have a dataframe (say x) in R:
> x
Height Weight Gender
5 60 m
5 70 m
6 80 m
4 90 m
4 60 m
5 70 f
5 80 f
6 60 f
4 90 f
4 60 f
I need R code that produces a new data frame, say y, containing only the first three rows (1:3) of x for each Gender, giving the following result:
> y
Height Weight Gender
5 60 m
5 70 m
6 80 m
5 70 f
5 80 f
6 60 f
Using bracket notation on an R data frame (data.frame), you can select rows by column value, by index, by name, by condition, etc. You can also use the base R function subset() to get the same results. In addition, the dplyr package provides filter() to select rows from a data frame.
How to subset a data frame by column value and name in R? Using base R df[] notation or subset(), you can subset a data frame (data.frame) by column value or by column name.
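As a minimal sketch of both base R approaches, assuming the data frame x from the question (rebuilt here so the snippet runs on its own; the names females_bracket and females_subset are only illustrative):

```r
# Rebuild the question's data frame x
x <- data.frame(
  Height = c(5, 5, 6, 4, 4, 5, 5, 6, 4, 4),
  Weight = c(60, 70, 80, 90, 60, 70, 80, 60, 90, 60),
  Gender = rep(c("m", "f"), each = 5)
)

# Bracket notation: rows where Gender is "f", all columns
females_bracket <- x[x$Gender == "f", ]

# subset(): same rows, keeping only two columns selected by name
females_subset <- subset(x, Gender == "f", select = c(Height, Weight))
```

Note the trailing comma in the bracket form: leaving the column position empty keeps every column.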
Select rows by a list of column values. With the same notation you can use the %in% operator to select rows whose column value appears in a vector of values. For example, testing a state column against c('CA','AZ','PH') returns all rows whose state is one of those values.
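A small sketch of the %in% selection, using a made-up data frame df with a state column (the data and the names df, wanted and selected are hypothetical, only there to make the snippet runnable):

```r
# Hypothetical data, invented for illustration
df <- data.frame(
  id    = 1:5,
  state = c("CA", "NY", "AZ", "TX", "PH")
)

# Keep rows whose state appears in the vector of wanted values
wanted   <- c("CA", "AZ", "PH")
selected <- df[df$state %in% wanted, ]
```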
The dplyr filter() function subsets a data frame, retaining all rows that satisfy your conditions.
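For illustration, a short filter() sketch on the question's data, assuming dplyr is installed (the condition Weight >= 70 and the name heavier_males are arbitrary choices for the example):

```r
library(dplyr)

# Rebuild the question's data frame x
x <- data.frame(
  Height = c(5, 5, 6, 4, 4, 5, 5, 6, 4, 4),
  Weight = c(60, 70, 80, 90, 60, 70, 80, 60, 90, 60),
  Gender = rep(c("m", "f"), each = 5)
)

# Multiple conditions separated by commas are combined with AND
heavier_males <- filter(x, Gender == "m", Weight >= 70)
```

Note that filter() selects rows by condition only. It does not pick rows by position, which is why the answer to the question uses slice() instead.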
Try slice from dplyr:
library(dplyr)
# Keep the first three rows within each Gender group
y <- x %>%
  group_by(Gender) %>%
  slice(1:3)
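Or using data.table:
library(data.table)
# setDT() converts x to a data.table by reference (x itself is modified).
# head(.SD, 3) avoids the NA rows that .SD[1:3] produces when a group has fewer than three rows.
y <- setDT(x)[, head(.SD, 3), by = Gender]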
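If you would rather avoid extra packages, the same result can be sketched in base R. This is one possible approach, not the only one: ave() numbers the rows within each Gender group, and the data frame is then filtered on that counter. The helper name row_in_group is just illustrative.

```r
# Rebuild the question's data frame x
x <- data.frame(
  Height = c(5, 5, 6, 4, 4, 5, 5, 6, 4, 4),
  Weight = c(60, 70, 80, 90, 60, 70, 80, 60, 90, 60),
  Gender = rep(c("m", "f"), each = 5)
)

# Position of each row within its Gender group: 1, 2, 3, ... per group
row_in_group <- ave(seq_len(nrow(x)), x$Gender, FUN = seq_along)

# Keep rows that are among the first three of their group
y <- x[row_in_group <= 3, ]
```

This keeps the original row order (m rows first, then f) and the original row names, so call rownames(y) <- NULL if you want them renumbered.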