Reduce a data frame to fewer rows

Question

Let us say I have a data frame "dat" like:

 col1     col2
  12       a
  43       a
  54       a
  11       a
  33       b
  43       b
  34       c
  34       c
  342      c
  343      c

Now I have a vector:

vec <- c("a","a","a","b","c","c")

I want to drop the extra rows of "dat" according to "vec". The number of times each value appears in "vec" is the number of rows to keep for it: keep only the first 3 rows where col2 is "a", only the first row where col2 is "b", and only the first 2 rows where col2 is "c".

The expected output is:

 col1     col2
  12       a
  43       a
  54       a
  33       b
  34       c
  34       c

What is the fastest way to do this without using a for loop?

LyzandeR · Accepted Answer

Here is one way, using split and Map:

Data

dat <- read.table(header = TRUE, text = ' col1     col2
  12       a
  43       a
  54       a
  11       a
  33       b
  43       b
  34       c
  34       c
  342      c
  343      c', stringsAsFactors = FALSE)

vec <-  c('a','a','a','b','c','c')

Solution

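# count how many rows to keep for each value of col2
tabvec <- table(vec)

# split dat by col2, keep the first n rows of each piece (n taken from tabvec),
# then bind the pieces back together into one data.frame
# note: Map pairs pieces and counts by position, so dat$col2 and vec must
# contain the same set of values (split and table both sort them)
data.frame(do.call(rbind,
   Map(function(x, y) head(x, y), split(dat, as.factor(dat$col2)), tabvec)
))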

Output:

    col1 col2
a.1   12    a
a.2   43    a
a.3   54    a
b     33    b
c.7   34    c
c.8   34    c
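
As a hedged alternative to the split/Map approach above, here is a minimal base R sketch that avoids splitting: it numbers each row within its col2 group and keeps the rows whose number does not exceed the count for that group in vec. The names pos, counts, quota and res are made up for illustration. Unlike the positional pairing done by Map, the counts are looked up by name, so a col2 value that never appears in vec is simply dropped (an assumption about the desired behaviour, since the question does not cover that case).

```r
dat <- data.frame(col1 = c(12, 43, 54, 11, 33, 43, 34, 34, 342, 343),
                  col2 = c("a", "a", "a", "a", "b", "b", "c", "c", "c", "c"),
                  stringsAsFactors = FALSE)
vec <- c("a", "a", "a", "b", "c", "c")

# position of each row within its col2 group: 1, 2, 3, ...
pos <- ave(seq_len(nrow(dat)), dat$col2, FUN = seq_along)

# number of rows to keep per value, looked up by name for every row of dat
counts <- table(vec)
quota <- as.integer(counts)[match(dat$col2, names(counts))]

# assumption: values of col2 that are absent from vec keep zero rows
quota[is.na(quota)] <- 0L

# keep a row only if its within-group position fits the quota
res <- dat[pos <= quota, ]
res
```

The rows come back in their original order with their original row names, rather than regrouped by col2 with names such as a.1 or c.7.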

Tags: dataframe, r, subset

user3664020
