
Extract rows for the first occurrence of a variable in a data frame

Tags:

r

I have a data frame with two variables, Date and Taxa, and want to get the date on which each taxon first occurs. There are 9 different dates and 40 different taxa in the data frame, which has 172 rows, but my answer should have only 40 rows.

Taxa is a factor and Date is a date.

For example, my data frame (called 'species') is set up like this:

Date          Taxa
2013-07-12    A
2011-08-31    B
2012-09-06    C
2012-05-17    A
2013-07-12    C
2012-09-07    B

and I would be looking for an answer like this:

Date          Taxa
2012-05-17    A
2011-08-31    B
2012-09-06    C

I tried using:

t.first <-  species[unique(species$Taxa),] 

and it gave me the correct number of rows but there were Taxa repeated. If I just use unique(species$Taxa) it appears to give me the right answer, but then I don't know the date when it first occurred.

Thanks for any help.

user2614883 asked Nov 13 '13 02:11


1 Answer

t.first <- species[match(unique(species$Taxa), species$Taxa),] 

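should give you what you're looking for. match returns, for each value, the index of its first match in the compared vector, which gives you the rows you need. Note that "first" here means first in row order, so sort species by Date beforehand if the rows are not already in chronological order.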
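As a minimal sketch using the sample rows from the question (the names species_sorted and first_idx are only illustrative, not part of the original code), sorting by Date first makes the row that match picks the earliest one for each taxon:

```r
# Sample data copied from the question; stringsAsFactors is set explicitly
# so that Taxa is a factor, as the question states.
species <- data.frame(
  Date = as.Date(c("2013-07-12", "2011-08-31", "2012-09-06",
                   "2012-05-17", "2013-07-12", "2012-09-07")),
  Taxa = c("A", "B", "C", "A", "C", "B"),
  stringsAsFactors = TRUE
)

# Sort chronologically so that "first row" means "earliest date".
species_sorted <- species[order(species$Date), ]

# For each unique taxon, match() returns the index of its first row.
first_idx <- match(unique(species_sorted$Taxa), species_sorted$Taxa)
t.first <- species_sorted[first_idx, ]

# Optional: order the result by Taxa to mirror the layout in the question.
t.first <- t.first[order(t.first$Taxa), ]
print(t.first)
```

On the unsorted sample, the same match call would return 2013-07-12 for A, because that row comes first. species_sorted[!duplicated(species_sorted$Taxa), ] should select the same rows, if that reads more naturally.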

alexwhan answered Oct 09 '22 00:10