
Get rid of rows with duplicate attributes in R

I have a big dataframe with columns such as:

ID, time, OS, IP

Each row of that dataframe corresponds to one entry. For some IDs, several entries (rows) exist. I would like to get rid of those extra rows (the other attributes will obviously differ between rows with the same ID). Put differently: I want exactly one entry (row) for each ID.

When I use unique on the ID column, I only get the unique IDs (the levels) back, but I want to keep the other attributes as well. I have also tried apply(x,2,unique(data$ID)), but this does not work either.

CatholicEvangelist asked May 03 '10 16:05


1 Answer

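subset(data, !duplicated(data$ID))

This should do the trick. duplicated() marks every repeat occurrence of an ID as TRUE, so negating it keeps only the first row for each ID, along with all of its other columns.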
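
As a minimal sketch of how this behaves, here is a small made-up data frame. The column names follow the question, but the values are invented for illustration and are not the asker's data.

```r
# Hypothetical sample data: column names follow the question,
# the values are made up for illustration.
data <- data.frame(
  ID   = c(1, 1, 2, 3, 3),
  time = c("10:00", "10:05", "11:00", "12:00", "12:30"),
  OS   = c("Linux", "Windows", "Linux", "macOS", "Linux"),
  IP   = c("10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"),
  stringsAsFactors = FALSE
)

# duplicated() is TRUE for the second and later occurrences of each ID
dup <- duplicated(data$ID)

# Keep only the rows that are not repeats, i.e. the first row seen per ID
deduped <- data[!dup, ]

# The subset() form from the answer selects the same rows
deduped_subset <- subset(data, !duplicated(data$ID))

print(deduped)
```

Which row survives depends on row order, since the first occurrence of each ID is the one kept. If the last entry per ID is wanted instead, duplicated(data$ID, fromLast = TRUE) is one option, or the data frame can be sorted before deduplicating.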

James answered Sep 27 '22 23:09