
Removing rows from R data frame

I have the following data frame:

> str(df)
'data.frame':   3149 obs. of  9 variables:
 $ mkod : int  5029 5035 5036 5042 5048 5050 5065 5071 5072 5075 ...
 $ mad  : Factor w/ 65 levels "Akgün Kasetçilik         ",..: 58 29 59 40 56 11 33 34 19 20 ...
 $ yad  : Factor w/ 44 levels "BAKUGAN","BARBIE",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ donem: int  201101 201101 201101 201101 201101 201101 201101 201101 201101 201101 ...
 $ sayi : int  201101 201101 201101 201101 201101 201101 201101 201101 201101 201101 ...
 $ plan : int  2 2 3 2 2 2 7 3 2 7 ...
 $ sevk : int  2 2 3 2 2 2 6 3 2 7 ...
 $ iade : int  0 0 3 1 2 2 6 2 2 3 ...
 $ satis: int  2 2 0 1 0 0 0 1 0 4 ...

I want to remove 21 specific rows from this data frame.

> a <- df[df$plan==0 & df$sevk==0,]
> nrow(a)
[1] 21

So when I remove those 21 rows, I will have a new data frame with 3149 - 21 = 3128 rows. I found the following solution:

> b <- df[df$plan!=0 | df$sevk!=0,]
> nrow(b)
[1] 3128

My solution above uses a modified logical expression (!= instead of == and | instead of &). Other than rewriting the original logical expression, how can I obtain the new data frame without those 21 rows? I need something like this:

> df[-a,] #does not work 

EDIT (especially for the downvoters; I hope this explains why I need an alternative solution): I asked for a different solution because I'm writing a long script with many variable assignments (like a in my example) in various places. When I need to remove rows later in the code, I don't want to go back and work out the inverse of the logical expression behind each a-like assignment. That's why something like df[-a,] would be more usable for me.

Mehper C. Palavuzlar asked Oct 27 '11 11:10

2 Answers

Just negate your logical subscript:

a <- df[!(df$plan==0 & df$sevk==0),] 
Joshua Ulrich answered Sep 19 '22 14:09
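
One way to apply the negation idea without rewriting any expression later is to keep the logical vector itself in a variable and reuse it. The following is a minimal sketch on a made-up five-row data frame, not the data from the question; the name drop_rows is hypothetical, and the example assumes neither column contains NA.

```r
# Toy data frame standing in for the one in the question (values are made up).
df <- data.frame(
  mkod = c(5029, 5035, 5036, 5042, 5048),
  plan = c(2, 0, 3, 0, 7),
  sevk = c(2, 0, 3, 1, 6)
)

# Store the condition itself, not only the subset it selects.
drop_rows <- df$plan == 0 & df$sevk == 0

a <- df[drop_rows, ]    # rows matching the condition
b <- df[!drop_rows, ]   # all other rows; the expression is not rewritten

nrow(a)  # 1
nrow(b)  # 4
```

Because a and b come from the same stored vector, the two subsets always partition df, as long as the condition never evaluates to NA.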


You can use the row names to specify a "complementary" data frame. It's easier if the row names are numeric:

df[-as.numeric(rownames(a)),] 

But more generally you can use:

df[setdiff(rownames(df),rownames(a)),] 
James answered Sep 17 '22 14:09
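
The two row-name variants can be compared on a small example. This is a sketch on made-up data, assuming df still has its default row names 1..n; the variable names b1, b2 and empty are hypothetical. It also shows one case where the variants differ: when the subset to remove is empty, the negative-index form selects nothing, while setdiff keeps every row.

```r
# Toy data frame with default row names "1".."5" (values are made up).
df <- data.frame(
  plan = c(2, 0, 3, 0, 7),
  sevk = c(2, 0, 3, 1, 6)
)

a <- df[df$plan == 0 & df$sevk == 0, ]
rownames(a)  # "2"

# Variant 1: negative numeric index built from the row names of a.
b1 <- df[-as.numeric(rownames(a)), ]

# Variant 2: keep the row names of df that do not occur in a.
b2 <- df[setdiff(rownames(df), rownames(a)), ]

rownames(b1)  # "1" "3" "4" "5"
rownames(b2)  # "1" "3" "4" "5"

# Edge case: nothing matches, so the subset to remove has zero rows.
empty <- df[df$plan == 99, ]
nrow(df[-as.numeric(rownames(empty)), ])             # 0, every row is lost
nrow(df[setdiff(rownames(df), rownames(empty)), ])   # 5, every row is kept
```

The setdiff form is therefore the safer choice when the subset being removed might be empty or when the row names are not plain numbers.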