
How to remove the negative values from a data frame in R

I want to remove the negative values from a data frame and then calculate the mean of each row separately (the mean of the positive values in each row). I wrote the loop below to remove the negative values, but it didn't work. I get this error:

Error in [<-.data.frame(*tmp*, i, j, value = NULL) : replacement has length zero

How can I fix this problem?

for (i in 1:1000) {
  for(j in 1:20){
     if (dframe[i,j]<=0) dframe[i,j]<-NULL
     j=j+1
  }
  i=i+1
}
cocomat asked Mar 15 '17 22:03


3 Answers

I want to add that it's not necessary to write a for loop; you can just set:

dframe[dframe < 0] <- NA

This works because dframe < 0 gives a logical matrix that is TRUE wherever dframe is less than zero. That matrix can be used to index dframe and replace the matching values with NA.
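As a small illustration of this logical indexing, here is a sketch on a made-up 3x3 data frame (the values and column names are invented for the example, not the asker's data):

```r
# Toy data frame; the values are invented for illustration
dframe <- data.frame(a = c(1, -2, 3),
                     b = c(-4, 5, 6),
                     c = c(7, 8, -9))

# Logical matrix: TRUE wherever a value is negative
neg <- dframe < 0
neg

# Replace every negative entry with NA in a single step
dframe[neg] <- NA
dframe
```

The loop in the question tests `<= 0`; if zeros should be dropped as well, `dframe[dframe <= 0] <- NA` follows the same pattern.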

@MrFlick explained the use of NA instead of NULL, and how to ignore NA values when calculating means of each row:

rowMeans(dframe, na.rm=TRUE) 
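To show what na.rm changes, here is a minimal sketch on an invented data frame in which the negative values have already been replaced by NA:

```r
# Toy data frame with NA where negative values used to be (invented values)
dframe <- data.frame(a = c(1, NA, 3),
                     b = c(NA, 5, 6),
                     c = c(7, 8, NA))

# Without na.rm, any row containing an NA has an NA mean
with_na <- rowMeans(dframe)
with_na

# With na.rm = TRUE, each mean uses only the non-missing values in that row
without_na <- rowMeans(dframe, na.rm = TRUE)
without_na
```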

Edited to answer question re: rowMeans producing NaNs and how to remove:

NA is "not available" and is a missing value indicator, while NaN is "not a number", which is produced when the result of an arithmetic operation can't be defined numerically, e.g. 0/0. I can't see your dframe values, but I would guess that the NaNs come from taking the row mean of a row whose values are all NA while setting na.rm=TRUE. Compare mean(c(NA, NA, NA), na.rm=TRUE), which returns NaN, with mean(c(NA, NA, NA), na.rm=FALSE), which returns NA. You can leave the NaN in place or decide how to define the row mean when all values in a row are negative and have been replaced by NA.
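The following sketch reproduces that situation on an invented two-row data frame whose second row is entirely negative:

```r
# Mean of nothing but NAs: the result depends on na.rm
all_removed <- mean(c(NA, NA, NA), na.rm = TRUE)   # NaN, no values left to average
propagated  <- mean(c(NA, NA, NA), na.rm = FALSE)  # NA, the missing values propagate

# Toy data frame (invented values); every value in row 2 is negative
dframe <- data.frame(a = c(1, -2), b = c(3, -4))
dframe[dframe < 0] <- NA

# Row 1 has a normal mean; row 2 has no values left, so its mean is NaN
m <- rowMeans(dframe, na.rm = TRUE)
m
```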

To consider only non-NaN values, you can subset for not NaN using !is.nan, see this example:

mea <- c(2, 4, NaN, 6)
mea
# [1]   2   4 NaN   6
!is.nan(mea) # not NaN, output logical
# [1]  TRUE  TRUE FALSE  TRUE
mea <- mea[!is.nan(mea)]
mea
# [1] 2 4 6

Or you can replace NaN values with some desired value by setting mea[is.nan(mea)] <- ??

Djork answered Nov 22 '22 08:11


An easier way to remove all rows of your data frame that contain a negative or zero value would be:

df <- df[apply(df > 0, 1, all), ]

That way only the rows in which every value is positive remain in your data frame.
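Note that in base R, indexing a data frame with a logical matrix such as df > 0 returns a vector of the selected values rather than a data frame, so dropping whole rows needs a row-wise condition. A minimal sketch on invented data, using apply(..., 1, all) as one possible way to build that condition:

```r
# Toy data frame; the values are invented for illustration
df <- data.frame(x = c(1, -2, 3),
                 y = c(4, 5, -6))

# Matrix indexing: a plain numeric vector of the positive values
values_only <- df[df > 0]
values_only

# Row-wise condition: TRUE only for rows where every value is positive
keep <- apply(df > 0, 1, all)

# Row indexing keeps the data frame structure
df_pos <- df[keep, ]
df_pos
```

This sketch assumes the data frame has no NA values; with NAs present, the row condition itself can become NA and would need separate handling.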

Antonio López Ruiz answered Nov 22 '22 10:11


Here is another way that might help someone.

I had the same problem before and decided to use dplyr to solve it.

library("dplyr")

data <- data %>%
    filter(column > 0)

rowMeans(data, na.rm = TRUE)

I would also advise keeping both subsets (negative and positive). Sometimes they are needed later for further clarification, such as explaining why the values are negative.

resultPos2 <- result2 %>% # rows where periodBudget is positive
    filter(periodBudget > 0)

resultNeg2 <- result2 %>% # rows where periodBudget is negative
    filter(periodBudget < 0)
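For a concrete picture of this split, here is a sketch on an invented result2 (the id column and all values are assumptions made for the example; it requires the dplyr package):

```r
library(dplyr)

# Toy data; the id column and the budget values are invented for illustration
result2 <- data.frame(id = 1:5,
                      periodBudget = c(100, -50, 0, NA, 25))

# Rows with a positive budget
resultPos2 <- result2 %>%
    filter(periodBudget > 0)

# Rows with a negative budget
resultNeg2 <- result2 %>%
    filter(periodBudget < 0)

resultPos2
resultNeg2
```

Rows where periodBudget is exactly 0 or NA end up in neither subset, because filter() keeps only rows for which the condition is TRUE.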

This makes it easier to hand the data to other people and, if required, to check for errors or for the reasons why values are negative.

It is handy for financial cases or for data that has been manipulated for other employees.

luis vergara answered Nov 22 '22 10:11