I want to remove the negative values from a data frame and then calculate the mean of each row separately (the mean of the positive values in each row). I wrote the code below to remove the negative values, but it didn't work. I get the following error:
Error in `[<-.data.frame`(`*tmp*`, i, j, value = NULL) : replacement has length zero
How can I fix this problem?
for (i in 1:1000) {
  for (j in 1:20) {
    if (dframe[i, j] <= 0) dframe[i, j] <- NULL
    j = j + 1
  }
  i = i + 1
}
I want to add that it's not necessary to write a for loop; you can just set:
dframe[dframe < 0] <- NA
Here dframe < 0 gives a logical matrix that is TRUE wherever dframe is less than zero, and it can be used to index dframe and replace those values with NA.
@MrFlick explained the use of NA instead of NULL, and how to ignore NA values when calculating means of each row:
rowMeans(dframe, na.rm=TRUE)
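As a minimal sketch of the two steps together, assuming a small made-up data frame (the values below and the name `row_means` are illustrative only, not taken from the question):

```r
# Hypothetical example data: 3 rows, 3 numeric columns
dframe <- data.frame(a = c(1, -2, 3),
                     b = c(-4, 5, 6),
                     c = c(7, 8, -9))

# Replace every negative entry with NA (a logical matrix indexes the data frame)
dframe[dframe < 0] <- NA

# Mean of the remaining values in each row, ignoring the NAs
row_means <- rowMeans(dframe, na.rm = TRUE)
row_means
```

Each row here has exactly one negative entry, so each mean is taken over the two values that remain.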
Edited to answer question re: rowMeans producing NaNs and how to remove:
NA is "not available" and is a missing value indicator, while NaN is "not a number" which can be produced when the result of an arithmetic operation can't be defined numerically, e.g. 0/0. I can't see your dframe values, but I would guess that this is the result of taking the row means when all row values are NA, while setting na.rm=TRUE. See the difference between mean(c(NA, NA, NA), na.rm=TRUE) vs. mean(c(NA, NA, NA), na.rm=FALSE). You can leave NaN or decide how to define row means when all row values are negative and have been replaced by NA.
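The difference can be checked directly. The sketch below assumes a tiny invented data frame (`dframe_small`) whose second row is entirely negative, to show where a NaN row mean would come from:

```r
# Mean of a vector that has nothing left after NA removal
all_missing  <- c(NA, NA, NA)
mean_removed <- mean(all_missing, na.rm = TRUE)   # no values left to average
mean_kept    <- mean(all_missing, na.rm = FALSE)  # NAs propagate

# Hypothetical data frame: every value in row 2 is negative
dframe_small <- data.frame(a = c(1, -1), b = c(3, -2))
dframe_small[dframe_small < 0] <- NA
small_means <- rowMeans(dframe_small, na.rm = TRUE)

is.nan(mean_removed)  # TRUE
is.nan(mean_kept)     # FALSE (it is NA, not NaN)
is.nan(small_means)   # FALSE for row 1, TRUE for row 2
```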
To consider only non-NaN values, you can subset for not NaN using !is.nan; see this example:
mea <- c(2, 4, NaN, 6)
mea
# [1] 2 4 NaN 6
!is.nan(mea) # not NaN, output logical
# [1] TRUE TRUE FALSE TRUE
mea <- mea[!is.nan(mea)] # keep only the non-NaN values
mea
# [1] 2 4 6
Or you can replace NaN values with some desired value by setting mea[is.nan(mea)] <- ??
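An easier way to remove all rows that contain a negative (or zero) value from your data frame would be:
df <- df[rowSums(df <= 0) == 0, ]
That way any row containing a non-positive value is dropped from your data frame. Note that indexing with the logical matrix alone, df[df > 0], returns a vector of the positive values rather than a data frame with rows removed.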
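A small sketch of row-wise filtering on made-up data (the names `df`, `keep`, and `df_pos` are illustrative assumptions). It also shows that indexing a data frame with a logical matrix, as in `df[df > 0]`, returns a plain vector rather than a data frame:

```r
# Hypothetical example data: rows 2 and 3 each contain one negative value
df <- data.frame(x = c(1, -2, 3),
                 y = c(4, 5, -6),
                 z = c(7, 8, 9))

# Logical-matrix indexing flattens the result to a vector of positive values
flat <- df[df > 0]
is.data.frame(flat)  # FALSE

# Keep only rows in which no value is zero or negative
keep   <- rowSums(df <= 0) == 0
df_pos <- df[keep, , drop = FALSE]
df_pos
```

If the data may already contain NA values, `rowSums(df <= 0, na.rm = TRUE)` would be one way to stop those rows from producing NA in `keep`.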
Here is another way that might help someone.
I had the same problem before and decided to use dplyr for it.
library("dplyr")
# keep only the rows where `column` is positive
data <- data %>%
  filter(column > 0)
rowMeans(data, na.rm = TRUE)
I would also advise keeping both the negative and the positive subsets; sometimes they are needed later for further clarification, such as working out why the values are negative.
resultPos2 <- result2 %>% # rows where periodBudget is positive
  filter(periodBudget > 0)
resultNeg2 <- result2 %>% # rows where periodBudget is negative
  filter(periodBudget < 0)
This is handy for financial cases, or for data that has been manipulated for other employees.