
Removing NaN using dplyr

Tags: r, dplyr

I am trying to remove the rows that contain NaN values and then sort the result by row.names. I tried to do this with dplyr, but my attempt didn't work. Can someone suggest a way to fix it?

require(markovchain)
data1<-data.frame(dv=rep(c("low","high"),3),iv1=sample(c("A","B","C"),replace=T,6))
markov<-markovchainFit(data1)
markovDF<-as(markov, "data.frame")
library(dplyr)
markovDF%>%filter(rowSums>0)%>%arrange(desc(markovDF[,1]))


> markov
$estimate
             A         B         C high low
A          NaN       NaN       NaN  NaN NaN
B          NaN       NaN       NaN  NaN NaN
C          NaN       NaN       NaN  NaN NaN
high 0.3333333 0.0000000 0.6666667    0   0
low  0.6666667 0.3333333 0.0000000    0   0

GOAL:

       A   B   C high low
high .33 .00 .67    0   0
low  .67 .33 .00    0   0
Rilcon42 asked Dec 03 '15 03:12


2 Answers

It seems that nelsonauner's answer drops the row.names attribute (its output is labelled 1 and 2 instead of high and low). Since you want to sort by row.names, that is a problem.

You don't need dplyr to do this:

library(markovchain)
data1 <- data.frame(dv = rep(c("low", "high"), 3),
                    iv1 = sample(c("A", "B", "C"), 6, replace = TRUE))
markov <- markovchainFit(data1)

# Extract the transition matrix and convert it to a data frame
markov <- as.data.frame(markov$estimate@transitionMatrix)

# Remove rows that contain NaN (complete.cases() treats NaN as missing)
markov <- markov[complete.cases(markov), ]

# Sort by row name
markov <- markov[order(row.names(markov)), ]

             A         B         C high low
high 0.0000000 0.3333333 0.6666667    0   0
low  0.3333333 0.3333333 0.3333333    0   0
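
The same steps can be tried without installing markovchain. The following is only a minimal sketch: the matrix `m` is a hand-built stand-in whose values are copied from the output printed in the question, and the names `states`, `m`, `df`, `keep`, and `clean` are made up for illustration.

```r
# Hand-built stand-in for the transition matrix shown in the question
# (assumption: values copied from the printed output, not produced by markovchainFit).
states <- c("A", "B", "C", "high", "low")
m <- matrix(
  c(NaN, NaN, NaN, NaN, NaN,
    NaN, NaN, NaN, NaN, NaN,
    NaN, NaN, NaN, NaN, NaN,
    1/3, 0,   2/3, 0,   0,
    2/3, 1/3, 0,   0,   0),
  nrow = 5, byrow = TRUE,
  dimnames = list(states, states)
)

df <- as.data.frame(m)

# complete.cases() is FALSE for any row that contains NA or NaN
keep <- complete.cases(df)

# Logical row indexing keeps the row names attached to the surviving rows
clean <- df[keep, ]

# Sort by row name
clean <- clean[order(row.names(clean)), ]

print(round(clean, 2))
```

Because base subsetting with `[` carries the row names along, only the rows named high and low should remain, still labelled, which is what the question's goal asks for.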
Joel Carlson answered Oct 21 '22 02:10


There are two problems to be solved here.

  1. dplyr is meant to operate on data frames, so we need to get the data into a data frame. You attempt to do this with markovDF<-as(markov, "data.frame"), but I couldn't get that to work. (Did you get a non-empty data frame?)

  2. Remove the rows that have an NaN in a specific column (I'll use column A; you can change it to check all columns if you want).

You can solve both problems with:

> markov$estimate@transitionMatrix %>%
    as.data.frame %>%
    dplyr::filter(!is.na(A)) %>%
    arrange(-A)


          A         B         C high low
1 0.3333333 0.3333333 0.3333333    0   0
2 0.0000000 0.6666667 0.3333333    0   0
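
The output above is labelled 1 and 2 because the row names did not survive the pipeline. If you want to stay in dplyr and still sort by the row names, one option is to copy them into an ordinary column first. This is a rough sketch, not a tested fix for the original data: it assumes the tibble package is installed, uses a hand-built matrix `m` with values copied from the question's output, and the column name `state` is invented for the example.

```r
library(dplyr)
library(tibble)

# Hand-built stand-in for markov$estimate@transitionMatrix
# (assumption: values copied from the output printed in the question).
states <- c("A", "B", "C", "high", "low")
m <- matrix(
  c(NaN, NaN, NaN, NaN, NaN,
    NaN, NaN, NaN, NaN, NaN,
    NaN, NaN, NaN, NaN, NaN,
    1/3, 0,   2/3, 0,   0,
    2/3, 1/3, 0,   0,   0),
  nrow = 5, byrow = TRUE,
  dimnames = list(states, states)
)

result <- as.data.frame(m) %>%
  rownames_to_column("state") %>%  # copy row names into a real column ("state" is a made-up name)
  filter(!is.na(A)) %>%            # is.na() is TRUE for NaN as well as NA
  arrange(state) %>%               # sort by the former row names
  column_to_rownames("state")      # optional: turn the column back into row names

print(result)
```

Keeping the labels in a column means the result does not depend on how a particular dplyr version treats row names.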
rmstmppr answered Oct 21 '22 04:10