Determine if data frame is empty

Tags: dataframe, r

I have a data frame and I would like to test very quickly whether it is empty or not. I know that it either has no rows or contains only integers (no missing values). So far, I have tested five different options (see below). Does anyone have an even faster solution?

df <- data.frame(a = integer(0), b = integer(0), c = integer(0))

fa <- function(){
  nrow(df) > 0
}

fb <- function(){
  any(dim(df)[1L])
}

fc <- function(){
  (dim(df)[1L]) != 0
}

fd <- function() {
  any(.subset2(df, 1)[1])
}

fe <- function() {
  any(.subset2(df, 1))
}

library(microbenchmark)
microbenchmark(fa(), fb(), fc(), fd(), fe(), times = 1000)

And the results:

> microbenchmark(fa(), fb(), fc(), fd(), fe(), times = 1000)
Unit: nanoseconds
 expr  min   lq     mean median    uq   max neval  cld
 fa() 5664 6725 8672.462   6725 11680 47777  1000   cd
 fb() 6017 7078 8979.645   7079 12034 58041  1000    d
 fc() 6017 6372 8492.680   6725 11679 25127  1000   c 
 fd() 1062 1770 2214.170   1771  2832 14511  1000  b  
 fe()  354 1062 1359.498   1063  1770 12741  1000 a   
Tomas Greif asked Jan 23 '15 14:01



1 Answer

Since most of the objects you test are unlikely to be empty, you should be more concerned about how your functions perform on a non-empty data.frame. You should also byte-compile them to get a sense of how they would perform in a package.

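library(microbenchmark)
library(compiler)

# Each function reads the global `df` and is byte-compiled with cmpfun().

# Row count via nrow()
fa <- cmpfun(function() {
  nrow(df) > 0L
})

# any() applied to the row count from dim()
fb <- cmpfun(function() {
  any(dim(df)[1L])
})

# Row count from dim() compared with zero
fc <- cmpfun(function() {
  dim(df)[1L] != 0L
})

# any() applied to the first element of the first column
fd <- cmpfun(function() {
  any(.subset2(df, 1L)[1L])
})

# any() applied to the whole first column
fe <- cmpfun(function() {
  any(.subset2(df, 1L))
})

# Length of the first column compared with zero
ff <- cmpfun(function() {
  length(.subset2(df, 1L)) > 0L
})

# Length of the first column coerced to logical
fg <- cmpfun(function() {
  as.logical(length(.subset2(df, 1L)))
})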

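The test on an empty data.frame shows that every method finishes within a few microseconds. The .subset2() variants (fd to fg) have medians of roughly 0.9 to 1.6 microseconds, against about 6 microseconds for the nrow()/dim() variants (fa to fc).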

df <- data.frame(a = integer(0), b = integer(0), c = integer(0))
microbenchmark(fa(), fb(), fc(), fd(), fe(), ff(), fg(), times = 1000)

# Unit: nanoseconds
#  expr  min     lq median     uq   max neval
#  fa() 5685 5969.0 6165.0 6608.5 20515  1000
#  fb() 6147 6443.0 6651.0 7214.0 18117  1000
#  fc() 5726 5984.0 6152.0 6457.5 38404  1000
#  fd() 1210 1411.0 1573.0 1764.5  4933  1000
#  fe()  635  871.0 1003.0 1105.5 10225  1000
#  ff()  513  727.5  861.5  941.0  5691  1000
#  fg()  681  868.5  981.5 1080.0  2982  1000
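
Part of the gap between fa()/fb()/fc() and the .subset2() variants is likely method dispatch: nrow() calls dim(), which dispatches to dim.data.frame(), and in base R that method builds its result from .row_names_info(x, 2L) and length(x). As a rough sketch that is not part of the original answer and was not timed here, the row count can be read with that same internal call directly. The helper name has_rows is hypothetical.

```r
# Hypothetical helper (not from the answer): read the row count from the
# row-names attribute, the same call dim.data.frame() relies on.
has_rows <- function(df) .row_names_info(df, 2L) > 0L

empty_df <- data.frame(a = integer(0), b = integer(0))
full_df  <- data.frame(a = 1:3, b = 4:6)

has_rows(empty_df)
has_rows(full_df)

# Should agree with the dispatching version
identical(has_rows(full_df), nrow(full_df) > 0L)
```

Unlike the .subset2(df, 1L) variants, this does not touch any column, so it does not assume the data frame has at least one column. Whether it is actually faster on a given R version would need to be benchmarked.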

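The test on a non-empty data.frame shows that fe() is a very poor performer, with a median of about 4.5 milliseconds, because any() has to scan the entire first column (here a million zeros) before it can return. The rest stay in the low microseconds.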

df <- data.frame(a = integer(1e6), b = integer(1e6), c = integer(1e6))
microbenchmark(fa(), fb(), fc(), fd(), fe(), ff(), fg(), times = 1000)

# Unit: nanoseconds
#  expr     min      lq    median        uq      max neval
#  fa()    6569    7142    8782.0   12364.5    46749  1000
#  fb()    7034    7682    9334.5   18334.0    53172  1000
#  fc()    6539    7110    8453.5   20585.5    49912  1000
#  fd()    1171    1585    2507.5    5021.5    17641  1000
#  fe() 4340209 4413042 4460973.5 5468688.5 26045766  1000
#  ff()     637     984    1489.0    3646.5    14212  1000
#  fg()     767    1161    2401.0    4078.5   236958  1000
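
Speed aside, the any() variants test the values in the first column rather than its length, so they can give a wrong answer even under the question's assumption of integer data. The sketch below illustrates this with small data frames. The names is_nonempty, empty_df and zeros_df are made up for the example, and the length-based check assumes the data frame has at least one column.

```r
# Hypothetical helper mirroring ff(), but taking the data frame as an argument.
# Assumes df has at least one column.
is_nonempty <- function(df) length(.subset2(df, 1L)) > 0L

empty_df <- data.frame(a = integer(0), b = integer(0))
zeros_df <- data.frame(a = integer(3), b = integer(3))  # three rows of zeros

is_nonempty(empty_df)            # FALSE
is_nonempty(zeros_df)            # TRUE

# any() looks at values, not length:
any(.subset2(zeros_df, 1L))      # FALSE, although zeros_df has three rows
any(.subset2(empty_df, 1L)[1L])  # NA, because integer(0)[1L] is NA
```

In the non-empty benchmark above, df is filled with zeros, so fd() and fe() both return FALSE there even though the data frame has a million rows. The length-based ff() and fg() do not depend on the values.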
Joshua Ulrich answered Sep 21 '22 08:09