
Calculate the running time for Spark SQL

Tags:

apache-spark

I'm trying to run a couple of Spark SQL statements and want to calculate their running time.

One solution is to dig through the logs. I'm wondering whether there is a simpler way to do it, something like the following:

import time

startTimeQuery = time.time()  # wall-clock start (time.clock() was removed in Python 3.8+)
df = sqlContext.sql(query)
df.show()                     # the action forces the query to actually execute
endTimeQuery = time.time()
runTimeQuery = endTimeQuery - startTimeQuery
asked Feb 08 '16 by Fihop

1 Answer

If you're using the spark-shell (Scala), you could try defining a timing function like this:

def show_timing[T](proc: => T): T = {
    val start = System.nanoTime()
    val res = proc // call the code
    val end = System.nanoTime()
    println("Time elapsed: " + (end - start) / 1000 + " microsecs")
    res
}

Then you can try:

val df = show_timing{sqlContext.sql(query)}
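
One thing to keep in mind: sqlContext.sql(query) only builds the DataFrame and its query plan; the query doesn't actually execute until an action such as show() or count() runs, so include the action inside the timed block if you want end-to-end execution time. For the PySpark code in the question, a rough equivalent might look like the sketch below (assuming sqlContext and query are already defined as above; it measures wall-clock time only):

import time

def show_timing(proc):
    # run proc() once and report a rough wall-clock duration
    start = time.perf_counter()
    res = proc()
    end = time.perf_counter()
    print("Time elapsed: %d microsecs" % ((end - start) * 1000000))
    return res

# sql() only builds the plan; show() is the action that forces execution
df = show_timing(lambda: sqlContext.sql(query))
show_timing(lambda: df.show())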
answered Sep 17 '22 by femibyte