Scenario:
Suppose there is a table in Hive that is queried from Apache Spark using the SparkSQL code below, where the table name is passed in as an argument and concatenated into the query string.
For a non-distributed system I have a basic understanding of SQL-injection vulnerabilities, and in the JDBC context I understand the use of createStatement/prepareStatement in that kind of scenario.
But what about this scenario in SparkSQL: is this code vulnerable? Any insights?
def main(args: Array[String]) {
  val sconf = new SparkConf().setAppName("TestApp")
  val sparkContext = new SparkContext(sconf)
  val hiveSqlContext = new org.apache.spark.sql.hive.HiveContext(sparkContext)
  val tableName = args(0) // passed as an argument
  val tableData = hiveSqlContext
    .sql("select IdNum, Name from hiveSchemaName." + tableName + " where IdNum <> '' ")
    .map( x => (x.getString(0), x.getString(1)) ).collectAsMap()
  ................
  ...............
}
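To see why the concatenation above is a risk, consider what an attacker-controlled args(0) can do to the query text. The sketch below (hypothetical object and method names, standalone Scala with no Spark dependency) shows the spliced query and one common mitigation: validating that the argument is a plain identifier before it ever reaches the SQL string.

```scala
// Sketch with hypothetical names: validate the table-name argument
// before concatenating it into a SQL string.
object TableNameGuard {
  // Accept only plain identifiers: a letter or underscore, then letters/digits/underscores.
  private val Identifier = "[A-Za-z_][A-Za-z0-9_]*".r

  def validate(tableName: String): String =
    tableName match {
      case Identifier() => tableName // regex pattern match requires a full-string match
      case _            => throw new IllegalArgumentException(s"Illegal table name: $tableName")
    }

  def main(args: Array[String]): Unit = {
    // A benign argument passes through unchanged.
    println(validate("employees"))

    // A malicious argument splices arbitrary SQL into the concatenated query:
    val malicious = "employees union select Login, Password from hiveSchemaName.users --"
    val query = "select IdNum, Name from hiveSchemaName." + malicious + " where IdNum <> '' "
    println(query) // the attacker now controls part of the statement text

    // validate(malicious) would throw IllegalArgumentException instead.
  }
}
```

The guard does not make string concatenation safe in general; it just shrinks the input space to strings that cannot contain SQL metacharacters.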
You can try the following in Spark 2.0:
def main(args: Array[String]) {
  val conf = new SparkConf()
  val sparkSession = SparkSession
    .builder()
    .appName("TestApp")
    .config(conf)
    .enableHiveSupport()
    .getOrCreate()

  import sparkSession.implicits._ // required for the $"colName" syntax

  val tableName = args(0) // passed as an argument
  val tableData = sparkSession
    .table(tableName)
    .select($"IdNum", $"Name")
    .filter($"IdNum" =!= "")
    .map( x => (x.getString(0), x.getString(1)) ).collectAsMap()
  ................
  ...............
}
In Java, the most common way to handle SQL-injection threats is to use prepared statements.
You can use the Java JDBC classes (java.sql.PreparedStatement) directly from Scala, or look for Scala libraries that wrap prepared statements. Since Scala is also widely used in web applications, such libraries exist.
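One caveat worth noting: JDBC placeholders bind *values*, not identifiers, so a table name cannot be passed as a ? parameter even with a prepared statement. For identifiers like table names, a common alternative is a lookup from user input to a fixed set of known tables. A minimal sketch (hypothetical object and table names):

```scala
// Sketch with hypothetical names: resolve user input to a table from a
// fixed whitelist, since a table name cannot be a JDBC '?' parameter.
object TableWhitelist {
  // Map from the externally visible name to the fully qualified table.
  private val allowed: Map[String, String] = Map(
    "employees"   -> "hiveSchemaName.employees",
    "departments" -> "hiveSchemaName.departments"
  )

  def resolve(userInput: String): String =
    allowed.getOrElse(
      userInput,
      throw new IllegalArgumentException(s"Unknown table: $userInput")
    )
}
```

With this in place, only the fixed strings on the right-hand side of the map ever reach the query, regardless of what the caller supplies.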