
Spark doesn't respect the case sensitivity of table names

I have a problem with case sensitivity in Spark (Scala). I want to read from a Postgres table whose name contains uppercase characters, but by default Spark seems to convert the name to lowercase, and I get this error:

org.postgresql.util.PSQLException: ERROR: relation "textlogs" does not exist

val opts = Map(
  "url" -> "jdbc:postgresql://localhost:5433/sparkdb",
  "dbtable" -> "TextLogs",
  "user" -> "admin",
  "password" -> "mypassword"
)
val df = spark
           .read
           .format("jdbc")
           .options(opts)
           .load

Is there a way to force Spark to respect case sensitivity?

MrGildarts asked Jan 02 '23 21:01


1 Answer

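In Postgres, identifiers (such as table names) that are not double-quoted are folded to lowercase, which makes them effectively case-insensitive. An unquoted TextLogs is therefore resolved as textlogs, which is why the error message names the relation "textlogs".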

To make an identifier case-sensitive, you need to double-quote it. In your case that is "TextLogs", so in your code just add escaped double quotes around the table name:

val opts = Map(
  "url" -> "jdbc:postgresql://localhost:5433/sparkdb",
  "dbtable" -> "\"TextLogs\"",
  "user" -> "admin",
  "password" -> "mypassword"
)
val df = spark
           .read
           .format("jdbc")
           .options(opts)
           .load
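
If several table names need this treatment, the quoting can be wrapped in a small helper. The following is only a sketch: quoteIdent and quoteQualified are hypothetical names (they are not part of Spark or the PostgreSQL JDBC driver), and the "public" schema is an assumption used for illustration. It relies on two Postgres rules: a double quote inside a quoted identifier is escaped by doubling it, and each part of a schema-qualified name is quoted separately.

```scala
// Hypothetical helper, not part of Spark or the PostgreSQL JDBC driver.
// Wraps an identifier in double quotes and doubles any embedded double
// quote, which is how Postgres escapes quotes inside a quoted identifier.
def quoteIdent(name: String): String =
  "\"" + name.replace("\"", "\"\"") + "\""

// Quote each part of a schema-qualified name separately, so the dot
// stays a separator instead of becoming part of the table name.
def quoteQualified(parts: String*): String =
  parts.map(quoteIdent).mkString(".")

val opts = Map(
  "url" -> "jdbc:postgresql://localhost:5433/sparkdb",
  // "public" is an assumed schema name; use the schema the table lives in.
  "dbtable" -> quoteQualified("public", "TextLogs"),
  "user" -> "admin",
  "password" -> "mypassword"
)
```

The resulting opts map can be passed to spark.read.format("jdbc").options(opts) exactly as in the answer above.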
Łukasz Kamiński answered Jan 11 '23 20:01