Load data with where clause in spark dataframe

I have an Oracle table with n records, and I want to load the data from that table into a Spark DataFrame with a where/filter condition applied. I do not want to load the complete table into a DataFrame and then filter it. Is there an option for this in spark.read.format("jdbc"), or any other solution?

asked Oct 20 '25 by CUTLER

1 Answer

Check the code below. You can write your own query inside the query variable. To load the data in parallel, use the partitionColumn, lowerBound and upperBound options together with numPartitions.

val query = """
  (select columnA,columnB from table_name
    where <where conditions>) table
"""  
val options = Map(
    "url"              -> "<url>",
    "driver"           -> "<driver class>",
    "user"             -> "<user>",
    "password"         -> "<password>",
    "dbtable"          -> query,
    "partitionColumn"  -> "<numeric partition column>",
    "lowerBound"       -> "<lower bound value>",
    "upperBound"       -> "<upper bound value>",
    "numPartitions"    -> "<number of partitions>" // required when partitionColumn is set
)

val df = spark
        .read
        .format("jdbc")
        .options(options)
        .load()
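
If you are on Spark 2.4 or later, the query option is a slightly simpler way to push the WHERE clause down to Oracle, and Spark also pushes simple DataFrame filters down to JDBC sources on its own. A minimal sketch, assuming placeholder connection details (note that query cannot be combined with partitionColumn):

// Spark 2.4+: pass the query directly; it is executed on the Oracle side.
val dfQuery = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//<host>:<port>/<service>") // placeholder URL
  .option("driver", "oracle.jdbc.OracleDriver")
  .option("user", "<user>")
  .option("password", "<password>")
  .option("query", "select columnA, columnB from table_name where <where conditions>")
  .load()

// Alternatively, a simple filter applied right after a plain JDBC load is
// pushed down to the database as a WHERE clause, so the full table is still
// not pulled into Spark:
val dfFilter = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//<host>:<port>/<service>")
  .option("driver", "oracle.jdbc.OracleDriver")
  .option("user", "<user>")
  .option("password", "<password>")
  .option("dbtable", "table_name")
  .load()
  .where("columnA = 'some_value'") // pushed down to Oracle for simple predicates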

answered Oct 23 '25 by Srinivas


