I have a requirement in my project where I am implementing an SCD Type 2 table in Azure SQL DW. I am able to insert new records using the JDBC connector, but I also need to mark old records as "expired" and update other records with the new values.
The recommended pattern is to either use the Databricks Azure SQL DW connector to bulk load a staging table, or write the new data to files in Blob Storage or Data Lake and use a PolyBase external table to hold the new rows.
Then, once the new data is either loaded into a staging table or available through a PolyBase external table, write a T-SQL stored procedure to "update old records as 'expired' and update other records as per updated values".
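For illustration, here is a minimal sketch of what such a procedure might contain, assuming hypothetical tables dbo.dim_customer (the SCD Type 2 dimension, with is_current, start_date, and end_date tracking columns) and dbo.stg_customer (the freshly loaded staging data); adjust the names, keys, and change detection to your own schema:

CREATE PROCEDURE dbo.usp_apply_scd2_customer
AS
BEGIN
    -- Expire the current version of every key that arrived in staging.
    UPDATE dbo.dim_customer
    SET    is_current = 0,
           end_date   = GETUTCDATE()
    WHERE  is_current = 1
      AND  customer_id IN (SELECT customer_id FROM dbo.stg_customer);

    -- Insert each staged row as the new current version.
    INSERT INTO dbo.dim_customer
        (customer_id, name, city, start_date, end_date, is_current)
    SELECT customer_id, name, city, GETUTCDATE(), NULL, 1
    FROM   dbo.stg_customer;
END;

This sketch assumes every staged row represents a new or changed record; if unchanged rows can arrive in staging, add a column comparison so that only genuine changes expire the old version and open a new one.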
Spark itself only knows how to run queries and load tables, so the stored procedure has to be invoked another way. You already have the SQL Server JDBC driver installed, so you can open a plain JDBC connection from Scala or Java and execute the procedure. For example:
%scala
import java.sql.DriverManager

// Read the SQL credentials from a Databricks secret scope.
val jdbcUsername = dbutils.secrets.get(scope = "kv", key = "sqluser")
val jdbcPassword = dbutils.secrets.get(scope = "kv", key = "sqlpassword")

// Make sure the SQL Server JDBC driver is registered.
val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
Class.forName(driverClass)

// Create the JDBC URL without the user and password; they are passed to getConnection below.
val jdbcUrl = "jdbc:sqlserver://xxxxx.database.windows.net:1433;database=AdventureWorks;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"

val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
try {
  val stmt = connection.createStatement()
  // Execute the stored procedure that applies the SCD Type 2 changes.
  val sql = """
    exec usp_someproc ...
  """
  stmt.execute(sql)
} finally {
  connection.close()
}