Not sure why this isn't working but I'm just trying to apply the below but still getting spark's version of the schema for the table (mysql) containing text
instead of varchar(128) I'm trying to specify. Trying to just create custom datatypes for my columns with jdbc write. Trying with spark 2.1.0:
def df2DB(
df: DataFrame,
batchSize: Int,
numPartitions: Int,
database: String,
table: String): Unit = {
val mdb = new MetadataBuilder()
mdb.putString("col1", "INT")
mdb.putString("col2", "VARCHAR(128)")
mdb.putString("col3", "VARCHAR(128)")
val createTableColTypes = mdb.build().json
df.write.format("jdbc")
.option("createTableColumnTypes", createTableColTypes)
.option("url", url)
.option("dbtable", s"${database}.${table}")
.option("user", user)
.option("password", pass)
.option("driver", driver)
.option("batchsize", batchSize)
.option("numPartitions", numPartitions)
.save()
}
I also tried this format without success:
df.write.format("jdbc")
.mode(SaveMode.Overwrite)
.option("url", url)
.option("dbtable", s"${database}.${table}")
.option("user", user)
.option("password", pass)
.option("driver", driver)
.option("batchsize", batchSize)
.option("numPartitions", numPartitions)
.option("createTableColumnTypes", "COL1 INT, COL2 VARCHAR(128)" )
.save()
Also even when I try using createTableOptions
like this I'm getting a sql syntax error. I'm not finding any good examples of using these options either together or separately:
.option("createTableOptions", "CREATE TABLE tbl1 (col1 int, col2 VARCHAR(128))").save()
Instead of creating JSON of your column name to datatype, try using a comma separated list of column name with data type something like this:
def df2DB(
df: DataFrame,
batchSize: Int,
numPartitions: Int,
database: String,
table: String): Unit = {
df.write.format("jdbc")
.option("createTableColumnTypes", "col1 INT, col2 VARCHAR(128), col3 VARCHAR(128)")
.option("url", url)
.option("dbtable", s"${database}.${table}")
.option("user", user)
.option("password", pass)
.option("driver", driver)
.option("batchsize", batchSize)
.option("numPartitions", numPartitions)
.save()
}
Reference: https://github.com/apache/spark/blob/aa4cf2b19e4cf5588af7e2192e0e9f687cd84bc5/examples/src/main/python/sql/datasource.py#L210
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With