I am trying to load a table from an SQLite .db file stored on local disk. Is there any clean way to do this in PySpark?
Currently, I am using a solution that works but is not elegant: I first read the table into pandas through sqlite3. One concern is that schema information is not passed along in the process (which may or may not be a problem). I am wondering whether there is a direct way to load the table without going through pandas.
import sqlite3
import pandas as pd

db_path = 'alocalfile.db'
query = 'SELECT * FROM ATableToLoad'

# Read the table into a pandas DataFrame via sqlite3
conn = sqlite3.connect(db_path)
a_pandas_df = pd.read_sql_query(query, conn)

# Convert to a Spark DataFrame (sqlContext is the SQLContext instance
# provided by the PySpark shell)
a_spark_df = sqlContext.createDataFrame(a_pandas_df)
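Regarding the schema concern in this workaround, a minimal sketch of passing an explicit schema to createDataFrame; the column names and types below are hypothetical and would need to match the real columns of ATableToLoad:

from pyspark.sql.types import StructType, StructField, IntegerType, StringType

# Hypothetical schema for ATableToLoad; adjust the fields to the actual table
a_schema = StructType([
    StructField('id', IntegerType(), True),
    StructField('name', StringType(), True),
])
a_spark_df = sqlContext.createDataFrame(a_pandas_df, schema=a_schema)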
There seems to be a way to do this using JDBC, but I have not figured out how to use it in PySpark.
First, you need to start up pyspark with the JDBC driver jar on the classpath. Download the SQLite JDBC driver and provide its jar path in the command below: https://bitbucket.org/xerial/sqlite-jdbc/downloads/sqlite-jdbc-3.8.6.jar
pyspark --conf spark.executor.extraClassPath=<jdbc.jar> --driver-class-path <jdbc.jar> --jars <jdbc.jar> --master <master-URL>
For an explanation of the above pyspark command, see the following post:
Apache Spark : JDBC connection not working
Now, to read the SQLite database file, simply load it into a Spark DataFrame:
df = sqlContext.read.format('jdbc') \
    .options(url='jdbc:sqlite:Chinook_Sqlite.sqlite',
             dbtable='employee',
             driver='org.sqlite.JDBC') \
    .load()
Run df.printSchema() to see the schema of the loaded table.
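As a follow-up (not part of the original answer), once the DataFrame is loaded you can register it as a temporary view and query it with Spark SQL; this sketch assumes Spark 2.x, where createOrReplaceTempView is available:

# Register the JDBC-backed DataFrame as a temp view and query it with Spark SQL
df.createOrReplaceTempView('employee')
sqlContext.sql('SELECT * FROM employee LIMIT 5').show()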
Full Code:- https://github.com/charles2588/bluemixsparknotebooks/blob/master/Python/sqllite_jdbc_bluemix.ipynb
Thanks, Charles.
Based on @charles gomes' answer:
from pyspark.sql import SparkSession

# Pull the SQLite JDBC driver from Maven when the session starts
spark = SparkSession.builder \
    .config('spark.jars.packages', 'org.xerial:sqlite-jdbc:3.34.0') \
    .getOrCreate()

df = spark.read.format('jdbc') \
    .options(driver='org.sqlite.JDBC', dbtable='my_table',
             url='jdbc:sqlite:/my/path/alocalfile.db') \
    .load()
For other JAR versions, please refer to the Maven Repository.
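Beyond loading a whole table, the JDBC source can also push a query down to SQLite instead of reading everything. A minimal sketch, assuming Spark 2.4+ (where the query option is supported) and the alocalfile.db path from the question; the WHERE clause and column name are made-up examples:

# Push a query down to SQLite rather than loading the full table
df_subset = spark.read.format('jdbc') \
    .options(driver='org.sqlite.JDBC',
             url='jdbc:sqlite:/my/path/alocalfile.db',
             query='SELECT * FROM ATableToLoad WHERE id > 100') \
    .load()
df_subset.show()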