I've been reading about DuckDB recently, and most of the examples involve having some data already in an R session and then pushing that data into DuckDB.
Here is a basic example of that using the iris dataset:
library("DBI")
con = dbConnect(duckdb::duckdb(), ":memory:")
dbWriteTable(con, "iris_table", iris)
dbGetQuery(con, 'SELECT "Species", MIN("Sepal.Width") FROM iris_table GROUP BY "Species"')
Let's say I have data in a SQL Server table and want to write that data directly into a DuckDB database.
Is there a way to do this?
If I had a SQL query such as
'SELECT * FROM iris_table'
and wanted to read the result directly into DuckDB, how would that work? I haven't seen any examples of this online.
I would try to export from SQL Server to Parquet files, and then either query those files directly from DuckDB or import them into DuckDB.
DuckDB runs as an embedded, in-memory database, so there is currently no direct connection to a SQL Server instance. What you can do instead is:
first, load the data from SQL Server into a pandas DataFrame using pyodbc, and then query that DataFrame from DuckDB.
pip install pyodbc
pip install duckdb
pip install pandas
import pyodbc
import pandas as pd
import duckdb
SERVER = '<server-address>'
DATABASE = '<database-name>'
USERNAME = '<username>'
PASSWORD = '<password>'
cnx = f'DRIVER={{ODBC Driver 18 for SQL Server}};SERVER={SERVER};DATABASE={DATABASE};UID={USERNAME};PWD={PASSWORD}'
conn = pyodbc.connect(cnx)
SQL_QUERY = """
SELECT
*
FROM
Sales;
"""
cursor = conn.cursor()
cursor.execute(SQL_QUERY)
records = cursor.fetchall()
After you pull the data from SQL Server, load it into a pandas DataFrame:
df = pd.DataFrame.from_records(records, columns=[col[0] for col in cursor.description])
Then query the data from the pandas DataFrame:
duckdb.sql('SELECT * FROM df').fetchall()