I am using the sql package in a Jupyter notebook and I understand how to use variables in my query:
client = "Disney"
queryid = %sql SELECT * FROM mytable WHERE name = :client
What I don't understand is how to pass a list to my query, like:
clients = ["Disney", "Netflix", "Sky"]
queryid = %sql SELECT * FROM mytable WHERE name in (:clients)
This raises an error which states that my SQL is wrong. What is the way to handle lists in this setting?
With a demo case for sqlite3:
In [1]: import sqlite3
In [2]: conn = sqlite3.connect('example.db')
In [3]: c = conn.cursor()
In [4]: c.execute('''CREATE TABLE stocks
...: (date text, trans text, symbol text, qty real, price real)''')
...:
...: # Insert a row of data
...: c.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','RHAT',100,35.1
...: 4)")
...:
...: # Save (commit) the changes
...: conn.commit()
...:
In [5]: # Larger example that inserts many records at a time
...: purchases = [('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
...: ('2006-04-05', 'BUY', 'MSFT', 1000, 72.00),
...: ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
...: ]
...: c.executemany('INSERT INTO stocks VALUES (?,?,?,?,?)', purchases)
I can fetch values that match several strings with:
In [31]: c.execute('SELECT * FROM stocks WHERE symbol IN (?,?)',('IBM','RHAT'))
Out[31]: <sqlite3.Cursor at 0xaf703fa0>
In [32]: c.fetchall()
Out[32]:
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]
Or with the generalized solution from Parameter substitution for a SQLite "IN" clause
In [33]: alist=['RHAT','IBM']
In [34]: c.execute('SELECT * FROM stocks WHERE symbol IN (%s)' %
...: ','.join('?'*len(alist)),
...: alist)
...:
Out[34]: <sqlite3.Cursor at 0xaf703fa0>
In [35]: c.fetchall()
Out[35]:
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]
c.execute('SELECT * FROM stocks WHERE symbol IN (:1,:2)',alist), and possibly other forms.
see also:
sqlite3 "IN" clause
I assume MYSQL and %sql iterface behaves the same; but I don't have those installed.
With proper quoting literals also work (again sqlite3)
c.execute('SELECT * FROM stocks WHERE symbol IN ("IBM","RHAT")')
or
In [80]: 'SELECT * FROM stocks WHERE symbol IN (%s)'%','.join('"%s"'%x for x in alist)
Out[80]: 'SELECT * FROM stocks WHERE symbol IN ("RHAT","IBM")'
In [81]: c.execute(_)
So I'm guessing that:
%sql SELECT * FROM stocks WHERE symbol IN ("IBM","RHAT")
would work even if some form of (:....) does not.
I installed %sql
In [5]: %%sql
...: sqlite:///example.db
...:
Out[5]: 'Connected: [email protected]'
In [7]: %sql SELECT * from stocks
Done.
Out[7]:
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
('2006-04-05', 'BUY', 'MSFT', 1000.0, 72.0),
('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]
In [9]: %sql SELECT * from stocks where symbol in ('IBM')
Done.
Out[9]:
[('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]
In [10]: %sql SELECT * from stocks where symbol in ('IBM','RHAT')
Done.
Out[10]:
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]
The string formatting approach works:
In [11]: alist=['RHAT','IBM']
In [12]: cmd='SELECT * FROM stocks WHERE symbol IN (%s)'%','.join('"%s"'%x for x
...: in alist)
In [13]: cmd
Out[13]: 'SELECT * FROM stocks WHERE symbol IN ("RHAT","IBM")'
In [14]: %sql $cmd
Done.
Out[14]:
[('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14),
('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]
The : syntax isn't well documented. It isn't clear who's implementing it. ($ is standard Ipython variable substitution).
In [18]: sym='IBM'
In [19]: %sql SELECT * from stocks where symbol in (:sym)
Done.
Out[19]:
[('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0),
('2006-04-06', 'SELL', 'IBM', 500.0, 53.0)]
symbol in (:sym1,:sym2) works
So far I don't see evidence that %sql works with the conventional SQL placeholder syntax.
Looks like you (?) submitted and closed a issue on github, https://github.com/catherinedevlin/ipython-sql/issues/92
Adapting that solution to quote strings:
In [74]: mystring = '({})'.format(','.join('"{}"'.format(e) for e in alist))
In [75]: mystring
Out[75]: '("RHAT","IBM")'
In [76]: %sql SELECT * from stocks where symbol in $mystring
Done.
In other words, use the ipython $ injection as opposed to the : form.
Looking at the ipython-sql source code:
ipython-sql/blob/master/src/sql/run.py
def run(conn, sql, config, user_namespace):
...
txt = sqlalchemy.sql.text(statement)
result = conn.session.execute(txt, user_namespace)
It looks like the :name syntax is a sqlalchemy bind parameter, and is handled with sqlalchemy.sql.text and sqlalchemy.sql.bindparam
(http://docs.sqlalchemy.org/en/latest/orm/tutorial.html#orm-tutorial-literal-sql)
This error indicates that each bindparameter is translated into a ? placeholder, plus as matching parameters entry:
In [96]: %sql SELECT * from stocks where symbol in :mystring
(sqlite3.OperationalError) near "?": syntax error [SQL: 'SELECT * from stocks where symbol in ?'] [parameters: ('("RHAT","IBM")',)]
So my original solution of generating IN (?,?,...) to match the length of the list is the right SQL, even though it does not work with sqlalchemy and %sql.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With