I am trying to refactor my code to PEP8 standards for readability but I'm struggling to escape quotes in my SQL queries.
I have 2 queries. The first is a simple SQL query. The second is a Redshift UNLOAD command.
query = '''SELECT * FROM redshift_table
LEFT JOIN
(SELECT DISTINCT * FROM redshift_view) v
ON redshift_table.account_number = v.card_no
WHERE timestamp < date_trunc('day', CURRENT_DATE)
AND timestamp >= (CURRENT_DATE - INTERVAL '1 days')'''
unload = '''UNLOAD ('%s') to '%s'
credentials 'aws_access_key_id=%s;aws_secret_access_key=%s'
delimiter as '%s' parallel off ALLOWOVERWRITE''' % (query, s3_path, access_key, aws_secret, file_delimiter)
Because the SQL query is embedded inside the UNLOAD command, I can only make it work by escaping each quote with 3 backslashes: 'day' becomes \\\'day\\\'.
This isn't ideal and I was wondering if there was a way around it.
Any help is greatly appreciated. Thank you.
Masashi's answer won't always work, for example if there are already escaped strings in the query (e.g. in regular expressions).
For a more robust solution, you can take advantage of dollar quoting. The Redshift documentation points to this page in the section "Dollar-quoted String Constants". Instead of escaping quotes manually, you can use dollar signs to quote your query like so:
unload = """UNLOAD ($$ %s $$) ...as""" % (query, ...)
Granted, this will also break if you use $$ anywhere in your query. If you want to be really safe, use a random tag inside the dollar signs:
import uuid
# Leading underscore is necessary since this needs to be a valid
# identifier, i.e. can only start with a letter or underscore.
quote_tag = '_' + uuid.uuid4().hex
# %-style placeholders need the % operator; str.format() would leave them untouched.
unload = 'UNLOAD ($%s$ %s $%s$) ...' % (quote_tag, query, quote_tag, ...)
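To make the tagged dollar-quoting idea concrete, here is a minimal, self-contained sketch. The helper name build_unload and its parameters are hypothetical (they are not part of any library), and it assumes Redshift accepts a tagged dollar-quoted string in UNLOAD as described above. The remaining clauses simply mirror the UNLOAD statement from the question.

```python
import uuid


def build_unload(query, s3_path, access_key, secret_key, delimiter):
    """Wrap `query` in a tagged dollar-quoted UNLOAD statement.

    Hypothetical helper for illustration only.
    """
    # The leading underscore keeps the tag a valid identifier even when
    # the random hex string happens to start with a digit.
    tag = '_' + uuid.uuid4().hex
    # Extremely unlikely, but regenerate if the query already contains the tag.
    while '$' + tag + '$' in query:
        tag = '_' + uuid.uuid4().hex
    quote = '$' + tag + '$'

    template = (
        "UNLOAD ({q}{query}{q}) TO '{path}' "
        "CREDENTIALS 'aws_access_key_id={key};aws_secret_access_key={secret}' "
        "DELIMITER AS '{delim}' PARALLEL OFF ALLOWOVERWRITE"
    )
    return template.format(
        q=quote,
        query=query,
        path=s3_path,
        key=access_key,
        secret=secret_key,
        delim=delimiter,
    )
```

With this, the inner query can keep its own single quotes (for example 'day') without any backslashes. Note that the other interpolated values (path, credentials, delimiter) are still placed inside ordinary single quotes and are not escaped here, so they should come from trusted configuration.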