I'm writing a script to copy some data between two machines on the same network using psycopg2. I'm replacing some old, ugly bash that does the copy with:
psql -h remote.host -c "COPY table TO STDOUT" | psql -c "COPY table FROM STDIN"
This seems like both the simplest and most efficient way to do the copy. It's easy to replicate in Python with a StringIO or a temp file, like so:
from io import StringIO
import os

buf = StringIO()
from_curs = from_conn.cursor()
to_curs = to_conn.cursor()
from_curs.copy_expert("COPY table TO STDOUT", buf)
buf.seek(0, os.SEEK_SET)  # rewind before reading the buffer back
to_curs.copy_expert("COPY table FROM STDIN", buf)
...but that means buffering all the data on disk or in memory.
Has anyone figured out a way to mimic the behavior of a Unix pipe in a copy like this? I can't seem to find a pipe-like object that doesn't involve subprocess.Popen. Maybe the best solution is just to use subprocess.Popen after all.
You will have to put one of your calls in a separate thread. I just realized you can use os.pipe(), which makes the rest quite straightforward:
#!/usr/bin/python
import os
import threading

import psycopg2

fromdb = psycopg2.connect("dbname=from_db")
todb = psycopg2.connect("dbname=to_db")

r_fd, w_fd = os.pipe()

def copy_from():
    # Runs in a separate thread: blocks reading from the pipe
    # until the writer on the other end produces data.
    cur = todb.cursor()
    cur.copy_from(os.fdopen(r_fd), 'table')
    cur.close()
    todb.commit()

to_thread = threading.Thread(target=copy_from)
to_thread.start()

cur = fromdb.cursor()
write_f = os.fdopen(w_fd, 'w')
cur.copy_to(write_f, 'table')
write_f.close()  # close the write end, or the reader never sees EOF and deadlocks
to_thread.join()
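The pipe-plus-thread mechanism can be exercised without a database at all, which may help when debugging deadlocks. Here is a minimal sketch where a consumer thread stands in for copy_from() and the main thread stands in for copy_to(); the row names and values are made up for illustration:

```python
import os
import threading

r_fd, w_fd = os.pipe()
received = []

def consume():
    # Stands in for cur.copy_from(...): reads lines as they arrive,
    # returning only when the write end of the pipe is closed.
    with os.fdopen(r_fd) as reader:
        for line in reader:
            received.append(line.rstrip("\n"))

t = threading.Thread(target=consume)
t.start()

# Stands in for cur.copy_to(...): streams tab-separated rows into the pipe.
with os.fdopen(w_fd, "w") as writer:
    for i in range(3):
        writer.write("row-%d\tvalue-%d\n" % (i, i))
# Leaving the with-block closes w_fd, which unblocks the consumer.

t.join()
print(received)
```

The key point is the same as in the real script: the writer must close its end of the pipe, otherwise the reader blocks forever waiting for more data.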