 

python subprocess and mysqldump

I know parts of this question have been asked before, but I have some related questions.

I'm trying to execute

mysqldump -u uname -ppassword --add-drop-database --databases databaseName | gzip > fileName

I'm potentially dumping a very large (200GB?) db. Is that in itself a dumb thing to do? I then want to send the zipped file over the network for storage, delete the local dump, and purge a couple of tables.

Anyway, I was using subprocess like this, because there doesn't seem to be a way to execute the entire original command without subprocess treating | as a table name:

from subprocess import Popen, PIPE

f = open(FILENAME, 'wb')
args = ['mysqldump', '-u', 'UNAME', '-pPASSWORD', '--add-drop-database', '--databases', 'DB']

p1 = Popen(args, stdout=PIPE)
p2 = Popen('gzip', stdin=p1.stdout, stdout=f)
p2.communicate()

but then I read that communicate caches the data in memory, which wouldn't work for me. Is this true?

What I ended up doing for now is:

import subprocess
import gzip

f = open(filename, 'wb')
subprocess.call(args, stdout=f)
f.close()

f = open(filename, 'rb')
zipFilename = filename + '.gz'
f2 = gzip.open(zipFilename, 'wb')
f2.writelines(f)
f2.close()
f.close()

Of course this takes a million years, and I hate it.
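(As an aside, the re-compression step can be streamed in fixed-size chunks with `shutil.copyfileobj`, which avoids the line-by-line iteration that `writelines()` does on a huge dump. A minimal sketch; the small sample file created here is just a stand-in for the real dump:)

```python
import gzip
import shutil

# Create a small stand-in for the real dump file (illustration only).
with open('dump.sql', 'wb') as f:
    f.write(b'-- sample dump contents\n' * 1000)

# copyfileobj streams in fixed-size chunks, so memory use stays flat
# no matter how large the dump is; f2.writelines(f) iterates
# line-by-line, which is much slower on a multi-gigabyte file.
with open('dump.sql', 'rb') as src, gzip.open('dump.sql.gz', 'wb') as dst:
    shutil.copyfileobj(src, dst, length=1024 * 1024)  # 1 MiB chunks
```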

My questions:

1. Can I use my first approach on a very large db?
2. Could I possibly pipe the output of mysqldump to a socket and fire it across the network and save it when it arrives, rather than sending a zipped file?

Thanks!

Zobal asked Dec 07 '22


2 Answers

You don't need communicate(). It's only there as a convenience method if you want to read stdout/stderr to completion. But since you are chaining the commands, the pipe is doing that for you; just wait for them to complete.

from subprocess import Popen, PIPE

args = ['mysqldump', '-u', 'UNAME', '-pPASSWORD', '--add-drop-database', '--databases', 'DB']

with open(FILENAME, 'wb', 0) as f:
    p1 = Popen(args, stdout=PIPE)
    p2 = Popen('gzip', stdin=p1.stdout, stdout=f)
    p1.stdout.close()  # force write error (/SIGPIPE) if p2 dies
    p2.wait()          # keep f open until gzip has finished writing
p1.wait()
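If this runs unattended, it's also worth checking both exit statuses before trusting the dump. A minimal sketch of the same pipeline; here echo is a hypothetical stand-in for the real mysqldump arguments:

```python
from subprocess import Popen, PIPE

# Stand-in for the real mysqldump invocation (illustration only).
args = ['echo', 'dump data']

with open('backup.sql.gz', 'wb', 0) as f:
    p1 = Popen(args, stdout=PIPE)
    p2 = Popen(['gzip'], stdin=p1.stdout, stdout=f)
    p1.stdout.close()  # let p1 see SIGPIPE if gzip dies
    p2.wait()
p1.wait()

# A non-zero status from either process means the backup is suspect.
if p1.returncode != 0 or p2.returncode != 0:
    raise RuntimeError('backup pipeline failed')
```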
tdelaney answered Dec 18 '22


You are quite close to what you want:

from subprocess import Popen, PIPE

f = open(FILENAME, 'wb')
args = ['mysqldump', '-u', 'UNAME', '-pPASSWORD', '--add-drop-database', '--databases', 'DB']

p1 = Popen(args, stdout=PIPE)

Up to here it is right.

p2 = Popen('gzip', stdin=p1.stdout, stdout=PIPE)

This one takes p1's output and compresses it. Afterwards we can (and should) immediately call p1.stdout.close().

Now we have p2.stdout, which can be read from and, without using a temporary file, sent over the network:

import socket

s = socket.create_connection(('remote_pc', port))
while True:
    r = p2.stdout.read(65536)
    if not r:
        break
    s.sendall(r)  # sendall() retries partial sends, unlike send()
s.close()
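On the receiving side, a minimal sketch (the port number and file path are placeholders) that accepts one connection and streams the incoming bytes straight to disk:

```python
import socket

def receive_dump(port, out_path):
    """Accept one connection and write everything received to out_path."""
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(('', port))
    srv.listen(1)
    conn, _ = srv.accept()
    with conn, open(out_path, 'wb') as out:
        while True:
            chunk = conn.recv(65536)
            if not chunk:       # sender closed the connection
                break
            out.write(chunk)
    srv.close()
```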
glglgl answered Dec 17 '22