I am writing Python code to copy a bunch of big files/folders from one location to another on the desktop (no network, everything is local). I am using the shutil module for that.
The problem is that it takes a long time, so I want to speed up the copy process. I tried using the threading and multiprocessing modules, but to my surprise both take more time than the sequential code.
One more observation: the time required increases as the number of processes increases, even for the same set of folders. For example, suppose I have the following directory structure
/a/a1, /a/a2, /b/b1 and /b/b2
If I create 2 processes to copy folders a and b, it takes, say, 2 minutes. If I instead create 4 processes to copy a/a1, a/a2, b/b1 and b/b2, it takes around 4 minutes. I tried this only with multiprocessing, not threading.
I am not sure what is going on. Has anyone had a similar problem? Are there any best practices you can share for using multiprocessing/threading in Python?
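For reference, a minimal sketch of roughly what my multiprocessing attempt looks like (the source/destination paths and one-process-per-folder layout are illustrative, not my exact code):

```python
import shutil
from multiprocessing import Process

def copy_folder(src, dst):
    # Copy an entire directory tree; dst must not already exist
    shutil.copytree(src, dst)

if __name__ == "__main__":
    # Hypothetical source -> destination pairs matching the example layout
    jobs = [
        ("/a/a1", "/dest/a/a1"),
        ("/a/a2", "/dest/a/a2"),
        ("/b/b1", "/dest/b/b1"),
        ("/b/b2", "/dest/b/b2"),
    ]
    processes = [Process(target=copy_folder, args=(s, d)) for s, d in jobs]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```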
Thanks Abhijit
Your problem is likely IO-bound, so more computational parallelization won't help. As you've seen, you're probably making the problem worse: the disk now has to jump back and forth between IO requests issued by the various threads/processes instead of streaming one file at a time. There are practical limits to how fast you can move data on a computer, especially where a disk is involved.
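An easy way to convince yourself of this is to time a plain sequential copy of the same folders and compare it against your parallel version. A minimal sketch, using the folder names from your example and an illustrative destination path:

```python
import shutil
import time

# Folders from the question; the /dest prefix is just an example target
folders = ["/a/a1", "/a/a2", "/b/b1", "/b/b2"]

start = time.time()
for src in folders:
    # Sequential copy: one directory tree at a time
    shutil.copytree(src, "/dest" + src)
print("sequential copy took %.1f s" % (time.time() - start))
```

If the sequential version is as fast as (or faster than) the parallel one, the bottleneck is the disk, not the CPU, and adding workers only adds overhead.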