How do I upload many files at a time to Cloud Files with Python?

I'm using the cloudfiles module to upload files to Rackspace Cloud Files, with something like this pseudocode:

import cloudfiles

username = '---'
api_key = '---'

conn = cloudfiles.get_connection(username, api_key)
testcontainer = conn.create_container('test')

for f in get_filenames():
    obj = testcontainer.create_object(f)
    obj.load_from_filename(f)

My problem is that I have a lot of small files to upload, and it takes too long this way.

Buried in the documentation, I see that there is a ConnectionPool class, which supposedly can be used to upload files in parallel.

Could someone please show how I can make this piece of code upload more than one file at a time?

Hobhouse asked Mar 09 '11 16:03


1 Answer

The ConnectionPool class is meant for a multithreaded application that occasionally has to send something to Rackspace.

That way the threads can reuse connections, and you don't have to keep 100 connections open just because you have 100 threads.
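To make the pooling idea concrete, here is a minimal sketch of the general pattern using only the standard library. It is not the cloudfiles ConnectionPool API: FakeConnection, SimplePool and run_demo are hypothetical names invented for illustration, and the "connection" only records what it was asked to send.

```python
import queue
import threading


class FakeConnection:
    """Hypothetical stand-in for a real connection; it only records payloads."""

    def __init__(self, ident):
        self.ident = ident
        self.sent = []

    def send(self, payload):
        self.sent.append(payload)


class SimplePool:
    """Holds a fixed number of connections; get() blocks until one is free."""

    def __init__(self, size):
        self.connections = [FakeConnection(i) for i in range(size)]
        self._free = queue.Queue()
        for conn in self.connections:
            self._free.put(conn)

    def get(self):
        return self._free.get()

    def put(self, conn):
        self._free.put(conn)


def run_demo(num_threads=100, pool_size=4):
    """Let many threads send one message each through a small pool."""
    pool = SimplePool(pool_size)
    used = set()
    lock = threading.Lock()

    def worker(n):
        conn = pool.get()  # borrow a connection, waiting if none is free
        try:
            conn.send("message %d" % n)
            with lock:
                used.add(conn.ident)
        finally:
            pool.put(conn)  # always hand the connection back

    threads = [threading.Thread(target=worker, args=(n,))
               for n in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    total_sent = sum(len(conn.sent) for conn in pool.connections)
    return used, total_sent
```

With 100 threads and a pool of 4, every message is sent, yet no more than 4 connections ever exist. That is the trade-off a pool makes: it caps open connections, it does not by itself make uploads run in parallel.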

You are simply looking for a multithreading/multiprocessing uploader. Here's an example using the multiprocessing library:

import cloudfiles
import multiprocessing

USERNAME = '---'
API_KEY = '---'


def get_container():
    conn = cloudfiles.get_connection(USERNAME, API_KEY)
    testcontainer = conn.create_container('test')
    return testcontainer

def uploader(filenames):
    '''Worker process to upload the given files'''
    container = get_container()

    # Keep going till you reach STOP
    for filename in iter(filenames.get, 'STOP'):
        # Create the object and upload
        obj = container.create_object(filename)
        obj.load_from_filename(filename)

def main():
    NUMBER_OF_PROCESSES = 16

    # Add your filenames to this queue
    filenames = multiprocessing.Queue()

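    # Start worker processes and keep a handle to each one
    workers = []
    for _ in range(NUMBER_OF_PROCESSES):
        worker = multiprocessing.Process(target=uploader, args=(filenames,))
        worker.start()
        workers.append(worker)

    # You can keep adding tasks until you add STOP
    filenames.put('some filename')

    # Send one STOP per worker so every child process exits its loop
    for _ in range(NUMBER_OF_PROCESSES):
        filenames.put('STOP')

    # Wait for all uploads to finish before returning
    for worker in workers:
        worker.join()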

if __name__ == '__main__':
    multiprocessing.freeze_support()
    main()
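Since uploading is mostly waiting on the network, a thread-based variant may work just as well as separate processes. The sketch below uses concurrent.futures from the Python 3 standard library. The names upload_all and upload_one are hypothetical: upload_one is a callable you supply, for example one that creates the object and calls load_from_filename as in the code above. Whether a single cloudfiles connection can be shared safely between threads is not something this answer establishes, so the sketch deliberately leaves connection handling to that callable.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_all(filenames, upload_one, max_workers=16):
    """Call upload_one(filename) for every filename using a pool of threads.

    Returns a dict mapping each filename to None on success,
    or to the exception that upload_one raised for it.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit everything first so the uploads can overlap
        futures = {name: executor.submit(upload_one, name)
                   for name in filenames}
        for name, future in futures.items():
            # exception() blocks until that upload has finished
            results[name] = future.exception()
    return results
```

Collecting the exception per file, rather than letting the first failure abort the run, makes it easy to retry only the files that did not go through.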
Wolph answered Oct 17 '22 05:10