
Cloud Storage and secure download strategy on App Engine: GCS ACL or Blobstore?

My App Engine app creates Cloud Storage files. The files will be downloaded by a third party and contain personal medical information.

What would be the preferred way of downloading:

  1. Using a direct GCS download link, with a READER ACL granted to the user.
  2. Or using a Blobstore download handler in an App Engine app.

Both solutions require the third party to log in (Google login). Performance is not an issue. Privacy and the risk of security errors and mistakes are.

Downloading an encrypted zip file is also an option. That means I would have to store the password in the project. Or should I e-mail a random password instead?

Update: the App Engine code I used to create a signed download URL:

import time
import urllib
from datetime import datetime, timedelta
from google.appengine.api import app_identity
import os
import base64

API_ACCESS_ENDPOINT = 'https://storage.googleapis.com'

# Use the default bucket in the cloud and not the local SDK one from app_identity
default_bucket = '%s.appspot.com' % os.environ['APPLICATION_ID'].split('~', 1)[1]
google_access_id = app_identity.get_service_account_name()


def sign_url(bucket_object, expires_after_seconds=60):
    """ cloudstorage signed url to download cloudstorage object without login
        Docs : https://cloud.google.com/storage/docs/access-control?hl=bg#Signed-URLs
        API : https://cloud.google.com/storage/docs/reference-methods?hl=bg#getobject
    """

    method = 'GET'
    gcs_filename = '/%s/%s' % (default_bucket, bucket_object)
    content_md5, content_type = None, None

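    # Expiry as a Unix timestamp (seconds since the epoch). time.time() does not
    # depend on the server's timezone, whereas time.mktime() on a UTC datetime
    # is only correct when the local timezone is UTC.
    expiration = int(time.time()) + expires_after_seconds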

    # Generate the string to sign.
    signature_string = '\n'.join([
        method,
        content_md5 or '',
        content_type or '',
        str(expiration),
        gcs_filename])

    _, signature_bytes = app_identity.sign_blob(signature_string)
    signature = base64.b64encode(signature_bytes)

    # Set the right query parameters.
    query_params = {'GoogleAccessId': google_access_id,
                    'Expires': str(expiration),
                    'Signature': signature}

    # Return the download URL.
    return '{endpoint}{resource}?{querystring}'.format(endpoint=API_ACCESS_ENDPOINT,
                                                       resource=gcs_filename,
                                                       querystring=urllib.urlencode(query_params))
asked Oct 20 '22 13:10 by voscausa


1 Answer

If a small number of users have access to all the files in the bucket, then solution #1 would be sufficient, as managing the ACL would not be too much of a pain.

However, if you have many different users who each require different access to the different files in the bucket, then solution #1 is impractical.

I'd avoid solution #2 as well, as every download would pass through your app and you'd be paying for unnecessary incoming/outgoing GAE bandwidth.

A third solution to consider would be to let App Engine handle authentication, and to write your own logic to determine which users have access to which files. Then, when a file is requested for download, you create a Signed URL so the data is downloaded directly from GCS. You can set the expiration parameter to a value that works for you; the URL stops working once that time has passed.
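A rough sketch of that third approach, in case it helps. It is not tested on App Engine: the `ACCESS` table, `user_may_download`, and `signed_download_url` are hypothetical names made up for illustration, and `sign_blob` stands in for whatever signs bytes with your service account (for example, a thin wrapper around `app_identity.sign_blob`). The string-to-sign layout follows the code in the question, and the sketch assumes object names that need no URL escaping.

```python
import base64
import time
from urllib.parse import urlencode

API_ACCESS_ENDPOINT = 'https://storage.googleapis.com'

# Hypothetical access table: user email -> object names that user may download.
# In a real app this would probably live in the datastore.
ACCESS = {
    'doctor@example.com': {'reports/patient-1.pdf'},
}


def user_may_download(user_email, object_name):
    """Return True if the logged-in user is allowed to fetch this object."""
    return object_name in ACCESS.get(user_email, set())


def signed_download_url(user_email, bucket, object_name, access_id, sign_blob,
                        expires_after_seconds=60, now=None):
    """Return a short-lived signed GET URL, or None if access is denied.

    sign_blob: callable taking bytes and returning the signature bytes
    (an assumption; supply your own signer here).
    """
    # Authorization is decided by the app, not by a GCS ACL.
    if not user_may_download(user_email, object_name):
        return None

    # Expiry as a Unix timestamp; `now` can be injected to make testing easy.
    current = time.time() if now is None else now
    expiration = int(current) + expires_after_seconds

    # Same layout as in the question: method, MD5, content type, expiry, resource.
    resource = '/%s/%s' % (bucket, object_name)
    string_to_sign = '\n'.join(['GET', '', '', str(expiration), resource])

    signature = base64.b64encode(sign_blob(string_to_sign.encode('utf-8')))
    query = urlencode({
        'GoogleAccessId': access_id,
        'Expires': str(expiration),
        'Signature': signature.decode('ascii'),
    })
    return API_ACCESS_ENDPOINT + resource + '?' + query
```

The request handler would call `signed_download_url` only after the Google login has succeeded, then redirect the user to the returned URL (or respond with a 403 when it returns `None`). Keeping the expiry short limits the damage if a URL is forwarded or leaks.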

answered Oct 22 '22 23:10 by Gwyn Howell