I have been assigned a task: come up with a way to set up a Cloud Function in GCP that does the following:
Monitors a Google Cloud Storage bucket for new files
Triggers when it detects a new file in the bucket
Copies that file to a directory inside a Compute Engine instance (Ubuntu)
I've been doing some research and am coming up empty. I know I could easily set up a cron job that syncs the bucket and directory every minute or so, but one of the design philosophies of the system we are building is to operate off triggers rather than timers.
Is what I am asking possible?
You can trigger a Cloud Function from a Google Cloud Storage bucket: by selecting the Event Type to be Finalize/Create, the Cloud Function will be called each time a file is uploaded to the bucket.
Each time a new object is created in the bucket, the Cloud Function receives a notification in the Cloud Storage object format.
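For reference, the payload carries the standard Cloud Storage object metadata; the fields used below look roughly like this (values are illustrative):

{
  "bucket": "my-bucket",
  "name": "path/to/new-file.txt",
  "contentType": "text/plain",
  "size": "1024",
  "timeCreated": "2019-01-01T12:00:00.000Z"
}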
Now, onto the second step: I could not find any API that can upload files from Cloud Storage to a VM instance. However, the following works as a workaround, assuming that your VM instance has a server configured that can receive HTTP requests (for example, Apache or Nginx):
main.py

import requests
from google.cloud import storage


def hello_gcs(data, context):
    """Background Cloud Function to be triggered by Cloud Storage.
    Args:
        data (dict): The Cloud Functions event payload.
        context (google.cloud.functions.Context): Metadata of the triggering event.
    Returns:
        None; the file contents are sent as a request to the instance's server.
    """
    print('Bucket: {}'.format(data['bucket']))
    print('File: {}'.format(data['name']))
    client = storage.Client()
    bucket = client.get_bucket(data['bucket'])
    blob = bucket.get_blob(data['name'])
    contents = blob.download_as_string()
    # Send the contents as a form field, so the Flask endpoint below
    # can read it from request.form['data'].
    payload = {'data': contents}
    response = requests.post('https://your-instance-server/endpoint-to-download-files', data=payload)
    return "Request sent to your instance with the data of the object"
requirements.txt
google-cloud-storage
requests
Most likely, it would be better to just send the object name and the bucket name to your server endpoint, and download the file from there using the Cloud Client Library, as sketched below.
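A minimal sketch of that variant, assuming a /download endpoint and the /tmp target directory (both are placeholders; adjust to your setup):

# Cloud Function side: send only the object coordinates as JSON.
payload = {'bucket': data['bucket'], 'name': data['name']}
response = requests.post('http://<INTERNAL_INSTANCE_IP>/download', json=payload)

# Instance side: a Flask endpoint that downloads the object itself.
import os
from flask import Flask, request
from google.cloud import storage

app = Flask(__name__)

@app.route('/download', methods=['POST'])
def download_file():
    info = request.get_json()
    client = storage.Client()
    blob = client.get_bucket(info['bucket']).get_blob(info['name'])
    destination = os.path.join('/tmp', os.path.basename(info['name']))
    blob.download_to_filename(destination)
    return 'File saved to {}'.format(destination)

Note that for this to work, the instance's service account needs read access to the bucket.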
Now you may ask...
How do you make a Compute Engine instance handle the request?
Create a Compute Engine VM instance. Make sure it's in the same region as the Cloud Function, and when creating it, allow HTTP connections to it (Documentation). I used a debian-9 image for this test.
SSH into the instance, and run the following commands:
Install the Apache server:
sudo apt-get update
sudo apt-get install apache2
sudo apt-get install libapache2-mod-wsgi
Install these Python libraries as well:
sudo apt-get install python-pip
sudo pip install flask
Set up the environment for your application:
cd ~/
mkdir app
sudo ln -sT ~/app /var/www/html/app
The last line should point to the folder path where Apache serves the index.html file from.
In /home/<user_name>/app, create the following files:
main.py
from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['POST'])
def receive_file():
    file_content = request.form['data']
    # TODO
    # Implement process to save this data onto a file
    return 'Hello from Flask!'

if __name__ == '__main__':
    app.run()
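One way to fill in that TODO is to write the posted data straight to disk; a minimal sketch (the path /tmp/received.txt is an assumption, and the Apache user must be able to write there):

# Inside receive_file(), instead of the TODO:
with open('/tmp/received.txt', 'w') as f:
    f.write(file_content)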
main.wsgi
import sys
sys.path.insert(0, '/var/www/html/app')
from main import app as application
Add the following lines to /etc/apache2/sites-enabled/000-default.conf, after the DocumentRoot tag:
WSGIDaemonProcess flaskapp threads=5
WSGIScriptAlias / /var/www/html/app/main.wsgi

<Directory /var/www/html/app>
    WSGIProcessGroup flaskapp
    WSGIApplicationGroup %{GLOBAL}
    Order deny,allow
    Allow from all
</Directory>
Run sudo apachectl restart. You should now be able to send POST requests to your application at the internal IP of the VM instance (you can see it in the Console, in the Compute Engine section). Once you have it, change the request line in your Cloud Function to:
response = requests.post('http://<INTERNAL_INSTANCE_IP>/', data=payload)
return "Request sent to your instance with the data of the object"