I'm following the documentation for Google Cloud Composer to install Python dependencies from PyPI in an environment. I used this command to install the libraries from a requirements file:
$ gcloud composer environments update $ENV_NAME \
--update-pypi-packages-from-file requirements.txt \
--location us-east4
It was just a test and the requirements file lists only 4 libraries, but the command takes more than 20 minutes to finish. So I tried installing a single package from the user interface instead, but it takes almost the same time.
Something doesn't make sense to me: when I execute these commands, the environment enters an "updating" state and takes several minutes to be ready again. Why does Composer take so long to perform a pip install?
Has anyone faced a similar problem? How do you manage the installation of Python dependencies in Composer?
Cloud Composer environments take so long to update because the service deploys Airflow in a distributed setup within Google Kubernetes Engine and App Engine (for the webserver). This means the service has to take care of building or rebuilding container images, redeploying them to your cluster, updating the webserver app, and so on.
While this does mean that installing packages or updating the environment may take some time, it's what makes Composer easy to use: it gives you a one-shot equivalent of pip install even if you have dozens of worker nodes.
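Since each update sends the environment through the whole rebuild and redeploy cycle, one way to limit the waiting is to put every package change into a single requirements file and apply it with one update. The following is only a minimal sketch under assumptions: the environment name my-environment is a placeholder, the DRY_RUN toggle and the update_packages function are hypothetical helpers that are not part of gcloud, and the --async flag only makes gcloud return without blocking; it does not make the update itself any faster.

```shell
#!/bin/sh
# Sketch: print (or run) one batched Composer update for a whole requirements file.
# ENV_NAME, LOCATION and REQ_FILE mirror the command in the question.
# DRY_RUN is a hypothetical toggle for this example, not a gcloud feature.
ENV_NAME="${ENV_NAME:-my-environment}"   # placeholder environment name
LOCATION="${LOCATION:-us-east4}"
REQ_FILE="${REQ_FILE:-requirements.txt}"
DRY_RUN="${DRY_RUN:-1}"                  # 1 = only print the command

update_packages() {
  # One update call covers every package listed in the file, so the
  # environment enters its "updating" state once, not once per package.
  set -- gcloud composer environments update "$ENV_NAME" \
    --update-pypi-packages-from-file "$REQ_FILE" \
    --location "$LOCATION" \
    --async
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"    # show what would be executed
  else
    "$@"         # actually submit the update
  fi
}

update_packages
```

Running the script with DRY_RUN=0 submits the update for real. Because of --async, the command returns right away, but the environment still stays in the updating state until the rebuild has finished.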