
DEADLINE_EXCEEDED when publishing to a Cloud Pub/Sub topic from Compute Engine

I have a Java application running in a Google Compute Engine instance. I am attempting to publish a message to a Cloud Pub/Sub topic using the google-cloud library, and I am getting DEADLINE_EXCEEDED exceptions. The code looks like this:

PubSub pubSub = PubSubOptions.getDefaultInstance().toBuilder()
            .build().getService();

String messageId = pubSub.publish(topic, message);

The result is:

com.google.cloud.pubsub.PubSubException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED

The documentation suggests that this response is typically caused by networking issues. Is there something I need to configure in my Networking section to allow Compute Engine to reach Pub/Sub? The default-allow-internal firewall rule is present.

I have already made my Compute Engine service account an editor and publisher in the Pub/Sub topic's permissions.

The application resides in a Docker container within a Container Engine-managed Compute Engine instance. The Pub/Sub topic and the Compute Engine instance are in the same project. I am able to use the google-cloud library to connect to other Cloud Platform services, such as Datastore. I am also able to publish to the same Pub/Sub topic without fail from App Engine instances in the same project.

Would I have more luck using the google-api-services-pubsub API library instead of google-cloud?

colintemple asked Nov 11 '16 18:11


2 Answers

I have the same problem at the moment. Since I couldn't find an existing report in the google-cloud-java issue tracker on GitHub, I created an issue there.

We switched from the old google-api-services-pubsub libraries (which worked) to the new ones and got the exception. Our Java application is also running on a Compute Engine instance.

codemoped answered Nov 14 '22 22:11


While this can be caused by networking issues (the client cannot connect to the service), the more typical cause is publishing too fast. It is common to call the publish method in a tight loop, which can create thousands to hundreds of thousands of requests within the time it takes a typical request to return. The network stack on a machine will only send so many requests at a time, while the others sit waiting for execution. If your machine is able to send N parallel requests and each request takes 0.1s, in a minute you can send 600N requests. If you publish at a faster rate than that, the additional requests will time out on the client with DEADLINE_EXCEEDED.
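To make the arithmetic above concrete, here is a minimal sketch of that ceiling. The class and method names are made up for illustration, and the figure of 8 parallel requests is an assumption, not a measured value for any machine.

```java
// Back-of-the-envelope ceiling on client-side request throughput.
// Illustrative only: PublishThroughput is a hypothetical helper, not part of any library.
final class PublishThroughput {
    private PublishThroughput() {}

    // Upper bound on requests completed per minute when at most
    // parallelRequests are in flight and each one takes latencyMillis.
    static long maxRequestsPerMinute(int parallelRequests, long latencyMillis) {
        if (parallelRequests <= 0 || latencyMillis <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        return parallelRequests * 60_000L / latencyMillis;
    }

    public static void main(String[] args) {
        int n = 8; // assumed number of parallel requests
        // 0.1s per request gives 600 requests per minute per slot, so 600 * n in total.
        System.out.println(maxRequestsPerMinute(n, 100)); // prints 4800
    }
}
```

Anything your loop submits beyond that rate just waits in a client-side queue until its deadline passes.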

You can confirm this by looking at server-side metrics in Cloud Monitoring: the timed-out requests never show up there, so you will only see successful requests. The rate of those successful requests tells you the throughput capacity of your machines.

The solution to this is publisher flow control, which effectively limits how fast you are calling the publish method. You can do this in most client libraries through simple configuration. Please refer to the documentation of the publisher API for your client library for details. E.g. in Java, this is a property called FlowControlSettings of the Publisher BatchingSettings. In Python, this is set directly in the PublisherOptions.
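If you want to see the idea without any library involved, the following is a rough sketch that uses only the JDK. It is not the client library's flow control implementation: BoundedPublisher and the sender function are hypothetical stand-ins for whatever asynchronous publish call you use, and the limit of 4 outstanding requests is an arbitrary assumption. It caps the number of in-flight publishes with a semaphore, so the calling loop blocks instead of piling up requests that would later time out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Hypothetical wrapper that bounds the number of outstanding publish calls.
final class BoundedPublisher {
    private final Semaphore permits;
    // Stand-in for a real asynchronous publish call: message in, future message ID out.
    private final Function<String, CompletableFuture<String>> sender;

    BoundedPublisher(int maxOutstanding, Function<String, CompletableFuture<String>> sender) {
        this.permits = new Semaphore(maxOutstanding);
        this.sender = sender;
    }

    CompletableFuture<String> publish(String message) {
        // Blocks the caller once maxOutstanding requests are in flight.
        permits.acquireUninterruptibly();
        CompletableFuture<String> result;
        try {
            result = sender.apply(message);
        } catch (RuntimeException e) {
            permits.release(); // nothing was sent, so give the slot back
            throw e;
        }
        // Free the slot when the request finishes, whether it succeeded or failed.
        return result.whenComplete((id, error) -> permits.release());
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Fake sender that completes on a worker thread; a real one would call the service.
        BoundedPublisher publisher = new BoundedPublisher(
                4, msg -> CompletableFuture.supplyAsync(() -> "id-" + msg, pool));

        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            futures.add(publisher.publish("m" + i)); // tight loop, but never more than 4 in flight
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        pool.shutdown();
        System.out.println("published " + futures.size() + " messages");
    }
}
```

The blocking variant is the simplest choice for a batch job. If the caller must not block, the same structure works with tryAcquire and an explicit error when the limit is hit.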

Kir Titievsky answered Nov 14 '22 23:11