
How to set up error reporting in Stackdriver from Kubernetes pods?

I'm a bit confused about how to set up error reporting in Kubernetes so that errors are visible in Google Cloud Console / Stackdriver "Error Reporting".

According to the documentation at https://cloud.google.com/error-reporting/docs/setting-up-on-compute-engine, we need to enable fluentd's "forward input plugin" and then send exception data from our apps. I think this approach would have worked if we had set up fluentd ourselves, but it is already pre-installed on every node, in a pod that just runs the gcr.io/google_containers/fluentd-gcp Docker image.

How do we enable forward input on those pods and make sure that the HTTP port is available to every pod on the nodes? We also need to make sure this config is used by default when we add more nodes to our cluster.

Any help would be appreciated. Maybe I'm looking at all this from the wrong angle?

s3ncha asked Apr 02 '16 22:04


1 Answer

The basic idea is to start a separate pod that receives structured logs over TCP and forwards them to Cloud Logging, similar to a locally running fluentd agent. See below for the steps I used.

(Unfortunately, the logging support built into Docker and Kubernetes cannot be used: it forwards each line of text from stdout/stderr as a separate log entry, which prevents Error Reporting from seeing complete stack traces.)
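To make the difference concrete, here is a minimal Python sketch (not part of the original steps) that contrasts a stack trace split into lines, as stdout/stderr collection would deliver it, with the same trace kept in one structured record. The helper name error_record and the service name "myapp" are hypothetical; the "message" and "serviceContext" fields are assumed from the Error Reporting documentation linked in the question.

```python
import traceback


def error_record(service):
    """Build one structured record that holds the whole current stack trace.

    Must be called while an exception is being handled.
    "service" is a hypothetical service name chosen by the caller.
    """
    return {
        "message": traceback.format_exc(),  # full multi-line trace in one field
        "serviceContext": {"service": service},
    }


try:
    1 / 0
except ZeroDivisionError:
    record = error_record("myapp")

# What line-based stdout/stderr collection would produce instead:
# one separate log entry per line of the trace.
stdout_entries = record["message"].splitlines()
```

Here record is a single entry carrying the complete trace, while stdout_entries shows how the same text would be broken into several unrelated entries.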

Create a Docker image for a fluentd forwarder using a Dockerfile as follows:

FROM gcr.io/google_containers/fluentd-gcp:1.18

COPY fluentd-forwarder.conf /etc/google-fluentd/google-fluentd.conf

Where fluentd-forwarder.conf contains the following:

<source>
  type forward
  port 24224
</source>

<match **>
  type google_cloud
  buffer_chunk_limit 2M
  buffer_queue_limit 24
  flush_interval 5s
  max_retry_wait 30
  disable_retry_limit
</match>

Then build and push the image:

$ docker build -t gcr.io/###your project id###/fluentd-forwarder:v1 .
$ gcloud docker push gcr.io/###your project id###/fluentd-forwarder:v1

You need a replication controller (fluentd-forwarder-controller.yaml):

apiVersion: v1
kind: ReplicationController
metadata:
  name: fluentd-forwarder
spec:
  replicas: 1
  template:
    metadata:
      name: fluentd-forwarder
      labels:
        app: fluentd-forwarder
    spec:
      containers:
      - name: fluentd-forwarder
        image: gcr.io/###your project id###/fluentd-forwarder:v1
        env:
        - name: FLUENTD_ARGS
          value: -qq
        ports:
        - containerPort: 24224

You also need a service (fluentd-forwarder-service.yaml):

apiVersion: v1
kind: Service
metadata:
  name: fluentd-forwarder
spec:
  selector:
    app: fluentd-forwarder
  ports:
  - protocol: TCP
    port: 24224

Then create the replication controller and service:

$ kubectl create -f fluentd-forwarder-controller.yaml
$ kubectl create -f fluentd-forwarder-service.yaml

Finally, in your application, instead of using 'localhost' and 24224 to connect to the fluentd agent as described at https://cloud.google.com/error-reporting/docs/setting-up-on-compute-engine, use the values of the environment variables FLUENTD_FORWARDER_SERVICE_HOST and FLUENTD_FORWARDER_SERVICE_PORT.
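As a rough illustration of that last step, the Python sketch below reads the two environment variables and falls back to localhost:24224 when they are absent (for example, when running outside the cluster). The function names forwarder_address and make_sender and the tag "myapp" are hypothetical. make_sender assumes the third-party fluent-logger package is installed; it is imported lazily so the address lookup works without it.

```python
import os


def forwarder_address(environ=os.environ):
    """Return (host, port) for the fluentd-forwarder service.

    Kubernetes injects these variables for a service named
    "fluentd-forwarder"; the localhost defaults are an assumption
    for running outside the cluster.
    """
    host = environ.get("FLUENTD_FORWARDER_SERVICE_HOST", "localhost")
    port = int(environ.get("FLUENTD_FORWARDER_SERVICE_PORT", "24224"))
    return host, port


def make_sender(tag="myapp"):
    """Create a fluent-logger sender pointed at the forwarder service.

    Requires the third-party package: pip install fluent-logger
    """
    from fluent import sender  # imported lazily, only needed when sending

    host, port = forwarder_address()
    return sender.FluentSender(tag, host=host, port=port)
```

Note that Kubernetes only sets these variables in pods started after the service exists, so the service likely needs to be created before the application pods.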

Boris Bokowski answered Sep 28 '22 04:09