GCP stackdriver-agent installed on a VM sends strange logs every minute

Tags:

google-cloud-monitoring

Could you please help me with the following issue?

I have a Node.js backend service deployed on a GCE VM. It works fine, but after installing the logging and monitoring agents I see very strange logs in the Logs Viewer. I looked up the PID that generates these logs: it is stackdriver-agent.

Here they are:

A 2020-05-15T22:45:26Z write_gcm: can not take infinite value
A 2020-05-15T22:45:26Z write_gcm: wg_typed_value_create_from_value_t_inline failed for swap/percent/value! Continuing. 
A 2020-05-15T22:45:26Z write_gcm: can not take infinite value 
A 2020-05-15T22:45:26Z write_gcm: wg_typed_value_create_from_value_t_inline failed for swap/percent/value! Continuing. 
A 2020-05-15T22:45:26Z write_gcm: can not take infinite value 
A 2020-05-15T22:45:26Z write_gcm: wg_typed_value_create_from_value_t_inline failed for swap/percent/value! Continuing. 
A 2020-05-15T22:45:28Z write_gcm: Server response (CollectdTimeseriesRequest) contains errors:#012{#012  "payloadErrors": [#012    {#012      "error": {#012        "code": 3,#012        "message": "Unsupported collectd plugin/type combination: plugin: \"processes\" type: \"io_octets\""#012      }#012    },#012    {#012      "index": 5,#012      "error": {#012        "code": 3,#012        "message": "Unsupported collectd plugin/type combination: plugin: \"processes\" type: \"io_octets\""#012      }#012    },#012    {#012      "index": 10,#012      "error": {#012        "code": 3,#012        "message": "Unsupported collectd plugin/type combination: plugin: \"processes\" type: \"io_octets\""#012      }#012    },#012    {#012      "index": 15,#012      "error": {#012        "code": 3,#012        "message": "Unsupported collectd plugin/type combination: plugin: \"processes\" type: \"io_octets\""#012      }#012    },#012    {#012      "index": 20,#012      "error": {#012        "code": 3,#012        "message": "Unsupported collectd plugin/type combination: plugin: \"processes\" type: \"io_octets\""#012      }#012    },#012    {#012      "index": 25 
A 2020-05-15T22:45:29Z write_gcm: Server response (CollectdTimeseriesRequest) contains errors:#012{#012  "payloadErrors": [#012    {#012      "error": {#012        "code": 3,#012        "message": "Unsupported collectd plugin/type combination: plugin: \"processes\" type: \"io_octets\""#012      }#012    }#012  ]#012} 
A 2020-05-15T22:45:29Z write_gcm: Unsuccessful HTTP request 400: {#012  "error": {#012    "code": 400,#012    "message": "Field timeSeries[3].points[0].interval.start_time had an invalid value of \"2020-05-15T15:45:27.348251-07:00\": The start time must be before the end time (2020-05-15T15:45:27.348251-07:00) for the non-gauge metric 'agent.googleapis.com/agent/api_request_count'.",#012    "status": "INVALID_ARGUMENT"#012  }#012} 
A 2020-05-15T22:45:29Z write_gcm: Error talking to the endpoint. 
A 2020-05-15T22:45:29Z write_gcm: wg_transmit_unique_segment failed. 
A 2020-05-15T22:45:29Z write_gcm: wg_transmit_unique_segments failed. Flushing.

These logs appear every minute. When I stop the stackdriver-agent service, they disappear. I have 4 VMs in my project, and the issue appears on only two of them: a CentOS 7 VM and an Ubuntu 18 VM.

asked May 15 '20 22:05

ESCAPE ROOM DOCTOR

1 Answer

So far there are two entries in Google's public issue tracker (PIT):

https://issuetracker.google.com/issues/160340568
https://issuetracker.google.com/issues/161054680

The latter has a Google engineer's explanation of the 400 error:

These messages are annoying but harmless. You are not losing any metrics. You can safely ignore these logs.

The root cause is a server-side config change and affects all agents. That change only affected the verbosity of the responses, not the processing of the requests. Some of the incoming metrics were silently dropped before that change, and are now dropped noisily.

The metrics are sent by default by the upstream collectd plugin, and there are no controls for us to completely prevent those metrics from being sent. The log spam messages result from collectd's internal processing of those metrics.

If you'd like to filter out all the noisy logs you're seeing, you can create a Log Exclusion[1][2] or Log Sink[3][4]. A Log Exclusion matches logs against a specified filter and drops them before they reach the Logs Viewer, while a Log Sink takes logs and directs them to a Cloud Storage bucket, BigQuery table, or Pub/Sub topic.

[1] https://cloud.google.com/logging/docs/exclusions#overview

[2] https://cloud.google.com/logging/docs/exclusions#create-filter

[3] https://cloud.google.com/logging/docs/export

[4] https://cloud.google.com/logging/docs/export/configure_export_v2
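As a rough illustration of what such an exclusion filter could look like, here is a minimal Python sketch that assembles a Logging query string from substrings of the messages quoted in the question. It assumes the agent's messages arrive in the `textPayload` field of entries with `resource.type="gce_instance"`; check an actual entry in your project before relying on either. The helper names are hypothetical, and the local matcher is only a crude stand-in for how Cloud Logging evaluates the filter.

```python
# Substrings copied from the log lines quoted in the question.
NOISY_SUBSTRINGS = [
    "write_gcm: can not take infinite value",
    "wg_typed_value_create_from_value_t_inline failed",
    "Unsupported collectd plugin/type combination",
]


def _escape(text):
    """Escape backslashes and double quotes for a quoted filter string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_exclusion_filter(substrings, resource_type="gce_instance"):
    """Build a filter that matches entries containing any of the substrings.

    Assumption: the messages are stored in textPayload and the resource
    type is gce_instance; adjust both to match your real log entries.
    """
    clauses = " OR ".join(f'textPayload:"{_escape(s)}"' for s in substrings)
    return f'resource.type="{resource_type}" AND ({clauses})'


def matches_locally(line, substrings):
    """Crude local check: does the line contain any noisy substring?"""
    return any(s in line for s in substrings)


if __name__ == "__main__":
    # Paste the printed string into the exclusion filter field.
    print(build_exclusion_filter(NOISY_SUBSTRINGS))
```

Running the script prints a filter string that can be pasted into the exclusion dialog described in [2]. Previewing it in the Logs Viewer first helps confirm that it matches only the noise.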

Regarding swap there is a blog post:

https://myshittycode.com/2020/06/13/gcp-stackdriver-agent-write_gcm-can-not-take-infinite-value-error/

This error occurs because the VM instance does not have swap memory, hence this metric plugin tries to divide by 0.

To fix this, remove the swap plugin configuration from the agent's collectd config and restart stackdriver-agent.

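To check whether a VM is in this situation, a small Python sketch like the one below can read `SwapTotal` from `/proc/meminfo` on Linux. The percentage helper only illustrates why a zero total is a problem; it is not collectd's actual computation, and the function names are hypothetical.

```python
def swap_total_kb(meminfo_text):
    """Return the SwapTotal value (in kB) from /proc/meminfo text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            return int(line.split()[1])
    return None


def swap_used_percent(total_kb, free_kb):
    """Return used swap as a percentage, or None when there is no swap.

    With total_kb == 0 the ratio has a zero denominator, which is the
    situation the blog post describes; we return None instead of dividing.
    """
    if not total_kb:
        return None
    return 100.0 * (total_kb - free_kb) / total_kb


if __name__ == "__main__":
    # Linux only: /proc/meminfo does not exist on other systems.
    with open("/proc/meminfo") as fh:
        total = swap_total_kb(fh.read())
    if total:
        print(f"swap configured: {total} kB")
    else:
        print("no swap configured; a swap percentage metric is undefined")
```

If the script reports no swap, the `swap/percent/value` messages in the question are consistent with the explanation above.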
answered Sep 27 '22 15:09

gavenkoa
