Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Debugging istio rate limiting handler

I'm trying to apply rate limiting on some of our internal services (inside the mesh).

I used the example from the docs and generated redis rate limiting configurations that include a (redis) handler, quota instance, quota spec, quota spec binding and rule to apply the handler.

This redis handler:

apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  name: redishandler
  namespace: istio-system
spec:
  compiledAdapter: redisquota
  params:
    redisServerUrl: <REDIS>:6379
    connectionPoolSize: 10
    quotas:
    - name: requestcountquota.instance.istio-system
      maxAmount: 10
      validDuration: 100s
      rateLimitAlgorithm: FIXED_WINDOW
      overrides:
      - dimensions:
          destination: s1
        maxAmount: 1
      - dimensions:
          destination: s3
        maxAmount: 1
      - dimensions:
          destination: s2
        maxAmount: 1

The quota instance (I'm only interested in limiting by destination at the moment):

apiVersion: config.istio.io/v1alpha2
kind: instance
metadata:
  name: requestcountquota
  namespace: istio-system
spec:
  compiledTemplate: quota
  params:
    dimensions:
      destination: destination.labels["app"] | destination.service.host | "unknown"

A quota spec, charging 1 per request if I understand correctly:

apiVersion: config.istio.io/v1alpha2
kind: QuotaSpec
metadata:
  name: request-count
  namespace: istio-system
spec:
  rules:
  - quotas:
    - charge: 1
      quota: requestcountquota

A quota binding spec that all participating services pre-fetch. I also tried with service: "*" which also did nothing.

apiVersion: config.istio.io/v1alpha2
kind: QuotaSpecBinding
metadata:
  name: request-count
  namespace: istio-system
spec:
  quotaSpecs:
  - name: request-count
    namespace: istio-system
  services:
  - name: s2
    namespace: default
  - name: s3
    namespace: default
  - name: s1
    namespace: default
    # - service: '*'  # Uncomment this to bind *all* services to request-count

A rule to apply the handler. Currently on all occasions (tried with matches but didn't change anything as well):

apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: quota
  namespace: istio-system
spec:
  actions:
  - handler: redishandler
    instances:
    - requestcountquota

The VirtualService definitions are pretty similar for all participants:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: s1
spec:
  hosts:
  - s1

  http:
  - route:
    - destination:
        host: s1

The problem is nothing really happens and no rate limiting takes place. I tested with curl from pods inside the mesh. The redis instance is empty (no keys on db 0, which I assume is what the rate limiting would use) so I know it can't practically rate-limit anything.

The handler seems to be configured properly (how can I make sure?) because I had some errors in it which were reported in mixer (policy). There are still some errors but none which I associate to this problem or the configuration. The only line in which redis handler is mentioned is this:

2019-12-17T13:44:22.958041Z info    adapters    adapter closed all scheduled daemons and workers    {"adapter": "redishandler.istio-system"}   

But its unclear if its a problem or not. I assume its not.

These are the rest of the lines from the reload once I deploy:

2019-12-17T13:44:22.601644Z info    Built new config.Snapshot: id='43'
2019-12-17T13:44:22.601866Z info    adapters    getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.601881Z warn    Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2019-12-17T13:44:22.602718Z info    adapters    Waiting for kubernetes cache sync...    {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903844Z info    adapters    Cache sync successful.  {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903878Z info    adapters    getting kubeconfig from: "" {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.903882Z warn    Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2019-12-17T13:44:22.904808Z info    Setting up event handlers
2019-12-17T13:44:22.904939Z info    Starting Secrets controller
2019-12-17T13:44:22.904991Z info    Waiting for informer caches to sync
2019-12-17T13:44:22.957893Z info    Cleaning up handler table, with config ID:42
2019-12-17T13:44:22.957924Z info    adapters    deleted remote controller   {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.957999Z info    adapters    adapter closed all scheduled daemons and workers    {"adapter": "prometheus.istio-system"}
2019-12-17T13:44:22.958041Z info    adapters    adapter closed all scheduled daemons and workers    {"adapter": "redishandler.istio-system"}   
2019-12-17T13:44:22.958065Z info    adapters    shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958050Z info    adapters    shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958096Z info    adapters    shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:22.958182Z info    adapters    shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:44:23.958109Z info    adapters    adapter closed all scheduled daemons and workers    {"adapter": "kubernetesenv.istio-system"}
2019-12-17T13:55:21.042131Z info    transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2019-12-17T14:14:00.265722Z info    transport: loopyWriter.run returning. connection error: desc = "transport is closing"

I'm using the demo profile with disablePolicyChecks: false to enable rate limiting. This is on istio 1.4.0, deployed on EKS.

I also tried memquota (this is our staging environment) with low limits and nothing seems to work. I never got a 429 no matter how much I went over the rate limit configured.

I don't know how to debug this and see where the configuration is wrong causing it to do nothing.

Any help is appreciated.

like image 818
Reut Sharabani Avatar asked Dec 17 '19 16:12

Reut Sharabani


1 Answers

I too spent hours trying to decipher the documentation and get a sample working.

According to the documentation, they recommended that we enable policy checks:

https://istio.io/docs/tasks/policy-enforcement/rate-limiting/

However when that did not work, I did an "istioctl profile dump", searched for policy, and tried several settings.

I used Helm install and passed the following and then was able to get the described behaviour:

--set global.disablePolicyChecks=false \ --set values.pilot.policy.enabled=true \ ===> this made it work, but it's not in the docs.

like image 52
Sateesh Valluru Avatar answered Oct 27 '22 01:10

Sateesh Valluru