Twemproxy Lag Forces a Restart

Our app servers run a PHP stack and use a local twemproxy instance (over a Unix socket) to connect to multiple upstream memcached servers (EC2 small instances) that make up our caching layer.

Every so often I get an alert from our app monitor that a page load took more than 5 seconds. When this occurs, the immediate fix is to restart the twemproxy service on each app server, which is a hassle.

The only workaround I have now is a crontab entry that restarts the service every minute, but as you can imagine, nothing gets written to the cache for a few seconds each minute, so this is not an acceptable permanent solution.

Has anyone encountered this before? If so, what was the fix? I tried switching to AWS ElastiCache, but it didn't perform as well as our current twemproxy setup.

Here is my twemproxy config.

default:
  auto_eject_hosts: true
  distribution: ketama
  hash: fnv1a_64
  listen: /var/run/nutcracker/nutcracker.sock 0666
  server_failure_limit: 1
  server_retry_timeout: 600000 # 600sec, 10m
  timeout: 100
  servers:
    - vcache-1:11211:1
    - vcache-2:11211:1

And here is the connection config for the php layer:

# Note: We are using HA / twemproxy (nutcracker) / memcached proxy
# So this isn't a default memcache(d) port
# Each webapp will host the cache proxy, which allows us to connect via socket
#   which should be faster, as no tcp overhead
# Hash has been manually overridden from the default (jenkins) to FNV1A_64, which directly aligns with the proxy
port: 0
<?php echo Hobis_Api_Cache::TYPE_VOLATILE; ?>:
  options:
    - <?php echo Memcached::OPT_HASH; ?>: <?php echo Memcached::HASH_FNV1A_64; ?><?php echo PHP_EOL; ?>
    - <?php echo Memcached::OPT_SERIALIZER; ?>: <?php echo Memcached::SERIALIZER_IGBINARY; ?><?php echo PHP_EOL; ?>
  servers:
    - /var/run/nutcracker/nutcracker.sock

We are running twemproxy 0.4.1 and memcached 1.4.25.

Thanks.

Mike Purcell asked Feb 01 '17 21:02



1 Answer

The number of open or stale socket connections may be the issue.
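One way to check this before restarting anything is to read twemproxy's own counters. The sketch below is only an illustration, not a confirmed diagnosis. It assumes the stats listener is enabled on its default TCP port (22222) on the app server, that the pool is named `default` as in the question's config, and that the counter names (`client_connections`, `server_ejects`, `server_connections`, `server_timedout`, `server_err`) match the ones in the twemproxy README. The function names `fetch_stats` and `summarize_pool` are made up for this example.

```python
import json
import socket


def fetch_stats(host="127.0.0.1", port=22222, timeout=2.0):
    """Read the JSON document the stats port sends before closing the connection.

    Assumption: twemproxy stats are reachable on host:port (22222 is the default).
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:  # peer closed the connection: the full document was sent
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))


def summarize_pool(stats, pool="default"):
    """Extract connection and failure counters for one pool from the stats dict."""
    pool_stats = stats[pool]
    summary = {
        "client_connections": pool_stats.get("client_connections"),
        "server_ejects": pool_stats.get("server_ejects"),
        "servers": {},
    }
    for name, value in pool_stats.items():
        # Per-server entries are nested objects; pool-level counters are plain numbers.
        if isinstance(value, dict):
            summary["servers"][name] = {
                "server_connections": value.get("server_connections"),
                "server_timedout": value.get("server_timedout"),
                "server_err": value.get("server_err"),
            }
    return summary
```

Calling `summarize_pool(fetch_stats())` on an app server while page loads are slow, and again after a restart, would show whether `client_connections` keeps climbing or whether `server_timedout` and `server_ejects` jump around the time of the alerts.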

Kiran answered Oct 31 '22 13:10
