
Lost connection to MySQL server during query on random simple queries

FINAL UPDATE: Forking was the cause of the problem. We fixed it by finding a way to accomplish our goals without forking.

---Original Post---

I'm running a Ruby on Rails stack. Our MySQL server is separate from our app servers but housed at the same site. (We tried swapping it out for a different MySQL server with double the specs, but saw no improvement.)

During business hours we get a handful of these errors, from no particular query:

ActiveRecord::StatementInvalid: Mysql2::Error: Lost connection to MySQL server during query

Most of the queries that fail are really simple, and there seems to be no pattern between one query and another. This all started when I upgraded from Rails 4.1 to 4.2.

I'm at a loss as to what to try. Our database server stays below 5% CPU throughout the day. I get bug reports from users whose random interactions fail because of this, so these are not queries that have been running for hours or anything like that. When they retry the exact same thing, it works.

Our servers are configured by cloud66.

So in short: our MySQL server is going away for some reason, but not for lack of resources. It's also a brand new server, as we migrated from another server when this problem started.

This also sometimes happens to me on localhost while developing features, so I don't believe it's a load issue.

We're running the following:

  • ruby 2.2.5
  • rails 4.2.6
  • mysql2 0.4.8

UPDATE: per the first answer below I increased our max_connections variable to 500 last night, and confirmed the increase via show global variables like 'max_connections';

I'm still getting dropped connections; the first one today was dropped only a few minutes ago: ActiveRecord::StatementInvalid: Mysql2::Error: Lost connection to MySQL server during query

I ran select * from information_schema.processlist; and I got 36 rows back. Does this mean my app servers were running 36 connections at that moment? or can a process be multiple connections?

UPDATE: I just set net_read_timeout = 60 (it was 30 before). I'll see if that helps.

UPDATE: It didn't help, I'm still looking for a solution...

Here's my database.yml with credentials removed.

production:
  adapter: mysql2
  encoding: utf8
  host: localhost
  database:
  username: 
  password: 
  port: 3306
  reconnect: true
denodster asked Jul 18 '17 18:07



2 Answers

The connection to MySQL can be disrupted in a number of ways, but I would recommend revisiting Mario Carrion's answer, since it's a very wise one.

It seems likely that the connection is disrupted because it's being shared with other processes, causing communication protocol errors...

...this could easily happen if the connection pool is process-bound, which I believe it is in ActiveRecord, meaning that the same connection could be "checked out" a number of times simultaneously in different processes.

The solution is that database connections must be established only AFTER the fork statement in the application server.

I'm not sure which server you're using, but if you're using a warmup feature - don't.

If you're running any database calls before the first network request - don't.

Either of these actions could initialize the connection pool before forking occurs, so the MySQL connections end up shared between processes while the locking that guards the pool is not.
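
To illustrate the "connect only after fork" idea, here is a minimal pure-Ruby sketch. It is not ActiveRecord's or any app server's actual mechanism; PerProcessConnection and FakeConnection are hypothetical names made up for this example, and the fake connection stands in for a real client such as Mysql2::Client. The wrapper remembers which process opened the connection and opens a fresh one when it is used from a different process:

```ruby
# Hypothetical wrapper (not part of Rails or mysql2): hands out a connection
# that was opened by the *current* process, never one inherited through fork.
class PerProcessConnection
  def initialize(&connector)
    @connector  = connector # block that opens a brand new connection
    @owner_pid  = nil       # pid of the process that opened @connection
    @connection = nil
  end

  def connection
    if @owner_pid != Process.pid
      # First use in this process (or first use after a fork): open a new
      # connection instead of reusing the socket the parent still owns.
      @connection = @connector.call
      @owner_pid  = Process.pid
    end
    @connection
  end
end

# Stand-in for a real database client, so the sketch runs without MySQL.
FakeConnection = Struct.new(:opened_by_pid)

db = PerProcessConnection.new { FakeConnection.new(Process.pid) }
puts "parent #{Process.pid} uses a connection opened by #{db.connection.opened_by_pid}"

child = fork do
  # The child inherited the parent's object, but gets its own connection.
  puts "child #{Process.pid} uses a connection opened by #{db.connection.opened_by_pid}"
end
Process.wait(child)
```

In a real application the same effect is usually achieved through whatever after-fork hook your application server provides, which is why it matters which server you're running.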

I'm not saying this is the only possible reason for the issue. As stated by @sloth-jr, there are other options... but most of them seem less likely given your description.

Sidenote:

I ran select * from information_schema.processlist; and I got 36 rows back. Does this mean my app servers were running 36 connections at that moment? or can a process be multiple connections?

Each process could hold a number of connections. In your case, you might have up to 500X36 connections. (see edit)

In general, the number of connections in the pool can often be the same as the number of threads in each process (it shouldn't be less than the number of threads, or contention will slow you down). Sometimes it's good to add a few more, depending on your application.

EDIT:

I apologize for ignoring the fact that the process count was referencing the MySQL data and not the application data.

The process count you showed comes from the MySQL server, which seems to use a thread-per-connection I/O scheme. The "process" list actually counts active connections and not actual processes or threads (although it should translate to the number of threads as well).

This means that out of the 500 connections the MySQL server allows (max_connections is a server-wide limit shared by all clients, not a per-process allowance), your application had only opened 36 at that moment.

Myst answered Sep 28 '22 02:09



This indicates a timeout error. It's usually a general resource or connection error.

I would check the max_connections setting from the MySQL console:

show global variables like 'max_connections';

And ensure the number of pooled connections configured in your Rails database.yml is less than that:

pool: 10

Note that database.yml reflects the number of connections that will be pooled by a single Rails process. If you have multiple processes, or other servers like Sidekiq, you'll need to add their pools together.
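
As a rough sketch of that addition, the worst case is the sum of (number of processes × pool size) over every kind of process that talks to the database. The helper name and the process counts below are assumptions for illustration only; the 500 is the max_connections value reported in the question:

```ruby
# Hypothetical helper: worst-case number of connections the app tier can open.
# Each entry describes one kind of process: how many of them run, and the
# `pool:` value each one uses from database.yml.
def total_pooled_connections(process_groups)
  process_groups.inject(0) do |total, group|
    total + group[:processes] * group[:pool]
  end
end

# Assumed numbers, purely for illustration.
groups = [
  { name: "web workers", processes: 8, pool: 10 },
  { name: "sidekiq",     processes: 2, pool: 25 }
]

max_connections = 500 # value reported in the question
needed = total_pooled_connections(groups)

puts "worst case: #{needed} of #{max_connections} connections" # 8*10 + 2*25 = 130
warn "pools can exceed max_connections" if needed > max_connections
```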

Increase max_connections if necessary in your MySQL server config (my.cnf), assuming your kit can handle it.

[mysqld]
max_connections = 100

Note other things might be blocking too, e.g. open files, but looking at connections is a good starting point.

You can also monitor active queries:

select * from information_schema.processlist;

as well as monitoring the MySQL slow log.

One issue may be a long-running update command. If you have a slow-running command that affects a lot of records (e.g. a whole table), it might be blocking even the simplest queries. This means you could see random queries time out, but if you check MySQL status, the real cause is another long-running query.
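
If you want to spot such a query from a script, one option is to filter the processlist rows by their COMMAND and TIME columns. The sketch below is only an assumption about how you might do it: long_running is a hypothetical helper, the sample rows are made up, and in a real app the rows would come from running the processlist query above through your MySQL client:

```ruby
# Hypothetical helper: pick out statements that have been running too long.
# `rows` is an array of hashes keyed by processlist column names
# (ID, USER, HOST, DB, COMMAND, TIME, STATE, INFO).
def long_running(rows, threshold_seconds)
  rows.select do |row|
    # Idle connections show COMMAND = "Sleep"; their TIME is idle time,
    # not query time, so they are skipped.
    row["COMMAND"] != "Sleep" && row["TIME"].to_i >= threshold_seconds
  end
end

# Made-up sample rows, for illustration only.
sample = [
  { "ID" => 1, "COMMAND" => "Sleep", "TIME" => 900, "INFO" => nil },
  { "ID" => 2, "COMMAND" => "Query", "TIME" => 0,   "INFO" => "SELECT ..." },
  { "ID" => 3, "COMMAND" => "Query", "TIME" => 45,  "INFO" => "UPDATE ..." }
]

long_running(sample, 30).each do |row|
  puts "connection #{row['ID']} has been running #{row['TIME']}s: #{row['INFO']}"
end
```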

mahemoff answered Sep 28 '22 02:09
