Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MEMSQL vs. MySQL

I need to start off by pointing out that by no means am I a database expert in any way. I do know how to get around to programming applications in several languages that require database backends, and am relatively familiar with MySQL, Microsoft SQL Server and now MEMSQL - but again, not an expert at databases so your input is very much appreciated.

I have been working on developing an application that has to cross reference several different tables. One very simple example of an issue I recently had, is I have to:

  1. On a daily basis, pull down 600K to 1M records into a temporary table.
  2. Compare what has changed between this new data pull and the old one. Record that information on a separate table.
  3. Repopulate the table with the new records.

Running #2 is a query similar to:

SELECT * FROM (NEW TABLE) LEFT JOIN (OLD TABLE) ON (JOINED FIELD) WHERE (OLD TABLE.FIELD) IS NULL

In this case, I'm comparing the two tables on a given field and then pulling the information of what has changed.

In MySQL (v5.6.26, x64), my query times out. I'm running 4 vCPUs and 8 GB of RAM but note that the rest of my configuration is default configuration (did not tweak any parameters).

In MEMSQL (v5.5.8, x64), my query runs in about 3 seconds on the first try. I'm running the exact same virtual server configuration with 4 vCPUs and 8 GB of RAM, also note that the rest of my configuration is default configuration (did not tweak any parameters).

Also, in MEMSQL, I am running a single node configuration. Same thing for MySQL.

I love the fact that using MEMSQL allowed me to continue developing my project, and I'm coming across even bigger cross-table calculation queries and views that I can run that are running fantastically on MEMSQL... but, in an ideal world, i'd use MySQL. I've already come across the fact that I need to use a different set of tools to manage my instance (i.e.: MySQL Workbench works relatively well with a MEMSQL server but I actually need to build views and tables using the open source SQL Workbench and the mysql java adapter. Same thing for using the Visual Studio MySQL connector, works, but can be painful at times, for some reason I can add queries but can't add table adapters)... sorry, I'll submit a separate question for that :)

Considering both virtual machines are exactly the same configuration, and SSD backed, can anyone give me any recommendations on how to tweak my MySQL instance to run big queries like the one above on MySQL? I understand I can also create an in-memory database but I've read there might be some persistence issues with doing that, not sure.

Thank you!

like image 575
JDA Avatar asked Sep 24 '15 18:09

JDA


People also ask

Is MySQL and MemSQL are same?

MemSQL and MySQL are primarily classified as "In-Memory Databases" and "Databases" tools respectively.

What is MemSQL used for?

SingleStore (formerly MemSQL) is a cloud-native database designed for data-intensive applications. A distributed, relational, SQL database management system (RDBMS) that features ANSI SQL support, it is known for speed in data ingest, transaction processing, and query processing.

Is SQLite3 faster than MySQL?

SQLite3 is much faster than MySQL database. It's because file database is always faster than unix socket. When I requested edit of channel it took about 0.5-1 sec on MySQL database (127.0. 0.1) and almost instantly (0.1 sec) on SQLite 3.

Is SQLite3 better than MySQL?

MySQL, while also easy to use, provides a more robust set of features, better security, better performance at scale and overall better multi-user management. SQLite lacks these features and optimizations.


1 Answers

The most likely reason this happens is because you don't have index on your joined field in one or both tables. According to this article:

https://www.percona.com/blog/2012/04/04/join-optimizations-in-mysql-5-6-and-mariadb-5-5/

Vanilla MySQL only supports nested loop joins, that require the index to perform well (otherwise they take quadratic time).

Both MemSQL and MariaDB support so-called hash join, which does not require you to have indexes on the tables, but consumes more memory. Since your dataset is negligibly small for modern RAM sizes, that extra memory overhead is not noticed in your case.

So all you need to do to address the issue is to add indexes on joined field in both tables.

Also, please describe the issues you are facing with the open source tools when connect to MemSQL in a separate question, or at chat.memsql.com, so that we can fix it in the next version (I work for MemSQL, and compatibility with MySQL tools is one of the priorities for us).

like image 144
Ishamael Avatar answered Oct 17 '22 20:10

Ishamael