Optimize MySQL System Variables for High Performance

Optimize MySQL System Variables for High Performance

For most application developers, the database is an altar of demon gods that is best left untouched. But it doesn’t have to be like that!

When other things are the same, the convenience that a developer with the underlying database defines his seniority. Little database and little programming experience = up-and-coming developers; little database and good coding experience = medium-level developers; Good database and good coding experience = Senior Developers.

It’s a harsh reality that even developers at 6-8 years of age have trouble explaining the intricacies of the query optimizer, and prefer to look to the sky. Surprisingly, the reason is not laziness (although it is in some parts).

The point is that databases are a force of their own that you have to deal with. Even traditionally, if there were only the relational types of databases to deal with, it was a miracle and a career path in itself to dominate them. These days we have so many types of databases that it is simply impossible to expect a single mortal soul to dominate everything.

However, there is a good chance that you are still satisfied with relational databases or are part of a team in which a product runs satisfactorily on a relational database for a long, long time. In nine out of ten cases, you use MySQL (or MariaDB). In these cases, diving a little deeper under the hood offers enormous benefits in increasing application performance and is worth learning in every way.

Optimize the MySQL query cache

Almost all optimizations in the field of computers are due to caching. The CPU manages multiple cache levels to speed up its calculations, and on the other hand, web apps aggressively use caching solutions such as Redis to send precalculated results to users instead of accessing the database each time.

But hey, even the weak MySQL database has its own query cache! That is, every time you query something and the data is still out of date, MySQL provides these cached results instead of rerunning the query, making the app ridiculously faster.

You can verify that a query cache is available in your database (note, available, not enabled) by running this query in the database console:

MariaDB [(none)]> SHOW VARIABLES LIKE 'have_query_cache';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| have_query_cache | YES   |
+------------------+-------+

So you can see that we are running MariaDB and that query caching is available for activation. It is highly unlikely that you have disabled it if you are using a standard MySQL installation.

Now let’s see if I actually enabled the query cache:

MariaDB [(none)]> SHOW VARIABLES LIKE 'query_cache_type';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| query_cache_type | ON    |
+------------------+-------+

Yes, I want it. If you don’t, you can turn it on by saying:

MariaDB [(none)]> SET GLOBAL query_cache_type = ON;

Interestingly, this variable also accepts a third value “on-demand”, which means that MySQL only caches the queries we tell it.

This enables query caching and takes the first step towards a more robust MySQL setup! we say the first step because while powering on is a significant improvement, we need to adjust the caching of queries to our setup. So let’s learn to do that.

The other variable of interest is here, whose function is self-explanatory:query_cache_size

MariaDB [(none)]> SHOW VARIABLES LIKE 'query_cache_size';
+------------------+----------+
| Variable_name    | Value    |
+------------------+----------+
| query_cache_size | 16777216 |
+------------------+----------+

So we have a query cache of about 16 MB in size. Note that caching is disabled even if query caching is enabled, but this size is zero. Therefore, it is not enough to check only one variable. Now you should set a query cache size, but how much should it be? First, note that the query caching feature itself requires 4 KB to store its metadata. Whatever you choose should be above it.

For example, suppose you set the query cache to 500 KB:

MariaDB [(none)]> SET GLOBAL query_cache_size = 500000;

Is that enough? No, because how the query engine actually works depends on a few other things:

  • First of all, the variable must be large enough to accommodate the result of your queries. If it is too small, nothing is cached.query_cache_size
  • Second, if the number is set too high, there are two types of problems:
    1. The engine needs to do additional work to store and locate query results in this huge memory area.
    2. If most queries result in much smaller dimensions, the cache becomes fragmented and the benefits of using a cache are lost.query_cache_size

How do you know that the cache is being fragmented? Check the total number of blocks in the cache as follows:

MariaDB [(none)]> show status like 'Qcache_total_blocks';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| Qcache_total_blocks | 33    |
+---------------------+-------+

If the number is very high, the cache is fragmented and needs to be emptied.

To avoid these problems, make sure that the size is carefully chosen. If you are frustrated that we have not left you a specific number here, unfortunately, this is the case if you leave the development behind and get into technology. You need to look at the app you’re running, determine the query sizes for the important query results, and then set that number. And even then, you could make a mistake.query_cache_size

Threading, Thread Pools, Waiting, and Timeouts

This is probably the most interesting part of how MySQL works. To get it right, you need to make your app many times faster!

Threading

MySQL is a multithreaded server. That is, every time there is a new connection to the MySQL server, a new thread with the connection data is opened and a handle is passed to the client (just in case you are wondering what a thread is, see us missing the words). The client then sends all queries over this thread and receives results. This leads us to a natural question: How many threads can MySQL boot? The answer is in the next section.

Thread pool

No program in a computer system can open as many threads as it wants. The reason is twofold:

  1. Threads cost memory (RAM), and the operating system just doesn’t allow you to get angry and devour everything.
  2. Managing, for example, one million threads is a daunting task in itself. If the MySQL server could create so many threads, it would die trying to cope with the overhead.

To avoid these problems, MySQL comes with a thread pool – a fixed number of threads that are initially part of a pool. New connection requests cause MySQL to record one of these threads and return the connection data. Of course, when all threads are used up, new connections are rejected. Let’s see how big the thread pool is:

MariaDB [(none)]> show variables like 'thread_pool_size';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| thread_pool_size | 4     |
+------------------+-------+

So the computer allows a maximum of four connections at a time. It is interesting to note that number 4 comes from the fact that we have a four-core processor, which means that my computer can only perform 4 parallel tasks at a time.

Ideally, this is the limit that should determine the value, but with stronger machines, it benefits up to a point. If you don’t want to have all the new connections serviced and are okay to achieve a performance drop (this is also an area that you can best judge based on the performance of your app under load), it might be a good idea to increase it to 8.thread_pool_size

However, setting a value above 16 is a terrible idea unless you have a 32-core computer because performance is significantly degrading. The rabbit hole of the thread pools in MySQL goes deep, but if you’re interested, heres a more detailed discussion.

Waiting and time-outs

Once a thread was created and attached to a client, it would be a waste of resources if the client did not send requests in the next few seconds (or minutes). As a result, MySQL terminates a connection after a period of inactivity. This is controlled by the variable:wait_timeout

MariaDB [(none)]> show variables like 'wait%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| wait_timeout  | 28800 |
+---------------+-------+

The resulting value is in seconds. So by default MySQL is set to wait more than 8 hours before cutting the cable! This can be good if you have long-running questions and actually want to wait for them (but even then eight hours is absurd!), but in most cases terrible. When a query is executed, this value is set to 0 (which means forever). In general, however, this value should be set to a very low value (for example, 5 seconds or less) to release the connection to other processes.

Optimize temporary tables

Let’s start with the temporary tables in MySQL.

Suppose we have a MySQL that looks structurally like this: TABLE A UNION (TABLE B INNER JOIN C). In other words, we are interested in combining Tables B and C and then combining the result with Table A. Now MySQL would first join tables B and C, but before it can perform a union, it needs to store this data somewhere. This is where temporary tables come into play – MySQL uses them to temporarily store data in intermediate sections in complex queries. Once the query is finished, this temporary table is discarded.

The question now is: why should we deal with all this?

Just because the temporary table, just a query result, is data used by MySQL in the calculation, determine the speed of access (among other things), and how fast the query runs. For example, saving the temporary table to RAM is many times faster than saving it to disk.

Two variables control this behavior:

MariaDB [(none)]> show variables like 'MariaDB [(none)]> show variables like 'tmp_table_size';
+----------------+----------+

| Variable_name  | Value    |

+----------------+----------+

| tmp_table_size | 16777216 |

+----------------+----------+
';
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| max_heap_table_size | 16777216 |
+---------------------+----------+

MariaDB [(none)]> show variables like 'tmp_table_size';
+----------------+----------+
| Variable_name  | Value    |
+----------------+----------+
| tmp_table_size | 16777216 |
+----------------+----------+

The first, tells us how much RAM can be consumed by a MySQL table (“Heap” refers here to the data structure used in RAM allocation and management – read more here), while the second, shows the maximum size of the temporary table. In my case, both are set to 16 MB, although the point we are trying to increase just so will not work overall, MySQL would still be limited.max_heap_table_sizetmp_table_sizetmp_table_sizemax_table_heap_size

Now comes the point: If the temporary tables being created are larger than the limit allowed by these variables, MySQL would have to write them to disk, resulting in extremely poor performance. Our task is now simple: do our best to guess the most accurate data size for temporary tables, and adjust these variables to that limit. However, we would like to warn against absurdity: it is bad to set this limit to 16 GB (provided you have so much RAM) if most of your temporary tables are less than 24 MB in size – you just waste RAM, which could have been used by other queries or parts of the system (e.g. cache).

Conclusion

It is not possible to handle all system variables in one article or even all the important variables in an article if the MySQL documentation itself contains several thousand words. While we have covered some universal variables here, we would encourage you to look at the system variables for the engine you are using (InnoDB or MyISAM).

My most desirable result when writing this article is that you take three things away:

  • MySQL is a typical software that operates within the limits set by the operating system. It is not a mysterious program that does God-white-what and is impossible to tame. Fortunately, it is also not as difficult to understand how it is set up and controlled by its system variables.
  • There is not a single setting that will make your MySQL installation zoom in. You have no choice but to look into your running systems (remember, the optimization comes after The app is in production, not before), make the best guesses and measurements, and live with the reality that it will never be perfect.
  • Optimizing the variables isn’t the only way to optimize MySQL – efficient query writing is another big deal. But the point is, even if you have performed a godlike analysis and set these parameters optimally, you can still bring everything to a standstill.

What is your preferred system variable for voting? 

Comments

Write a Reply or Comment

Your email address will not be published. Required fields are marked *


This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.