Database performance monitoring is something every DBA worth their salt should be doing on a regular basis.
Treat it as a proactive task that identifies issues early, before they become serious, and make it part of your post-deployment monitoring process.
Linux-based operating systems come bundled with a heap of great tools that you can use as a DBA to monitor the performance of your database server. If you are not happy with what you get “out of the box”, you can also find some great database monitoring tools online that are free to download.
For this post, I’m going to talk about both MySQL and Linux operating system performance monitoring tools. In many scenarios, you’ll need both types in order to get a complete understanding of where the delays are in your system.
MySQL Performance Monitoring Tools
1/ MySQL slow query log
The MySQL slow query log is absolutely brilliant for capturing slow queries hitting your MySQL databases.
You can log every query that takes longer than the threshold you specify in my.cnf, so you can analyze queries which take more than 3 seconds, for example.
Activate it in my.cnf, with customizable settings for the log location, the long query time and whether to log queries that do not use any indexes.
# slow query logging
slow-query-log = 1
slow-query-log-file = /var/log/mysql/slow-log
long-query-time = 3
log-queries-not-using-indexes = 0
Once you have been logging for a while, you can aggregate the results with the mysqldumpslow utility, optimize the worst offending queries and then monitor for improvements! 🙂
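mysqldumpslow does the aggregation for you, but the idea behind it is simple enough to sketch. The Python below is a rough illustration, not a replacement for the real utility. It assumes the usual slow-log layout with a "# Query_time:" header line before each statement, it only looks at the first line of each statement, and the sample entries (table names, users, timings) are made up for the example.

```python
import re
from collections import defaultdict

# Matches the timing header that precedes each logged statement.
QUERY_TIME = re.compile(r"^# Query_time: (\d+(?:\.\d+)?)")


def normalize(sql):
    """Replace literals so that similar queries are grouped together."""
    sql = re.sub(r"'[^']*'", "'S'", sql)          # string literals -> 'S'
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "N", sql)  # numeric literals -> N
    return " ".join(sql.split())


def summarize(lines):
    """Return (pattern, [count, total_seconds]) pairs, slowest total first."""
    totals = defaultdict(lambda: [0, 0.0])
    current_time = None
    for line in lines:
        line = line.rstrip("\n")
        match = QUERY_TIME.match(line)
        if match:
            current_time = float(match.group(1))
        elif line.startswith(("#", "SET timestamp=")) or line.lower().startswith("use "):
            continue  # other header lines and bookkeeping statements
        elif current_time is not None and line.strip():
            entry = totals[normalize(line)]
            entry[0] += 1
            entry[1] += current_time
            current_time = None  # simplification: first statement line only
    return sorted(totals.items(), key=lambda item: item[1][1], reverse=True)


# Hypothetical log entries, for illustration only.
SAMPLE = """\
# Time: 140627 11:20:01
# User@Host: app[app] @ localhost []
# Query_time: 4.2  Lock_time: 0.0 Rows_sent: 1  Rows_examined: 90000
SET timestamp=1403868001;
SELECT * FROM orders WHERE customer_id = 42;
# User@Host: app[app] @ localhost []
# Query_time: 3.5  Lock_time: 0.0 Rows_sent: 1  Rows_examined: 90000
SET timestamp=1403868060;
SELECT * FROM orders WHERE customer_id = 7;
# User@Host: app[app] @ localhost []
# Query_time: 5.0  Lock_time: 0.0 Rows_sent: 10  Rows_examined: 500
SET timestamp=1403868120;
SELECT name FROM users WHERE email = 'a@example.com';
""".splitlines()

for pattern, (count, seconds) in summarize(SAMPLE):
    print(f"{count} x {seconds:.1f}s total  {pattern}")
```

Grouping by the normalized statement rather than the raw text is what makes the report useful: the two "orders" lookups count as one pattern, so it rises to the top even though neither is the single slowest query.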
2/ MySQL Performance Schema
Introduced in version 5.5, the performance_schema database provides a way of querying internal execution of the server at run-time.
To enable it, add “performance_schema” to my.cnf. (From MySQL 5.6.6 onwards it is enabled by default.)
There are many objects to query, too many to talk about in this post. Check out the documentation here.
3/ The MySQL process list
To get an idea of how many processes are connected to your MySQL instance, what they are running and for how long, you can run SHOW FULL PROCESSLIST or alternatively read from the information_schema.processlist table.
mysql> SELECT user, host, time, info FROM information_schema.processlist;
+-------------+------------+-------+-------------------------------------------------------------------+
| user        | host       | time  | info                                                              |
+-------------+------------+-------+-------------------------------------------------------------------+
| root        | localhost  |     0 | SELECT user, host, time, info FROM information_schema.processlist |
| replication | srv1:46892 | 11843 | NULL                                                              |
+-------------+------------+-------+-------------------------------------------------------------------+
2 rows in set (0.00 sec)
4/ mtop
I love this utility. It provides a real-time view of the MySQL process list and refreshes at the interval, in seconds, that you specify when you run it.
What I really like about it is that you can have it running on one screen and, as problems occur, the threads change colour, with red indicating that something has been running for some time.
There is a great article here about how to install it on different flavours of Linux as well as some detail on how to run it.
5/ SHOW STATUS
Like SHOW PROCESSLIST, SHOW STATUS is a statement you run to get a point-in-time report on the server’s status variables.
For example, if you want to get information about the query cache, you can run:
mysql> SHOW STATUS LIKE 'Qcache%';
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| Qcache_free_blocks      | 9353       |
| Qcache_free_memory      | 93069936   |
| Qcache_hits             | 9719103977 |
| Qcache_inserts          | 1451857238 |
| Qcache_lowmem_prunes    | 897050960  |
| Qcache_not_cached       | 222234089  |
| Qcache_queries_in_cache | 20856      |
| Qcache_total_blocks     | 52497      |
+-------------------------+------------+
8 rows in set (0.00 sec)
This type of reporting can help you monitor specific areas of your MySQL instance. For example, if you wanted to know the query cache hit rate, you could get the numbers from above and calculate based on this formula:
((Qcache_hits/(Qcache_hits+Qcache_inserts+Qcache_not_cached))*100)
For more information, see this link.
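To make the hit rate formula concrete, here is a small Python sketch that applies it to the Qcache counters from the SHOW STATUS output in this post. The function name and the dictionary layout are my own choices for illustration, and it assumes a MySQL version that still has the query cache (it was removed in MySQL 8.0).

```python
def query_cache_hit_rate(status):
    """Return the query cache hit rate as a percentage.

    `status` maps variable names to values, as returned by
    SHOW STATUS LIKE 'Qcache%' (values may be strings or ints).
    """
    hits = int(status["Qcache_hits"])
    inserts = int(status["Qcache_inserts"])
    not_cached = int(status["Qcache_not_cached"])
    total = hits + inserts + not_cached
    if total == 0:
        return 0.0  # nothing has gone through the cache yet
    return hits / total * 100


# Counters taken from the SHOW STATUS output shown in this post.
status = {
    "Qcache_hits": "9719103977",
    "Qcache_inserts": "1451857238",
    "Qcache_not_cached": "222234089",
}

rate = query_cache_hit_rate(status)
print(f"Query cache hit rate: {rate:.1f}%")
```

For these numbers the hit rate works out at roughly 85%. Because the counters are cumulative since the server started, sampling them twice and working from the differences gives a better picture of current behaviour than a single reading.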
Operating System Performance Monitoring Tools
6/ top
This lists the running processes and the resources they are consuming. It updates in real time, so you can quickly gauge, at a very high level, whether any processes are consuming a large share of CPU or memory.
top - 17:33:48 up 7 min,  1 user,  load average: 0.03, 0.04, 0.04
Tasks:  64 total,   1 running,  63 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    604332k total,   379280k used,   225052k free,    11724k buffers
Swap:        0k total,        0k used,        0k free,   135064k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  809 tomcat7   20   0 1407m 149m  13m S  0.3 25.4   0:10.99 java
 1153 ubuntu    20   0 81960 1592  756 S  0.3  0.3   0:00.01 sshd
 1318 root      20   0 17320 1256  972 R  0.3  0.2   0:00.07 top
    1 root      20   0 24340 2284 1344 S  0.0  0.4   0:00.39 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0     0    0    0 S  0.0  0.0   0:00.03 ksoftirqd/0
    4 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kworker/0:0
    5 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kworker/u:0
7/ free
This utility gives you an idea of whether you have a memory issue, and it is another great tool for getting a high-level view. I like to use “free -m” as it returns the numbers in megabytes instead of the default kilobytes. The output shows memory in use, memory free and swap usage. It also shows how much memory the kernel is using for buffers and cache.
root@vm1:~# free -m
             total       used       free     shared    buffers     cached
Mem:           590        373        216          0         11        131
-/+ buffers/cache:        229        360
Swap:            0          0          0
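The number that usually matters is not the “free” column on the Mem line but free memory plus buffers and cache, since the kernel can hand that memory back to applications when they need it. As a minimal sketch, the Python below parses the “free -m” output shown in this post and adds those three columns together. It assumes the older column layout used here (separate “buffers” and “cached” columns); newer versions of free print different columns, so treat the parsing as illustrative.

```python
# Output of "free -m" as shown in this post.
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:           590        373        216          0         11        131
-/+ buffers/cache:        229        360
Swap:            0          0          0
"""


def parse_mem_line(output):
    """Return the Mem: row of free's output as a dict of column -> megabytes."""
    lines = output.splitlines()
    header = lines[0].split()
    for line in lines[1:]:
        if line.startswith("Mem:"):
            values = [int(value) for value in line.split()[1:]]
            return dict(zip(header, values))
    raise ValueError("no Mem: line found in free output")


def available_mb(mem):
    """Memory that applications could claim: free + buffers + cached."""
    return mem["free"] + mem["buffers"] + mem["cached"]


mem = parse_mem_line(FREE_OUTPUT)
print(f"{available_mb(mem)} MB of {mem['total']} MB effectively available")
```

The result, 358 MB, is within rounding distance of the 360 MB that free itself reports in the “free” column of the “-/+ buffers/cache” line, which is the same calculation done before the values are rounded to megabytes.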
8/ vmstat
This utility is very useful for monitoring many areas of the system: CPU, block I/O and swap. I find it particularly good for monitoring swap file usage.
Whilst “free” might tell you if there are any pages in the swap file, vmstat will tell you if your system is actively swapping. Computers and servers do need to use their swap file, but the less this happens, the better it is for your application’s performance.
Swap becomes a problem when it is being used constantly, which can be a sign that you don’t have enough memory installed in your system.
Run without arguments, vmstat prints a single report of averages since boot rather than a real-time view of your system. To get a fresh readout, add an interval in seconds to the command. In this example, I am specifying every 2 seconds.
root@vm1:~# vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 221324  12556 135252    0    0    93    19   40   75  1  0 98  0
 0  0      0 221324  12556 135276    0    0     0     0   34   65  0  0 100  0
 0  0      0 221324  12564 135280    0    0     0    24   38   64  0  0 100  0
 0  0      0 221324  12564 135280    0    0     0     0   32   56  0  0 100  0
 0  0      0 221324  12564 135280    0    0     0     0   33   56  0  0 100  0
 0  0      0 221324  12564 135280    0    0     0     0   30   55  0  1 100  0
 0  0      0 221324  12564 135280    0    0     0     0   35   59  0  0 100  0
The columns you are interested in are si and so under swap, which stand for “swap in” and “swap out”. These figures tell you how much memory is being read back in from the swap file on disk (si) and how much is being written out to it (so). Swapping is a very slow, I/O-intensive process, so if these numbers are consistently above zero you will want to do some optimization somewhere or add more memory.
Run “man vmstat” for a full list of features and documentation.
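If you would rather have a script watch the si and so columns than eyeball them, something along these lines could work. It is only a sketch: the function names are mine, the first sample is the idle output from this post, and the second sample is invented to show what a swapping system might look like. The first report line is skipped because it contains averages since boot rather than current activity.

```python
def swap_samples(vmstat_output):
    """Return a list of (si, so) tuples, one per vmstat report line."""
    columns = None
    samples = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            columns = fields  # column header row (vmstat repeats it periodically)
        elif columns and len(fields) == len(columns) and fields[0].isdigit():
            row = dict(zip(columns, fields))
            samples.append((int(row["si"]), int(row["so"])))
    return samples


def is_actively_swapping(samples):
    """True if any report after the first shows pages moving to or from swap."""
    return any(si > 0 or so > 0 for si, so in samples[1:])


# First lines of the vmstat output shown in this post (an idle system).
IDLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 221324  12556 135252    0    0    93    19   40   75  1  0 98  0
 0  0      0 221324  12556 135276    0    0     0     0   34   65  0  0 100  0
 0  0      0 221324  12564 135280    0    0     0    24   38   64  0  0 100  0
"""

# Hypothetical output from a system under memory pressure (made-up numbers).
BUSY = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0  20480   9120   1204  30112    3    1   110    25   60   90  5  2 90  3
 2  1  51200   5032   1100  28400  412  698  1520   940  310  420 12  9 40 39
"""

print("idle system swapping:", is_actively_swapping(swap_samples(IDLE)))
print("busy system swapping:", is_actively_swapping(swap_samples(BUSY)))
```

In practice you would feed it the captured output of a run such as “vmstat 2 10” and alert only if swapping shows up across several consecutive samples, since a brief blip is rarely a problem.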
9/ sar
I love sar! It captures a whole bunch of metrics covering CPU time, CPU queues, RAM, I/O and network activity, and gives you a point-in-time view of resource usage in the form of a historical report.
The default time between report lines is 10 minutes but you can change that. It’s great for seeing whether you have any particularly heavy areas of resource pressure at any time in the day. You can also use it as a performance monitoring tool to measure the effects of optimizations to your system.
Here are some examples. Run “man sar” for a full list of features and documentation on what each column header means.
sar -q (check CPU queue length)
11:20:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
11:30:01 AM         1       201      0.00      0.00      0.00
11:40:01 AM         1       200      0.00      0.00      0.00
11:50:01 AM         1       201      0.00      0.00      0.00
12:00:01 PM         2       201      0.00      0.00      0.00
sar -r (check RAM usage)
11:20:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
11:30:01 AM    151308   3765480     96.14     91416   1054136   2961684     49.25
11:40:01 AM    151076   3765712     96.14     91664   1054136   2961012     49.24
11:50:01 AM    150680   3766108     96.15     91888   1054148   2961152     49.24
12:00:01 PM    150704   3766084     96.15     92104   1054152   2961340     49.24
10/ iostat
This tool gives you CPU and I/O statistics for devices, partitions and network file systems. It is great for finding out where your busiest drives are, for example.
root@vm1 ~# iostat
Linux 2.6.32-431.11.2.el6.x86_64 (vm1)  06/27/2014  _x86_64_  (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.23    0.00    0.07    0.10    0.00   99.60

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              11.78       785.38       450.12 1437054564  823620760
dm-0              1.00         1.35         6.67    2472280   12211040
dm-1             64.52       783.30       441.42 1433252442  807699512
dm-2              0.00         0.00         0.02       7658      29336
dm-3              0.27         0.53         2.01     978626    3680440
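To pick out the busiest device without reading the table by eye, you could parse the device section and sort on tps (transfers per second). The Python below is a minimal sketch using the iostat output from this post. The function name is mine, and it assumes the older column layout shown here; other sysstat versions print different columns, so only the tps lookup should be relied on.

```python
# Device section of the iostat output shown in this post.
IOSTAT_OUTPUT = """\
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.23    0.00    0.07    0.10    0.00   99.60

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              11.78       785.38       450.12 1437054564  823620760
dm-0              1.00         1.35         6.67    2472280   12211040
dm-1             64.52       783.30       441.42 1433252442  807699512
dm-2              0.00         0.00         0.02       7658      29336
dm-3              0.27         0.53         2.01     978626    3680440
"""


def device_stats(iostat_output):
    """Return {device: {column: value}} parsed from iostat's device table."""
    lines = iostat_output.splitlines()
    start = next(i for i, line in enumerate(lines) if line.startswith("Device"))
    columns = lines[start].split()[1:]
    stats = {}
    for line in lines[start + 1:]:
        fields = line.split()
        if len(fields) != len(columns) + 1:
            break  # a blank line marks the end of the table
        stats[fields[0]] = dict(zip(columns, map(float, fields[1:])))
    return stats


stats = device_stats(IOSTAT_OUTPUT)
busiest = max(stats, key=lambda device: stats[device]["tps"])
print(f"Busiest device: {busiest} at {stats[busiest]['tps']} transfers/s")
```

Here the busiest device is dm-1, a device-mapper volume that sits on top of sda, which is why its read and write rates are so close to those of the physical disk.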
Finally
So there you have it – 10 really useful tools which you can utilize in your database performance monitoring efforts. There are many more but I’ve run out of time now. 🙂