MyQCache (MySQL Query Cache Use Analyzer)
Current Version = 1.0.0
MyQCache is a free command-line application that replays a MySQL database query log and gathers information about MySQL cache usage and execution time. MyQCache helps identify specific queries in your code that:
- ...would benefit from the "SQL_NO_CACHE" SELECT option.
- ...are possible candidates for software caching.
- ...are wasting large amounts of your database's time.
- ...are actually using the query cache well.
Long story short, you use MyQCache like this:
- Record a MySQL log full of activity from your application
- Replay the MySQL log through MyQCache.exe
- MyQCache.exe will display cache usage and execution statistics
Documentation Table of Contents
- Prerequisites
- Download
- Installation
- Usage
- Options File
- Procedure
- Results
- Example
- License
- History
- Frequently Asked Questions (F.A.Q.)
Prerequisites
Before proceeding, you should be familiar with the MySQL query cache.
Also, if a phrase like "add a 'log=d:/temp/mysql.log' statement to your MySQL 'my.ini' configuration file and restart the MySQL service" is not immediately clear, this utility is probably not for you.
Required Operating System: Windows 2000, XP or 2003 - MyQCache may work on other operating systems, but I'd be surprised.
Required Software: Microsoft .NET Framework 1.1 - This is probably already installed on your system if you are using Windows XP or 2003. Just the runtime version is required.
http://msdn.microsoft.com/netframework/downloads/updates
Download
MyQCache100.zip (less than 200 KB)
The complete source code for this application (in C#.NET for Windows) is NOT available, but source code for the C# class that does all the interesting work ("QLooker.cs") and its statistics helper ("QueryStat.cs") are included in the distribution.
Installation
Unpack the MyQCache*.zip file into a folder on your hard drive. If you would like to be able to run MyQCache from any command prompt, you will either need to add it to your PATH or copy all included files into a folder that is already in your PATH.
For example, as part of setting up any computer I usually edit my system PATH to include a directory of utilities called "D:\OSOmissions". This allows my to run any utility in this directory from anywhere when I am using a command prompt. So...I simply need to unpack all the files from my "MyQCache100.zip" file into my "D:\OSOmissions" folder.
Usage
Usage: MyQCache.exe [LogFile] {OptionsFile}
...where [LogFile] is the MySQL log file to parse and test
...and {OptionsFile} is an alternate path to the required options file.
(OptionsFile will contain your MySQL hostname and credentials.)
Options File
The required options file ("MyQCache.ini" by default) contains vital information such as the database host, name and credentials.
# Database parameters # If not provided, defaults are "host=localhost", "db=mysql" # "username=root" and "password=" host=testmysql db=mytestdb username=safeuser password=kj2435nfd # The "outformat" parameter controls how query stats are written out # Tags allowed in "outformat" string: # [NExec] - Number of times similar query ran # [SQL] - Similar query SQL # [ExecMSTot] - Total MS required to run queries # [ExecMSMissTot] - Total MS to run queries that that missed cache # [CacheMisses] - Number of missed cache queries (a.k.a. "inserts") # [CacheHits] - Number of missed cache queries (a.k.a. "inserts") # [ExecMSAvg] - Average MS required to run queries # [ExecMSMissAvg] - Average MS to run queries that missed cache # [HitPer] - Cache hit percentage outformat=[NExec]x [ExecMSAvg]ms ([ExecMSMissAvg]ms) [HitPer]% ([CacheHits]:[CacheMisses]) :: [SQL]
Procedure
MyQCache should be used as part of the following procedure:
- Set MySQL to log queries. - This will create the MySQL query log file that MyQCache will consume later. To do this, add a "log=d:/temp/mysql.log" statement to your MySQL "my.ini" configuration file and restart the MySQL service.
- Save your MySQL database. - MyQCache will replay a series of queries; to accurately replay them you should start with the same data. To do this, just make a copy of your database (e.g., "d:\mysql\data\mytestdb") to a location OUTSIDE the "mysql\data" folder (e.g., "d:\mysql\backup\mytestdb").
- Run your application. - This will generate the queries that you will analyze later. I can't tell you how to run your application, but you will eventually want to put it under a production load to get the most accurate picture of query cache usage.
- Restore your MySQL database. - To do this, stop the MySQL server, copy your backup into the "mysql\data" folder and start the MySQL server back up. (You may also want to pull the "log=..." statement from "my.ini" at this time.)
- Run MyQCache. - This is where you will run "c:\> MyQCache mysql.log" or something to that effect. Note that MyQCache will not work against a LIVE (i.e., still being written to) copy of the MySQL query log, so you should either make a copy of the query log or pull the "log=..." statement from "my.ini" first.
- Interpret MyQCache's findings. - MyQCache will display information about each type of query executed. See the next section for more information about these statistics.
Results
Before you can interpret results from MyQCache, you need to understand how it "groups" queries.
Query Grouping
MyQCache tries its best to group together queries whose only difference is a constant or two. The idea behind this is that information about how MySQL generally caches queries "like X" is more valuable than specific information about how MySQL caches query "X".
For example, MyQCache counts...
"SELECT * FROM users WHERE username='fred'"
...and...
"SELECT * FROM users WHERE username='jane'"
...as the same "SELECT * FROM users WHERE username='xxx'" query.Queries with similar "IN (..)" clauses are also grouped together. For example, MyQCache counts...
"SELECT * FROM users WHERE username IN ('fred','jane')"
...and...
"SELECT * FROM users WHERE username IN ('jack')"
...as the same "SELECT * FROM users WHERE IN ('xxx')" query.Grouped queries are for statistic purposes only; the raw queries obtained from the MySQL query log are still sent to the server.
Available Statistics
For each group of queries, MyQCache compiles the following statistics.
- [NExec] - Total number of times a query in this group was run.
- [SQL] - "Group query" SQL. Will often contain 'xxx' in place of constants.
- [ExecMSTot] - Total milliseconds required to run queries Exposed really only to let you check MyQCache's work; it's an obvious component of the related average.
- [ExecMSMissTot] - Total milliseconds required to run queries that did not use cache. Exposed really only to let you check MyQCache's work; it's an obvious component of the related average.
- [CacheMisses] - Number of missed cache queries (a.k.a. "inserts").
- [CacheHits] - Number of queries that "hit" an existing result. You want to see many of these.
- [ExecMSAvg] - Average milliseconds required to run queries.
- [ExecMSMissAvg] - Average milliseconds required to run queries that did not use cache. When this value is larger than [ExecMSAvg], the query cache is helping to reduce the amount of time this query generally takes to run.
- [HitPer] - Percentage of queries that hit the cache. Higher percentages are usually better. This value will be "na" if the related query was "uncacheable" (e.g., it was an "INSERT" query).
In addition to these, four additional statistics will be printed at the end of a MyQCache run: total queries executed, total time to execute all queries, total queries that "hit" the cache (returned a previously cached result) and total queries that "missed" the cache (inserted a new result in the cache) .
Interpreting the Results
Generally speaking, you want to see more cache hits than cache misses. The trick is in correctly weeding out ("SQL_NO_CACHE"-ing) queries that currently drop a lot of entries in the cache but rarely pull results out of the cache.
The three numbers I pay attention to are [NExec], [ExecMSMissAvg] and [HitPer]. The queries I really want to work well are those with a large [NExec] (this means they are run frequently) and those with a large [ExecMSMissAvg] (this means they take a while to run).
Here are my rules of thumb:
- If [HitPer] for a particular query group is very low (<15%?), I should consider adding "SQL_NO_CACHE" to the related query(ies) in my code. I should also go back to the MySQL query log and see there are any nearby inserts or updates performed on the same tables that could be wiping my query caches prematurely.
- If [HitPer] for a particular query group is very, very high (>98%?), I may consider caching the related values in code rather than hitting the database.
Other information can be obtained from the results, including:
- Which queries use "unprotected" constants (e.g., "WHERE A=1"). (MyQCache will not group queries with unprotected constants.)
- How much CPU time query caches save you, by query group. (Compare [ExecMSMissTot] with [ExecMSTot].)
Limitations
As MySQL is the first to admit, the MySQL query log is not a perfect representation of the ORDER in which queries were executed (although it is usually quite close). In other words MyQCache may execute some queries in a different order than they were originally executed because that is the order that MySQL recorded them in the query log.
MyQCache will ignore "constants" that are NOT surrounded by single or double quotations marks. (e.g., queries with "WHERE A=1" and "WHERE A=2" will be treated as two separate groups of queries.)
The "milliseconds" values are not 100% accurate at very small values. In other words, "0" frequently means "a little more than zero, but still pretty fast".
Example
We have a simple application that executes only five different kinds of queries. We want to figure out which, if any, of these queries would benefit from dodging the query cache. We follow the procedure to grab a MySQL debug log and end up with one that contains the following queries (the actual log would also have other information).
SELECT * FROM sessions WHERE ID='34'; SELECT * FROM users WHERE username='fred'; SELECT css FROM styles WHERE ID='12'; UPDATE sessions SET lasttouch=Now() WHERE ID='34'; SELECT * FROM sessions WHERE ID='34'; SELECT * FROM users WHERE username='fred'; SELECT css FROM styles WHERE ID='12'; UPDATE sessions SET lasttouch=Now() WHERE ID='34'; SELECT * FROM sessions WHERE ID='34'; SELECT * FROM users WHERE username='fred'; SELECT css FROM styles WHERE ID='12'; UPDATE users SET flag10='1' WHERE username='fred'; UPDATE sessions SET lasttouch=Now() WHERE ID='34'; SELECT * FROM sessions WHERE ID='34'; SELECT * FROM users WHERE username='fred'; SELECT css FROM styles WHERE ID='12'; UPDATE sessions SET lasttouch=Now() WHERE ID='34'; SELECT * FROM sessions WHERE ID='33'; SELECT * FROM users WHERE username='jane'; SELECT css FROM styles WHERE ID='12'; UPDATE sessions SET lasttouch=Now() WHERE ID='33'; SELECT * FROM sessions WHERE ID='34'; SELECT * FROM users WHERE username='fred'; SELECT css FROM styles WHERE ID='12'; UPDATE sessions SET lasttouch=Now() WHERE ID='34'; SELECT * FROM sessions WHERE ID='33'; SELECT * FROM users WHERE username='jane'; SELECT css FROM styles WHERE ID='12'; UPDATE sessions SET lasttouch=Now() WHERE ID='33'; SELECT * FROM sessions WHERE ID='33'; SELECT * FROM users WHERE username='jane'; SELECT css FROM styles WHERE ID='12'; UPDATE sessions SET lasttouch=Now() WHERE ID='33'; SELECT * FROM sessions WHERE ID='33'; SELECT * FROM users WHERE username='jane'; SELECT css FROM styles WHERE ID='12'; UPDATE sessions SET lasttouch=Now() WHERE ID='33'; SELECT * FROM sessions WHERE ID='34'; SELECT * FROM users WHERE username='fred'; SELECT css FROM styles WHERE ID='12'; UPDATE sessions SET lasttouch=Now() WHERE ID='33';We run the query log through MyQCache and get the following results. Remember, we are using the default "outformat" of:
[NExec]x [ExecMSAvg]ms ([ExecMSMissAvg]ms) [HitPer]% ([CacheHits]:[CacheMisses]) :: [SQL]10x 20ms (20ms) 0% (0:10) :: SELECT * FROM sessions WHERE ID='xxx' 10x 15ms (40ms) 70% (7:3) :: SELECT * FROM users WHERE username='xxx' 10x 2ms (10ms) 90% (9:1) :: SELECT css FROM styles WHERE ID='xxx' 10x 45ms (45ms) na% (0:0) :: UPDATE sessions SET lasttouch=Now() WHERE ID='xxx'; 1x 75ms (75ms) na% (0:0) :: UPDATE users SET flag10='xxx' WHERE username='xxx'; Executed 41 queries in 895ms with 16 queue cache hits and 14 inserts/misses.We should probably interpret these results like this:
- Our "SELECT * FROM sessions..." query never appears ("0%") to hit a result in the cache. We should probably add a "SQL_NO_CACHE" option to that query.
- Our "SELECT * FROM users..." query performs very well ("70%") under query caching.
- Our "SELECT css FROM styles..." query also performs very well ("90%") under query caching. If we continue this experiment and cache usage climbs even higher we may want to consider caching this value in memory or the session (and only looking for it at signon, for example).
- Our two "UPDATE..." queries don't use the query cache ("na%"). This is, of course, perfectly normal.
License
This software, and all accompanying files, data and materials, are distributed "AS IS" and with no warranties of any kind, whether express or implied.
History
I've been using MySQL on both Windows and Linux platforms since about 1999. About two years into my work with MySQL query caches, I was getting annoyed by the lack of real-time information about which specific queries used the query cache and which ones did not. To finally answer my own questions, I wrote this utility.
I wrote it at home in about two hours on my own free time, so I decided just to publish the utility and the doc on the web.
Frequently Asked Questions (F.A.Q.)
Q: How do I report a bug?
A: Drop me an email (see contact information at bottom of page). However, I can't promise I'll fix your bug.Q: Is support or commercial licensing of any kind available?
A: Sorry, but no.Q: Do you plan on writing a more platform-independent or Linux version?
A: Again, sorry, but no. However, you are free to port the interesting bits from the source snippet I provided.Q: Do you plan on writing a GUI or web-based version?
A: No.Q: Why isn't the complete source code available?
A: To crank this utility out in only two hours, I wrote "plumbing" code (e.g., database connection code) that would strike the casual observer as similar to code I wrote not too long ago in my work-for-hire position. To avoid any potential work-self conflicts, I omitted the "similar" code from the distribution. However, I believe any reasonably competent programmer should be able to create substitute "plumbing" classes based on behavior observed in the code snippet I do provide.