NicJ.net

MySQL: TEXT vs. VARCHAR Performance

Starting with MySQL 5.0.3, the maximum field length for VARCHAR fields was increased from 255 characters to 65,535 characters.  This is good news, as VARCHAR fields, as opposed to TEXT fields, are stored in-row for the MyISAM storage engine (InnoDB has different characteristics).  TEXT and BLOB fields are not stored in-row — they require a separate lookup (and a potential disk read) if their column is included in a SELECT clause.  Additionally, the inclusion of a TEXT or BLOB column in any sort will force the sort to use a disk-based temporary table, as the MEMORY (HEAP) storage engine, which is used for temporary tables, requires.  Thus, the benefits of using a VARCHAR field instead of TEXT for columns between 255 and 65k characters seem obvious at first glance in some scenarios: potentially less disk reads (for queries including the column, as there is no out-of-row data) and less writes (for queries with sorts including the column).

Following a review of my MySQL server instance’s performance metrics (mysqltuner.sh and mysql tuning-primer.sh point out interesting things), I found that approximately 33% of the server’s temporary tables used on-disk temporary tables (as opposed in-memory tables).  On-disk temporary tables can be created for several reasons, most notably if the resulting table will be larger than the minimum of MySQL’s tmp_table_size/max_heap_table_size variables OR when a TEXT/BLOB field is included in the sort.

Since any operation that uses the disk for sorting will be noticeably slower than using RAM, I started investigating why so many of my temporary tables went to disk.  One of my website’s most heavily used MyISAM tables contained two TEXT columns, which were textual descriptions of an object, submitted by the website’s visitors.  The table schema was created prior to MySQL v5.0.3, when VARCHAR fields were limited to 255 characters or less.  For this object, I wanted to allow descriptions larger than 255 characters, so the table utilized the two TEXT fields to store this data.

So my first thought in reducing the number of on-disk temporary tables was to convert these TEXT fields to VARCHAR.  I sampled the column’s values, and found that the two fields’ maximum input sizes were currently around 8KB.  Not thinking too much about what size I wanted to support, I decided that I could set both fields to VARCHAR(30000) instead of TEXT.  I changed the fields, verified through automated tests that everything still worked, and called it a night.

Over the next two days, I noticed that there were several alarming metrics trending the wrong way.  I utilize Cacti to monitor several hundred server health metrics — it is great for showing trends and alerting about changes.  Unfortunately, it was reporting that server load, page load time and disk activity were all up significantly — especially disk writes.  Wondering if it was a fluke, I left the server alone for another day to see if it would subside, but the high load and disk writes continued consistently.  The load was causing real user-perceived performance impacts for the website’s visitors, causing average page load time to increase from 70ms to 470ms.

Here’s what the Cacti graphs looked like:

Wouldn’t you be alarmed?

Not wanting to run an intensive performance review or diagnostics on the live website, I came up with a plan for how I would diagnose what the problem was:

  1. Enable temporary lightweight tracing on the server to try to determine the source of the increased disk activity (iotop or dstat).
  2. If MySQL is causing most of the disk activity, temporarily revert the VARCHAR(30000) columns back to TEXT, as I suspected they were somehow the cause of the slowdown.
  3. Perform a deeper analysis of the problem on a separate machine.

Running iotop on the server confirmed that a majority of the disk writes were coming from MySQL.  Additionally, after I reverted the columns to TEXT, the server load and page load times went back to normal.

So why did my seemingly obvious “performance win” end up as a performance dud?  Shame on me for not testing and verifying the changes before I pushed them live!

I didn’t want to do any more diagnosis of the problem on the live Linux server — there’s no point in punishing my visitors.  I have a separate Linux development server that I could have used to dig a little deeper, but I’m more comfortable doing performance analysis on Windows, and luckily, MySQL has Windows ports.

For almost all of my performance analysis work, I use the excellent Windows Performance Tools (WPT) Kit (using ETW and xperf).  If you haven’t used the WPT and xperf tools before, there are some good guides on trace capturing using ETW and visual analysis via xperfview on MSDN.  ETW is a lightweight run-time tracing environment for Windows.  ETW can trace both kernel (CPU, disk, network and other activity) and user mode (application) events, and saves them to a .ETL file that can later be processed.  The Windows Performance team and many other teams in Windows regularly use ETW/xperf for a majority of their performance analysis.

To figure out what’s going on with our VARCHAR columns, I first needed to ensure that I could replicate the problem on my Windows MySQL machine.  I installed MySQL 5 and loaded a snapshot of my live database into the instance.

I then looked at the MySQL Slow Query Log from the real server to see what queries were taking a long time (>1 second).  There were thousands of instances of a query that looked something like this:

SELECT   t2.*
FROM     table1 t1, table2 t2
WHERE    t2.t1id = t1.id
ORDER BY t1.id
LIMIT    0, 10

Which looks innocent enough, but Table2 is the table that I changed the two TEXT fields to VARCHAR, and I’m querying all of the columns (SELECT *) from it.  Before, because of the TEXT column, this query would’ve used an on-disk temporary table for the results (because the MySQL manual tells us this is the case for results that need temporary tables and have TEXT columns).  So why is this query appearing to be so much slower now?

First of all, I checked how this query responded on my Windows MySQL instance:

mysql> SELECT t2.*
FROM table1 t1, table2 t2
WHERE t2.t1id = t1.id
ORDER BY t1.id
LIMIT 0, 10;
...
10 rows in set (1.71 sec)

This confirmed the issue appeared on my development machine as well!  This query should be nearly instantaneous (less than 50ms), and 1,710 milliseconds is a long time to wait for a single query of many in a page load.

My guess at this point was it had something to do with temporary tables.  And disks.  Since that was what I was trying to improve with my TEXT to VARCHAR change, it only makes sense that I somehow made it worse.  So to validate this theory, I enabled a bit of lightweight ETW tracing to see how the OS handled the query.

I started ETW tracing to get basic CPU and disk IO information:

xperf -on base+fileio

Then I re-ran the above query in the MySQL command line, and saved the trace to disk (trace.etl) after the query finished:

xperf -d trace.etl

Loading the trace up in xperfview showed some interesting things:

xperfview trace.etl

From a CPU usage perspective, the CPU (on my quad-core system) seemed to be pretty active, but not 100% utilized.  Looking at the CPU Sampling Summary Table, I found that mysqld-nt.exe was using 1,830ms / 17% of my CPU (essentially 68% of one of the cores).  Not bad, but not maxed out either.  But what’s interesting here was the disk utilization graph.  For a period of ~700ms, we’re 100% utilized.  Highlighting that region and viewing the summary table showed where we spent our time:

#sql_1a00_0.MYD is a temporary table from MySQL (which can be confirmed from the File IO graph).  In this case, our single query caused 38MB of disk writes and ~626ms to write/read it back in.

Huh?

At this point, I wanted to double-check that the TEXT to VARCHAR change caused this.  I updated the column to TEXT, and re-run the same query:

mysql> SELECT t2.*
FROM table1 t1, table2 t2
WHERE t2.t1id = t1.id
ORDER BY t1.id
LIMIT 0, 10;
...
10 rows in set (0.03 sec)

Well, 0.03 seconds is a lot faster than 1.71 seconds.  This is promising.  I took another ETW trace of the query on with the TEXT field:

After switching back to TEXT fields, mysql used ~30ms of CPU and caused no disk activity.

Now that I knew what was causing the slowdown, I wanted to try to fully understand why this was the case.  Remember, I started down this path originally because I found that I had a high portion of temporary tables were on-disk temporary tables.  In the interest of seeing less disk activity on my server, I attempted to change several TEXT columns (which can cause on-disk temporary tables) to VARCHAR(30000) columns.  However, I didn’t fully look into what was causing the on-disk temporary tables, and instead just guessed.  As a result, my server’s perf tanked!

Now’s a good time to review the first paragraph of this post.  There are several reasons MySQL may use an internal temporary table.  One interesting quote:

Such a [temporary] table can be held in memory and processed by the MEMORY storage engine, or stored on disk and processed by the MyISAM storage engine.

and

If an internal temporary table is created initially as an in-memory table but becomes too large, MySQL automatically converts it to an on-disk table. The maximum size for in-memory temporary tables is the minimum of the tmp_table_size and max_heap_table_size values. This differs from MEMORY tables explicitly created with CREATE TABLE: such tables, the max_heap_table_size system variable determines how large the table is permitted to grow and there is no conversion to on-disk format.

So a temporary table can start out as a MEMORY table, then if MySQL realizes it’s too big for tmp_table_size and max_heap_table_size, it may convert it to a MyISAM table on the disk.  One caveat with the MEMORY engine is:

MEMORY tables use a fixed-length row-storage format. Variable-length types such as VARCHAR are stored using a fixed length.

How did this behavior affect me, and cause the 30ms query to take 1,710 ms?  Well let’s start out with the two newly VARCHAR(30000) columns.  In a normal MyISAM table, with a dynamically sized row, these two columns only take as much space as the data they contain (plus 1 byte).  That is, if I had a row and these two columns only had 10 bytes of data in them, the row size would be (10+1)*2+[other columns].  However, if I happened to convert this MyISAM table to use fixed-length rows, or I was using the MEMORY storage engine, the row size would be 30000*2+[other columns].  Currently, according to my dataset, these dynamically sized rows only required an average of 1,648 bytes per row.

And that’s the crux of the problem.  My query above, simple enough, requires a temporary table to do its work.  We can verify this via the MySQL EXPLAIN command:

mysql> EXPLAIN SELECT t2.*
FROM table1 t1, table2 t2
WHERE t2.t1id = t1.id
ORDER BY t1.id
LIMIT 0, 25;
+----+-------------+-------+--------+---------------+---------+---------+------
| id | select_type | table | key     | rows | Extra                           |
+----+-------------+-------+--------+---------------+---------+---------+------
|  1 | SIMPLE      | t2    | NULL    | 8813 | Using temporary; Using filesort |
|  1 | SIMPLE      | t1    | PRIMARY | 1    | Using index                     |
+----+-------------+-------+--------+---------------+---------+---------+------

(I trimmed a couple columns to fit to the page’s width).

Here, we see our t2 table Using temporary.  MySQL converted the 8,813 dynamic-row columns to fixed-length, which expanded the VARCHARS to their full size: approximately 60,600 byes per-row.  That’s 8,813 rows * 60,600 bytes = 534,067,800 bytes to deal with!  The server’s tmp_table_size variables decided this wasn’t good for an in-memory temporary table, so MySQL ended up moving a lot of this work to disk.  As a result, we had ~700ms of disk writes with this query when using VARCHAR(30000) columns.

There are a couple ways to avoid this behavior in MyISAM tables:

  1. Use TEXT fields, with their known caveats.
  2. Use a smaller, more reasonable VARCHAR size.  These fields probably don’t need to hold more than 10k of data.  One could reduce their size to 10k or smaller, or even move them to another table and only retrieve them when necessary.
  3. Fiddle with the tmp_table_size and max_heap_table_size variables.  These two variables dictate which queries use on-disk temporary tables, as described here.  They are set at approximately 35mb/16mb by default (on my Windows MySQL 5.1 instance).

I made a couple changes.  I changed one of the original 30k fields to 10k, and and changed the other one to 1k.  This reduced the potential row size in MEMORY temporary tables tremendously.  I also upped the tmp_table_size and max_heap_table_size variables to 128MB on my server.  The combination of these two changes ensured that the specific query above was no longer causing all of the performance issues (for now).  I should probably move the 10k field to another table (or back to TEXT) to be sure.

I probably didn’t need to use ETW and xperf here to look into things.  Since I was aware that the changes I made to the database had a high correlation with the slow-downs I was seeing, and reverting these changes fixed the issue, I could have probably figured out what was going on by reading the MySQL manual a bit more.  But I like to dig into things, and I think xperf can help visually communicate what’s going on with your system for problems like this.  Seeing how much blocking IO a single query can cause really sheds light on things!

Admittedly, the server MySQL is running on also hosts a web server and multiple sites.  A dedicated SQL server with fast, dedicated disks would help minimize problems like this.

One interesting note is that MySQL version 5.1 on Windows doesn’t have the same IO patterns as version 5.0 did – I see the same File IO for temporary tables with the VARCHAR fields, but not the same amount of disk activity.  This could mean that 5.1 memory maps or caches a lot of the on-disk temporary file and doesn’t end up actually writing it out to disk.  I am unsure why.

And again, InnoDB has different performance characteristics.  I did not do any testing on InnoDB.

So at the end of the day, my server is back to normal, and I’ve learned a little bit more about how MySQL works.  I hope you did too.

Share this: