
Archive for the ‘Tech’ Category

Auto-ban website spammers via the Apache access_log

January 24th, 2012

During the past few months, several of my websites have been the target of some sort of SPAM attack.  After being alerted (by Cacti) that my servers were under high load, I found that a small number of IP addresses were loading and re-loading or POSTing to the same pages over and over again.  In one of the attacks, they were simply reloading a page several times a second from multiple IP addresses.  In another attack, they were POSTing several megabytes of data to a form (which spent time validating the input), several times a second. I’m not sure of their motives – my guess is that they’re either trying to game search rankings (the POSTings) or running an improperly configured robot.

Since I didn’t have anything in place to automatically drop requests from these rogue SPAMmers, the servers were coming under increasing load, causing real visitors’ page loads to slow down.

After looking at the server’s Apache access_log, I was able to narrow down the IPs causing the issue.  I then created a few iptables rules to drop all packets from those IP addresses. Within a few seconds, the load on the server returned to normal.

I didn’t want to play catch-up the next time this happened, so I created a small script to automatically parse my server’s access_logs and auto-ban any IP address that appears to be doing inappropriate things.

The script is pretty simple.  It uses tail to look at the last $LINESTOSEARCH lines of the access_log, filters them for $SEARCHTERM via grep, grabs all of the IPs via awk, sorts and counts them via sort and uniq, then looks to see if any of these IPs loaded more than $THRESHOLD pages.  If so, it does a quick query of iptables to see if the IP is already banned.  If not, it adds a single INPUT rule to DROP packets from that IP.

Here’s the code:

 
#!/bin/bash

#
# Config
#

# if more than the threshold, the IP will be banned
THRESHOLD=100

# search this many recent lines of the access log
LINESTOSEARCH=50000

# term to search for
SEARCHTERM=POST

# logfile to search
LOGFILE=/var/log/httpd/access_log

# email to alert upon banning
ALERTEMAIL=foo@foo.com

#
# Get the last n lines of the access_log, and search for the term.  Sort and count by IP, outputting the IP if it's
# larger than the threshold.
#
for ip in `tail -n $LINESTOSEARCH $LOGFILE | grep "$SEARCHTERM" | awk '{print $1}' | sort | uniq -c | sort -rn | head -20 | awk -v threshold="$THRESHOLD" '{if ($1 > threshold) print $2}'`
do
    # Look in iptables to see if this IP is already banned
    if ! iptables -L INPUT -n | grep -q $ip
    then
        # Ban the IP
        iptables -A INPUT -s $ip -j DROP
        
        # Notify the alert email
        iptables -L -n | mail -s "Apache access_log banned '$SEARCHTERM': $ip" $ALERTEMAIL
    fi
done

You can put this in your crontab, so it runs every X minutes. The script will probably need root access to use iptables.

I have the script in /etc/cron.10minutes, and a crontab entry in /etc/crontab runs all files in that directory every 10 minutes:
0,10,20,30,40,50 * * * * root run-parts /etc/cron.10minutes
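If you need to review or undo a ban later, a couple of iptables commands come in handy (1.2.3.4 is a placeholder address):

# list the current DROP rules, with rule numbers
iptables -L INPUT -n --line-numbers | grep DROP

# remove a ban by rule number (here, rule 3)
iptables -D INPUT 3

# or remove a ban by matching the original rule
iptables -D INPUT -s 1.2.3.4 -j DROP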

Warning: Ensure that the $SEARCHTERM you use will not match a wide set of pages that a web crawler (for example, Google) would see. In my case, I set SEARCHTERM=POST, because I know that Google will not be POSTing to my website, as all of the forms are excluded from crawling via robots.txt.

The full code is also available at Gist.GitHub if you want to fork or modify it. It’s a rather simplistic, brute-force approach to banning rogue IPs, but it has worked for my needs. You could easily update the script to be a bit smarter. If you do, let me know!
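For example, one easy improvement would be a whitelist, so the script never bans your own hosts or a known-good crawler. A minimal sketch (the WHITELIST addresses are just examples):

# near the top of the script: IPs that should never be banned
WHITELIST="127.0.0.1 203.0.113.10"

# inside the for loop, before the iptables check:
if echo "$WHITELIST" | grep -qwF "$ip"
then
    # skip whitelisted IPs
    continue
fi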

Mounting VHDs in Windows 7 from a command-line script

January 4th, 2012

Windows 7 has native support for VHDs (virtual hard disks) built into the OS. VHDs are great for virtual machines, native VHD booting into recent Windows OSs, or even moving whole file systems around.

While you can mount VHDs from the Windows 7 diskmgmt.msc GUI, or via vhdmount, if you need support for mounting or unmounting VHDs from the command-line on a vanilla Windows 7 / Server 2008 install, you have to use diskpart.

diskpart’s mount commands are pretty simple:

C:\> diskpart
DISKPART> sel vdisk file="[location of vhd]"
DISKPART> attach vdisk

Unmounting is just as simple:

C:\> diskpart
DISKPART> sel vdisk file="[location of vhd]"
DISKPART> detach vdisk

These commands work fine on an ad-hoc basis, but I had the need to automate loading a VHD from a script.  Luckily, diskpart takes a single parameter, /s, which specifies a diskpart “script”.  The script is simply the command you would have typed in above:

C:\> diskpart /s [diskpart script file]

I’ve created two simple scripts, MountVHD.cmd and UnmountVHD.cmd that create a “diskpart script”, run it, then remove the temporary file.  This way, you can simply run MountVHD.cmd and point it to your VHD:

C:\> MountVHD.cmd [location of vhd] [drive letter - optional]

Or unmount the same VHD:

C:\> UnMountVHD.cmd [location of vhd]

These files are hosted at Gist.Github if you want to use them or contribute changes.

Backing up Windows computers to a Synology NAS via SSH and rsync

January 4th, 2012

I recently purchased a Synology DS1511+ to act as a NAS (network attached storage) for my home network. The 5-drive, Linux-powered device is beautiful – small, sleek and quiet. What sold me was the amazing web-based configuration interface, and the ability to access the device remotely via the web or from the mobile apps Synology provides in the iTunes App Store and Android Market.

After setting it up with a couple of 2TB and 3TB drives, I wanted to use the device to back up documents from several Windows computers I manage (my own, my wife’s netbook and my parents’ computers thousands of miles away). Local network backup is pretty easy – you can use the Synology Data Replicator to back up Windows hosts to your Synology on your local network. However, it seemed pretty slow to me, and it doesn’t use the highly optimized rsync protocol for backing up files. Since I was previously using rsync over SSH to a Linux server I run at home, I figured that since the Synology is Linux-based, it should be able to do the same.

All it takes is a few updates to the Synology server and a few scripts on the Windows computers you want to back up. This works for computers on your home network as well as external computers, as long as they know the address of the remote server. You can use a dynamic-IP service such as TZO.com or DynDNS.org so your remote Windows clients know how to contact your home Synology.

Once I got it all working, I figured the process and scripts I created could be used by others with a Synology NAS (or any server or NAS running Linux). I’ve created a GitHub repository with the scripts and instructions so you can set up your own secure backup for local and remote Windows computers:

https://github.com/nicjansma/synology-windows-ssh-rsync-backup

Features

  • Uses rsync over ssh to securely back up your Windows hosts to a Synology NAS.
  • Each Windows host gets a unique SSH private/public key that can be revoked at any time on the server.
  • The server limits the SSH private/public keys so they can only run rsync, and can’t be used to log into the server.
  • The server also limits the SSH private/public keys to a valid path prefix, so rsync can’t destroy other parts of the file system.
  • Windows hosts can back up to the Synology NAS whether they’re on the local network or on a remote network, as long as the outside IP/port are known.

NOTE: The backups are performed via the Synology root user’s credentials, to simplify permissions. The SSH keys are only valid for rsync, and are limited to the path prefix you specify. You could change the scripts to back up as another user if you want (config.csv).
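For reference, the repository’s server/validate-rsync.sh (uploaded in step 4 below) is what enforces those restrictions. Purely as a sketch of the idea, and not the exact file, a validation script along these lines checks the command the client asked to run and the path prefix passed from authorized_keys:

#!/bin/sh
# $1 = allowed path prefix, passed from the "command=" entry in authorized_keys
# $SSH_ORIGINAL_COMMAND = the command the connecting client asked to run
case "$SSH_ORIGINAL_COMMAND" in
    *\&*|*\;*|*\|*)
        # reject attempts to chain or pipe commands
        echo "Rejected" 1>&2
        ;;
    "rsync --server"*"$1"*)
        # only allow rsync server commands that target the allowed prefix
        $SSH_ORIGINAL_COMMAND
        ;;
    *)
        echo "Rejected" 1>&2
        ;;
esac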

Synology NAS Setup

  1. Enable SSH on your Synology NAS if you haven’t already. Go to Control Panel – Terminal, and check “Enable SSH service”.
  2. Log into your Synology via SSH.
  3. Create a /root/.ssh directory if it doesn’t already exist
    mkdir /root/.ssh
    chmod 700 /root/.ssh
  4. Upload server/validate-rsync.sh to your /root/.ssh/validate-rsync.sh. Then chmod it so it can be run:
    chmod 755 /root/.ssh/validate-rsync.sh
  5. Create an authorized_keys file for later use:
    touch /root/.ssh/authorized_keys
    chmod 600 /root/.ssh/authorized_keys
  6. Ensure private/public key logins are enabled in /etc/ssh/sshd_config.
    vi /etc/ssh/sshd_config

    You want to ensure the following lines are uncommented:

    PubkeyAuthentication yes
    AuthorizedKeysFile .ssh/authorized_keys
  7. You should reboot your Synology to ensure the settings are applied:
    reboot
  8. Set up a share on your Synology NAS for backups (eg, ‘backup’).

Client Package Preparation

Before you back up any clients, you will need to make a couple of changes to the files in the client/ directory.

  1. First, you’ll need a few binaries (rsync, ssh, chmod, ssh-keygen) on your system to facilitate the ssh/rsync transfer. Cygwin can be used to accomplish this. You can easily install Cygwin from http://www.cygwin.com/. After installing, pluck a couple files from the bin/ folder and put them into the client/ directory. The binaries you need are:
    chmod.exe
    rsync.exe
    ssh.exe
    ssh-keygen.exe

    You may also need a couple libraries to ensure those binaries run:

    cygcrypto-0.9.8.dll
    cyggcc_s-1.dll
    cygiconv-2.dll
    cygintl-8.dll
    cygpopt-0.dll
    cygssp-0.dll
    cygwin1.dll
    cygz.dll
  2. Next, you should update config.csv for your needs:
    rsyncServerRemote - The address clients can connect to when remote (eg, a dynamic IP host)
    rsyncPortRemote - The port clients connect to when remote (eg, 22)
    rsyncServerHome - The address clients can connect to when on the local network (for example, 192.168.0.2)
    rsyncPortHome - The port clients connect to when on the local network (eg, 22)
    rsyncUser - The Synology user to backup as (eg, root)
    rsyncRootPath - The root path to back up to (eg, /volume1/backup)
    vcsUpdateCmd - If set, the version control system command to run prior to backing up (eg, svn up)
  3. The version control update command (%vcsUpdateCmd%) can be set to run a version control update on your files prior to backing up. This can be useful if you have a VCS repository that clients can connect to. It allows you to make remote changes to the backup scripts, and have the clients get the updated scripts without you having to log into them. The scripts are updated each time start-backup.cmd is run. For example, you could use this command to update from a svn repository:
    vcsUpdateCmd,svn up

    If you are using a VCS system, you should ensure you have the proper command-line .exes and .dlls in the client/ directory. I’ve used Collab.net’s svn.exe and lib*.dll files from their distribution (http://www.collab.net/downloads/subversion/).

    During client setup, you simply need to log into the machine, check out the repository, and set up a scheduled task to do the backups (see below). Each time a backup is run, the client will update its backup scripts first.

The client package is now set up! If you’re using %vcsUpdateCmd%, you can check the client/ directory into your remote repository.

Client Setup

For each client you want to back up, you will need to do the following:

  1. Generate a private/public key pair for the computer. You can do this by running ssh-keygen.exe, or have generate-client-keys.cmd do it for you:
    generate-client-keys.cmd

    or

    generate-client-keys.cmd [computername]

    If you run ssh-keygen.exe on your own, you should name the files rsync-keys-[computername]:

    ssh-keygen.exe -t dsa -f rsync-keys-[computername]

    If you run ssh-keygen.exe on your own, do not specify a passphrase, or clients will need to enter it every time they back up.

  2. Grab the public key out of rsync-keys-[computername].pub, and put it into your Synology backup user’s .ssh/authorized_keys:
    vi ~/.ssh/authorized_keys

    You will want to prefix the authorized key with your validation command. It should look something like this:

    command="[validate-rsync.sh location] [backup volume root]" [contents of rsync-keys-x.pub]

    For example:

    command="/root/.ssh/validate-rsync.sh /volume1/backup/MYCOMPUTER" ssh-dss AAAdsadasds...

    This ensures that the public/private key is only used for rsync (and can’t be used as a shell login), and that the rsync starts at the specified root path and no higher (so it can’t destroy the rest of the filesystem).

  3. Copy backup-TEMPLATE.cmd to backup-[computername].cmd
  4. Edit the backup-[computername].cmd file to ensure %rsyncPath% is correct. The following DOS environment variable is available to you, which is set in config.csv:
    %rsyncRootPath% - Remote root rsync path

    You should set rsyncPath to the root remote rsync path you want to use. For example:

    set rsyncPath=%rsyncRootPath%/%COMPUTERNAME%

    or

    set rsyncPath=%rsyncRootPath%/bob/%COMPUTERNAME%

    %rsyncRootPath% is set in config.csv to your Synology backup volume (eg, /volume1/backup), so %rsyncPath% would evaluate to this if your current computer’s name is MYCOMPUTER:

    /volume1/backup/MYCOMPUTER

    You can see this is the same path that you put in the authorized_keys file.

  5. Edit the backup-[computername].cmd file to run the appropriate rsync commands. The following DOS environment variables are available to you, which are set in start-backup.cmd:
    %rsyncStandardOpts% - Standard rsync command-line options
    %rsyncConnectionString% - Rsync connection string

    For example:

    set cmdArgs=rsync %rsyncStandardOpts% "/cygdrive/c/users/bob/documents/" %rsyncConnectionString%:%rsyncPath%/documents
    echo Starting %cmdArgs%
    call %cmdArgs%
  6. Copy the client/ directories to the target computer, say C:\backup. If you are using %vcsUpdateCmd%, you can check out the client directory so you can push remote updates (see above).
  7. Set up a scheduled task (via Windows Task Scheduler) to run start-backup.cmd as often as you wish.
  8. Create the computer’s backup directory on your Synology NAS:
    mkdir /volume1/backup/MYCOMPUTER

The client is now set up!

Source

As noted above, the source for these scripts is available on Github:

https://github.com/nicjansma/synology-windows-ssh-rsync-backup

If you have any suggestions, find a bug or want to make contributions, please head over to GitHub!

Unofficial LEGO Minifigure Catalog App now available in Apple AppStore

December 16th, 2011

Our Unofficial LEGO Minifigure Catalog App was just approved by Apple and is now available in the AppStore!

The Unofficial LEGO Minifigure Catalog App

December 12th, 2011

I’m happy to announce the release of a new project I’ve been working on, The Unofficial LEGO Minifigure Catalog App.  Earlier this year, Dr. Christoph Bartneck released a new book titled The Unofficial LEGO Minifigure Catalog.  The book contains high-quality photographs of all 3,600 minifigures released between the 1970s and 2010.  Dr. Bartneck also introduces a new nomenclature for identifying and categorizing minifigures.  It’s a great book for LEGO fans, and is available from Amazon.

Since its release, I have been working with Dr. Bartneck on a mobile application that highlights all of the great content in the book.  Today, the app is available in the Android Market, and the iOS version has been submitted for review.  We think the app is a great companion for the book.

Features

  • More than 3650 Minifigures and 650 Heads listed
  • High-resolution photographs of every Minifigure
  • Thousands of LEGO sets listed
  • Browse by theme or year
  • Search by name
  • Manage favorite Minifigures
  • Mark the Minifigures you own
  • Import and export with Brickset.com account
  • Advanced downloading and caching technology
  • Regular updates

Screen Shots

Here are screenshots from an Android device:

Availability

The app is available today in the Android Market:

The iOS version (iPod, iPhone, iPad) will be available as soon as Apple approves it.

Please let us know what you think!

Amazon S3/CloudFront 304s stripping Cache-Control headers

October 7th, 2011

TL;DR: Beware of relying on Cache-Control: max-age and Expires HTTP header fallback behavior on Amazon CloudFront. The Cache-Control header may get stripped on CloudFront 304s, and browsers will then have to fall back to whatever is in the Expires header. If that Expires date has passed, or if you never specified it, all subsequent requests for the resource will be conditionally validated by the browser.

Update 2011-12-18: The Amazon CloudFront team has fixed the issue!

The Problem

I was looking at my web server’s health metrics recently (via Cacti), and noticed a spike in outbound traffic and HTTP requests. Analytics logs didn’t show a large increase in visitors or page loads, and it looked like the additional traffic was simply an increase in requests for static images.

The Investigation

The static images in question have HTTP Cache headers set for 1 year into the future, so they can be easily cached by browsers and proxies per performance best-practices. The suggested way to set a far-future expiry date is to set both the Cache-Control header (eg, Cache-Control: public, max-age=31536000) and an Expires header with a static date set for the same time in the future. The Expires header is an HTTP/1.0 header that sets a specific date, say Jan 1 2011, whereas Cache-Control is relative, in seconds. Theoretically, if both Cache-Control and Expires headers are sent, the Cache-Control header should take precedence, so it’s safe to additionally set Expires for fall-back cases.

This combination of caching headers works well if you are using Amazon’s CloudFront CDN, backed by static files on Amazon S3, which is what I use for several sites. The files are uploaded once to S3, and their HTTP headers are set at upload time.  For the static images, I upload them with a 1-year max-age expiry and an Expires header 1 year from when they’re uploaded. For example, I uploaded an image to S3 on Oct 5 2010 with these headers:

Cache-Control: public, max-age=31536000
Expires: Thu, 05 Oct 2011 22:45:05 GMT

Theoretically, HTTP/1.1 clients (current web browsers) and even ancient HTTP/1.0 proxies should both be able to understand these headers. Even though the Expires header was for Oct 5 2011 (a couple days ago), Cache-Control should take precedence and the content should still be fresh for all current web browsers that recently downloaded the file. HTTP/1.0 proxies will only understand the Expires header, and they may want to conditionally validate the content if the date is past Oct 5 2011, but they should be a small part of HTTP accesses.

So my first thought was that the additional load on the server was from HTTP/1.0 proxies re-validating the already-expired content since I had set the content to expire in 1 year and that date had just passed. I should have set a much-further expiry in the first place — these images never change. To fix this, I could easily just re-upload the content with a much longer Expires (30 years from now should be sufficient).

However, as I was investigating the issue, I noticed via the F12 Developer Tools that IE9 was conditionally validating some of the already-expired images, even though the Cache-Control header should have taken precedence. Multiple images were being conditionally re-validated (incurring an HTTP request and 304 response) for every IE session. All of these images had an Expires header date that had recently passed.

After I cleared my IE browser cache, the problem no longer repro’d. It was only after I happened to F5 the page (refresh) that the past-Expires images were being conditionally requested again on subsequent navigations.

The Repro

Take, for example, this request of a static file on my webserver that expired back on Jan 1, 2010:

GET /test/test-public.txt HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Host: cf.nicj.net
HTTP/1.0 200 OK
Date: Sat, 08 Oct 2011 02:28:03 GMT
Cache-Control: public, max-age=946707779
Expires: Fri, 01 Jan 2010 00:00:00 GMT 
Last-Modified: Sat, 08 Oct 2011 02:25:58 GMT
ETag: "098f6bcd4621d373cade4e832627b4f6"
Accept-Ranges: bytes
Content-Type: text/plain
Content-Length: 4
Server: AmazonS3

IE and other modern browsers will download this content today, and treat it as fresh for 30 years (946,707,779 seconds), due to the Cache-Control header taking precedence over the Jan 1, 2010 Expires header.

The problem comes when, for whatever reason, a browser conditionally re-validates the content (via If-Modified-Since). Here are IE’s request headers and Amazon’s CloudFront response headers:

GET /test/test-public.txt HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Host: cf.nicj.net
If-Modified-Since: Sat, 08 Oct 2011 02:28:03 GMT
HTTP/1.0 304 Not Modified
Date: Sat, 08 Oct 2011 02:31:54 GMT
Content-Type: text/plain
Expires: Fri, 01 Jan 2010 00:00:00 GMT
Last-Modified: Sat, 08 Oct 2011 02:25:58 GMT
ETag: "098f6bcd4621d373cade4e832627b4f6"
Age: 232

We see the additional If-Modified-Since in the request, and the same Expires date in the response. Unfortunately, there’s an important missing header in this response: the Cache-Control header. It appears, at least from my testing, that CloudFront strips the Cache-Control header from 304 responses.

After this happens, it appears that IE forgets the original Cache-Control header, so all subsequent navigations to the page will trigger conditional GETs for those resources.  Since the 304 is missing the Cache-Control header, IE just sees the Expires date, and thinks it needs to re-validate the content from then on.
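You can reproduce this from the command line with curl. The URL is the test file from above, and the If-Modified-Since value comes from the Last-Modified of the initial 200 response (this was the behavior at the time of writing, before Amazon’s fix):

# initial request: the 200 response includes both Cache-Control and Expires
curl -s -D - -o /dev/null http://cf.nicj.net/test/test-public.txt

# conditional request: the 304 response comes back without Cache-Control
curl -s -D - -o /dev/null \
  -H "If-Modified-Since: Sat, 08 Oct 2011 02:25:58 GMT" \
  http://cf.nicj.net/test/test-public.txt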

Why This Would Happen

But what’s causing the re-validation (If-Modified-Since) and subsequent 304 in the first place?

User agents shouldn’t normally re-validate these resources, since the original Cache-Control header should keep them fresh for quite a while.  The exceptions are when you force a refresh (F5) of the page, or when the content has passed its natural freshness lifetime.

On F5 refresh, all resources on the page are conditionally re-validated via If-Modified-Since. And, as we’ve seen, the resources on CloudFront are sent back missing the original Cache-Control header, and IE updates its cache with just the Expires tag, instead of keeping the resource still fresh for a year. For some reason, this doesn’t occur with Chrome or Firefox on F5.

In addition, the problem will appear in all browsers when they need to send an If-Modified-Since header to re-validate content they think might have expired, such as content whose max-age has passed (shorter-expiring content).

Take, for example, a resource that you set to expire 1 day from now, and either set the Expires header to 1 day from now (per best practices) or simply don’t specify the Expires header:

Cache-Control: public, max-age=86400

For the first 24 hours after your visitor loads the resource, modern browsers won’t re-validate it. At hour 24 and 1 second, the browser will send a conditional request. Unfortunately, with CloudFront, the 304 response will be missing the Cache-Control header.  The browser then doesn’t realize that the resource should be fresh for another 24 hours. So even if the content wasn’t actually updated after those 24 hours, all subsequent navigations will trigger a conditional validation of that resource, since the original Cache-Control headers were lost with the 304. Ugh.

How to Avoid the Issue

Note that this doesn’t appear to affect Chrome 14 and Firefox 6 in the F5 scenario. Both browsers send conditional If-Modified-Since headers on F5 and get back the same CloudFront response (sans Cache-Control headers), but they don’t appear to be affected by the missing Cache-Control header. Subsequent navigations in Chrome and Firefox after an F5 do not conditionally re-validate the CloudFront content. They are, however, affected by the missing Cache-Control header for naturally stale content on If-Modified-Since requests.

I haven’t investigated the F5 problem on pre-IE9 versions, but I would assume the problem exists there as well. As far as I can tell, this isn’t fixed in IE10 beta.

I’ve only found this problem on CloudFront’s CDN servers. I couldn’t find a way to get Apache to naturally skip the Cache-Control header for 304s if the header was in the original HTTP 200 response (for example, when using mod_expires on static content).

The bottom line is that requests that send an If-Modified-Since to CloudFront and get a 304 back will essentially lose the Cache-Control hints. If your Expires header is missing, or in the past, the resource will be conditionally validated on every page navigation until it gets evicted from the cache. That can cause a lot of unnecessary requests and will slow down your visitors’ page loads.

The simple solution is to use a much-further expiry time.  30 years should suffice. Then, if the original Cache-Control header is lost from CloudFront 304s, the 30-year-from-now Expires header will keep the resource from having to be validated.
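If you manage your S3 objects from the command line, you can set both headers at upload time. For example, s3cmd lets you attach arbitrary headers to an upload (the bucket and file names below are placeholders, and the Expires value is just an arbitrary far-future date):

# upload with a ~30-year max-age and a matching far-future Expires
s3cmd put --acl-public \
  --add-header="Cache-Control: public, max-age=946707779" \
  --add-header="Expires: Fri, 01 Jan 2038 00:00:00 GMT" \
  image.png s3://my-bucket/images/image.png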

I’m not sure why Amazon CloudFront strips the Cache-Control header from 304 responses. I’ll follow up with them.

Back to my original problem: I think it’s actually Amazon’s CloudFront servers noticing that the Expires dates for a lot of my static images are past due.  They’re checking the origin server to see if any new content is available.  The above issue isn’t likely causing a ton of additional load, but it was interesting to find nonetheless!

Update 2011-10-12: I’ve opened a thread on the Amazon CloudFront forums here.  The team has responded saying they’re looking into the issue.

Update 2011-12-18: The Amazon CloudFront team has fixed the issue!

TiskTasks for Todoist

August 8th, 2011

Two years ago I gushed over how much I love todoist.com. I still use Todoist every day, for both work and personal tasks. It’s an amazingly simple, clean, user-friendly website that has kept me mentally organized for several years.

Yesterday, I released my second Android app, TiskTasks for Todoist.  TiskTasks attempts to mimic the beautiful simplicity of the todoist.com website in a touch-friendly format.  This is my first paid app ($0.99), and I’m pretty happy with how it’s turned out.

If you’re a Todoist user, please consider TiskTasks.

TiskTasks is available on the Android Market and via the Amazon Android Appstore.

MySQL: Converting an incorrect latin1 column to utf8

April 17th, 2011

I recently stumbled across a major character encoding issue on one of the websites I run.  Through resolving the issue, I learned a lot about the complexities of supporting international character sets in a LAMP (Linux, Apache, MySQL, PHP) environment.  Fixing the problem was a challenge, so I wanted to share some of the knowledge I gained in case anyone else finds similar issues on their own websites.

The post below is a long yet detailed account of my experience.  Let me know if you’ve had similar experiences or found another solution for this type of issue.

Oh My Münchhausen!!!

A couple of days ago I was notified by a visitor of one of my websites that searching for a term with a non-ASCII character in it (in this case, “Münchhausen”) was returning over 500 results, though none of the results actually matched the given search term.

I started looking into the issue, and saw the same thing he was. The debug logs from the search page showed the following SQL query being used:

mysql> SELECT * FROM MyTable WHERE city = 'Münchhausen';
...
(7ms, 532 rows affected)

However, none of the results actually contained “Münchhausen” for the city.

I took the exact same query and ran it in the command-line mysql client. Strangely, this returned a different result:

mysql> SELECT * FROM MyTable WHERE city = 'Münchhausen';
(0ms, 0 rows affected)

The exact same query, run instead from the command line, returned 0 rows.

Assuming this had something to do with the “ü” character, I started a long journey of re-learning what character encodings are all about, including what UTF-8, latin1 and Unicode are, and how they are used in MySQL.

What’s Going On?

latin1, AKA ISO 8859-1, is the default character set in MySQL 5.0. latin1 is a single-byte character encoding, as opposed to UTF-8, which is a variable-length, multi-byte character encoding. latin1 can represent most of the characters in the English and European alphabets with just a single byte (up to 256 characters at a time). UTF-8, on the other hand, can represent every character in the Unicode character set (over 109,000 currently) and is the best way to communicate on the Internet if you need to store or display any of the world’s various characters.

You can specify a default character set per MySQL server, database, or table. The defaults for a database will get applied to new tables, and the defaults for a table will get applied to new columns. You can change the defaults at any time (ALTER TABLE, ALTER DATABASE), but they will only get applied to new tables and columns.

You can also specify the character set you’re using for client connections (via the command line, or through an API like PHP’s mysql functions).
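For example, you can inspect or change those defaults from the shell, and tell the mysql client which character set to use for its connection (the database and table names below are placeholders; changing a default only affects columns created afterward):

# change the default character set used for new tables / new columns
mysql -u root -p -e "ALTER DATABASE mydatabase DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci"
mysql -u root -p -e "ALTER TABLE mydatabase.MyTable DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci"

# connect with a specific client character set
mysql --default-character-set=utf8 -u root -p mydatabase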

Getting back to the Münchhausen Problem, one of the things I initially checked was what character set PHP was talking to MySQL with:

mysql>  SHOW VARIABLES LIKE 'character_set_client';
+----------------------+--------+
| Variable_name        | Value  |
+----------------------+--------+
| character_set_client | latin1 |
+----------------------+--------+
1 row in set (0.00 sec)

Knowing the character “ü” is represented differently in latin1 versus UTF-8 (see below), and taking a wild stab in the dark, I tried to force my PHP application to use UTF-8 when talking to the database to see if this would fix the issue:

mysql_set_charset('utf8');

Voila! The problem was fixed! Searching for “Münchhausen” on the site returned 0 results (the correct number of matches).

The Problem With “ø” Is That It’s Not “ø”

Or was it? A couple of minutes later, I was browsing the site and started coming across funky characters everywhere. For example, a page that previously had the text “Graffiti by Dolk and Pøbel” was now reading “Graffiti by Dolk and PÃ¸bel”. These character sequences also looked like an issue I had noticed from time to time in phpMyAdmin, where edit fields showed strange characters. As long as I didn’t edit the strange characters, they displayed correctly when PHP spit them back out as HTML, so I hadn’t thought much of it until now.

Seeing these strange character sequences everywhere scared me enough to look into the problem a bit more. I disabled the call to mysql_set_charset() and the site reverted to the previous “correct” behavior of talking to the server via latin1 and displaying “Graffiti by Dolk and Pøbel”.

Back to square one.

It’s All Bits From MySQL’s Point of View

Since the term “Münchhausen” was returning inappropriate results, I tried other search terms that contained non-ASCII characters. For example, I searched for the city “São Paulo”:

mysql> SELECT city FROM MyTable WHERE city = 'São Paulo';
+-----------+
| city      |
+-----------+
| Sao Paulo |
| Sao Paulo |
| Sao Paulo |
...
| Sao Paulo |
(2ms, 72 rows affected)

As you can see, the search term kind-of worked. It found occurrences of “Sao Paulo” but not “São Paulo”. I know there are rows with São in the database, so the query wasn’t working 100% correctly.

I changed the query slightly to a wildcard match instead of the non-ASCII character:

mysql> SELECT city FROM MyTable WHERE city LIKE 'S%o Paulo';
+------------+
| city       |
+------------+
| Sao Paulo  |
| Sao Paulo  |
| Sao Paulo  |
...
| São Paulo |
| São Paulo |
(5ms, 348 rows affected)

This search worked a bit better — it found rows with cities of both Sao Paulo and São Paulo. However, it returned the character sequence “ã” for São Paulo for some reason.

The reason for this is, from MySQL’s point of view, the data stored within its tables are all just bits. The column type and character set of a column determine how queries work against the data and how the data is returned as a result of a SELECT query.

It’s probably pretty obvious by now that my city column wasn’t the right character set. It was set to latin1 when the database was created.

The interesting thing is that my web application, which uses PHP, didn’t seem to mind this very much. Storing and retrieving from the city column is binary-safe — that is, MySQL doesn’t modify the data PHP sends it via the mysql extension.   Later, MySQL will give PHP the exact same data (bits) back.  The problems only occur when you ask MySQL to, on its own, analyze the column or present it.

So all this time, my PHP web application had been storing UTF-8-encoded data in the city column, and later retrieving the exact same (binary) data, which it displayed on the website. MySQL doesn’t modify the data for simple UPDATEs and SELECTs, so the UTF-8 characters were all still displayed properly on the website.

To fix the above SQL query, we can actually force MySQL to re-interpret the data as a specific character encoding by first converting the data to a BINARY type then casting that as UTF-8.  If you try to simply CONVERT USING utf8, MySQL will helpfully convert your garbage-latin1 characters to garbage-utf8 characters.  The two-step process of temporarily converting to BINARY ensures that MySQL doesn’t try to re-interpret the column in the other character encoding.  If we don’t convert to BINARY, MySQL would end up displaying the same ã characters even in UTF-8 output.

So we CAST to BINARY temporarily first, then CONVERT this USING UTF-8:

mysql> SELECT city, CONVERT(CAST(city as BINARY) USING utf8) FROM MyTable WHERE city = 'São Paulo';
+------------+-------------+
| city       | CONVERT...  |
+------------+-------------+
| Sao Paulo  | Sao Paulo   |
| Sao Paulo  | Sao Paulo   |
...
| São Paulo | São Paulo   | ...

Success!  We’ve tricked MySQL into giving us the UTF-8 interpretation of our latin1 column on the fly, and we see that São Paulo is represented properly.

Why do they differ?

Some background: Why is “ã” represented differently in latin1 vs UTF-8?

The character ã in latin1 is character code 0xE3 in hex, or 227 in decimal. Its 8 bits would be represented as:

1110 0011

latin1 is a single-byte encoding, so each of the 256 characters are just a single byte.

The UTF-8 encoding was designed to be backward-compatible with ASCII documents, for the first 128 characters. For characters above #128, a multi-byte sequence describes the character.

Our character ã, #227, falls outside ASCII’s first 128 characters and must be represented in two bytes, as described on the Wikipedia UTF-8 page.

Here’s a representation of the character ã in both encodings:

   0000 0000 1110 0011 latin1 character (0x00E3)
-> 1100 0011 1010 0011 UTF-8 encoding (0xC3A3)

UTF-8 encoding turns our ã, represented as 0xE3 in latin1, into two bytes, 0xC3A3 in UTF-8.

Note that these two bytes 0xC3 and 0xA3 in UTF-8 happen to look like this in latin1:

0xC3 = Ã in latin1

0xA3 = £ in latin1

So the UTF-8 encoding of ã explains precisely why we see it reinterpreted as ã in latin1.

What To Do, What To Do

At this point, it’s obvious that I messed up somewhere. More precisely, the city column should be UTF-8, since PHP has always been putting UTF-8 data in it. The big reason I hadn’t noticed an issue up to this point is that while the MySQL column is latin1, my PHP app was getting this data and calling htmlentities to convert the UTF-8 characters to HTML codes before displaying them. My website’s visitors saw proper UTF-8 characters on the website even though the MySQL column was latin1.

As we’ve seen, issues start occurring when you do queries against the data. The SELECT above was using a UTF-8 character “ü” for “Münchhausen”, and when comparing this to “latin1” data in the column, MySQL gets confused (can you blame it?). Additional issues can appear with applications that display the “natural” encoding of the column (such as phpMyAdmin): they show the strange character sequences as seen above, instead of UTF-8 decoded characters.

The core of the problem is that the MySQL database was created several years ago and the default collation at the time was latin1_swedish_ci. Over the years, I changed the default to utf8_general_ci for new columns, but existing tables and columns weren’t changed.  I have over 100 tables in latin1 that should be UTF-8 and need to be converted.

So I started investigating what it takes to convert my existing latin1 tables to UTF-8 as appropriate. Once I set the character encoding properly, queries against the database should work better and I shouldn’t have to worry about these types of issues in the future.

There are a couple of ways to make the conversion.

Some people have successfully exported their data to latin1, converted the resulting file to UTF-8 via iconv or a similar utility, updated their column definitions, then re-imported that data. Unfortunately this requires taking the database down as tables are dropped and re-created, and this can be a bit time-consuming.
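A rough sketch of that offline approach is below (credentials and database names are placeholders). Note that the iconv step only applies if the data really is latin1; in my case the bytes were already UTF-8, so the dump would only need its charset declarations re-labeled rather than converted:

# dump the data, telling mysqldump to leave the bytes as latin1
mysqldump --default-character-set=latin1 -u dbuser -p mydatabase > dump.latin1.sql

# convert the dump from latin1 to UTF-8
iconv -f ISO-8859-1 -t UTF-8 dump.latin1.sql > dump.utf8.sql

# update the charset declarations inside the dump, then re-import
sed -i 's/CHARSET=latin1/CHARSET=utf8/g' dump.utf8.sql
mysql --default-character-set=utf8 -u dbuser -p mydatabase < dump.utf8.sql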

I was hoping for a process that I could apply to an online database, and luckily I found some good notes by Paul Kortman and fabio, so I combined some of their ideas and automated the process for my site. I hit a couple issues along the way, so I wanted to share the steps that worked for me.

The Process

Using the method described on fabio’s blog, we can convert latin1 columns that have UTF-8 characters into proper UTF-8 columns by doing the following steps:

  1. Convert the column to the associated BINARY-type (ALTER TABLE MyTable MODIFY MyColumn BINARY)
  2. Convert the column back to the original type and set the character set to UTF-8 at the same time (ALTER TABLE MyTable MODIFY MyColumn TEXT CHARACTER SET utf8 COLLATE utf8_general_ci)

This is a similar approach to our SELECT CONVERT(CAST(city as BINARY) USING utf8) trick above, where we basically hide the column’s actual data from MySQL by masking it as BINARY temporarily.

If you simply force the column to UTF-8 without the BINARY conversion, MySQL does a data-changing conversion of your “latin1” characters into “UTF-8” and you end up with improperly converted data. Converting the column to BINARY first forces MySQL to not realize the data was in UTF-8 in the first place.

We need to convert each source column type (CHAR vs. VARCHAR vs. TEXT, etc) into its associated BINARY type (BINARY vs. VARBINARY vs. BLOB).

Additionally, the MODIFYs to BINARY and back need to retain the entire column definition. For TEXT types, a simple TEXT to BLOB conversion is sufficient. But for column definitions that have specified lengths, defaults or NOT NULL:

MyColumn VARCHAR(255) DEFAULT '' NOT NULL

We need to MODIFY keeping the same attributes, or the column definition will be fundamentally changed (see notes in ALTER TABLE). In this case, we would specify:

mysql> ALTER TABLE MyTable MODIFY MyColumn VARBINARY(255) DEFAULT '' NOT NULL;

If we don’t specify the length, default and NOT NULL, the columns aren’t the same as before the conversion.

The Script

The script at the bottom of this post automates the conversion of any UTF-8 data stored in latin1 columns to proper UTF-8 columns.

I modified fabio’s script to automate the conversion for all of the latin1 columns for whatever database you configure it to look at. It converts the columns first to the proper BINARY cousin, then to utf8_general_ci, while retaining the column lengths, defaults and NULL attributes.

Warning: This script assumes you know you have UTF-8 characters in a latin1 column. Please test your changes before blindly running the script!

Here are the steps you should take to use the script:

Step 1. Determine Which Columns Need Updating

If you’re like me, you may have a mixture of latin1 and UTF-8 columns in your databases.  Not all of the columns in my database needed to be updated from latin1 to UTF-8.  For example, some of the tables belonged to other PHP apps on the server, and I only wanted to update the columns that I knew had to be fixed.  The script will currently convert all of the tables for the specified database – you could modify the script to change specific tables or columns if you need.

Additionally, the script will only update appropriate text-based columns.  Character sets are only appropriate for some types of data: CHAR, VARCHAR, TINYTEXT, TEXT, MEDIUMTEXT and LONGTEXT. Other column types such as numeric (INT) and BLOBs do not have a “character set”.

You can see what character sets your columns are using via the MySQL Administration tool, phpMyAdmin, or even using a SQL query against the information_schema:

mysql> SELECT * FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = "MyDatabase" AND COLLATION_NAME LIKE "latin1%";
...
115 rows in set (0.03 sec)

Step 2. Test Convert the Columns

You should test all of the changes before committing them to your database.

The first thing to test is that the SQL generated from the conversion script is correct.  To do this, you can dump the structure of your database:

server> mysqldump -d -h localhost -u dbuser -p mydatabase > structure.sql

And import this structure to another test MySQL database:

server> mysql -u dbuser -p mydatabase_test < structure.sql

Next, run the conversion script (below) against your temporary database:

server> php -f mysql_convert_latin1_to_utf8.php

The script will spit out “!!! ERROR” statements if a change fails.  If you encounter ERRORs, modifications may be needed based on your requirements.  Some of the common problems are listed in Step 3.

After you run the script against your temporary database, check the information_schema tables to ensure the conversion was successful:

mysql> SELECT * FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = "MyDatabase";

As long as you see all of your columns in UTF8, you should be all set!

Step 3. Problems You May Encounter

If it were only that simple. I hit some issues along the way.

Step 3.1. FULLTEXT indexes

I have several columns with FULLTEXT indexes on them.  The ALTER TABLE to BINARY command for a column that has a FULLTEXT index will cause an error:

mysql> ALTER TABLE MyTable MODIFY MyColumn BLOB;
ERROR 1283 (HY000): Column 'MyColumn' cannot be part of FULLTEXT index

The simple solution I came up with was to modify the script to drop the index prior to the conversion, and restore it afterward:

ALTER TABLE MyTable DROP INDEX `mycolumn_fulltext`

... (convert all columns) ...

ALTER TABLE MyTable ADD FULLTEXT KEY `mycolumn_fulltext` (`MyColumn`)

There are TODOs listed in the script where you should make these changes.

Step 3.2. Invalid UTF-8 data

Since my database was over 5 years old, it had acquired some cruft over time. I’m not sure exactly how this happened, but some of the columns had data that are not valid UTF-8 encodings, though they were valid latin1 characters. I believe this occurred before I hardened my PHP application to reject non-UTF-8 data, but I’m not sure. I found this out when initially trying to do the conversion:

mysql> ALTER TABLE MyTable MODIFY MyColumn VARBINARY(3000) NOT NULL DEFAULT '';
Query OK, 21171 rows affected (0.66 sec)

mysql> ALTER TABLE MyTable MODIFY MyColumn varchar(3000) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT '';
ERROR 1366 (HY000): Incorrect string value: '\xE2\x80? fl...' for column 'MyColumn' at row 128

What’s going on?

At some point, a character sequence that contained invalid UTF-8 characters was entered into the database, and now MySQL refuses to call the column VARCHAR (as UTF-8) because it has these invalid character sequences.

I checked the HTML representation of this column in my PHP website, and sure enough, the garbage shows up there too:

… �? flown …

The � is the actual character that your browser shows. Not the best user experience, and definitely not the correct character.

I fixed that single row (via phpMyAdmin), and ran the ALTER TABLE MODIFY command again — same issue, another row. Looks like there is more than a single corrupt row.

I found a good way of rooting out all of the rows that will cause the conversion to fail. If you SELECT CONVERT(MyColumn USING utf8) as a new column, any rows that come back NULL contain values that would cause the ALTER TABLE to fail.

For example:

mysql> SELECT MyID, MyColumn, CONVERT(MyColumn USING utf8)
       FROM MyTable
       WHERE CONVERT(MyColumn USING utf8) IS NULL
...
5 rows in set, 10 warnings (0.05 sec)

This showed me the specific rows that contained invalid UTF-8, so I hand-edited them to fix the data. You could manually NULL them out using an UPDATE if you’re not afraid of losing data. I had to do this for 6 columns out of the 115 columns that were converted.  Only 30 rows in total were corrupt.

Step 4. Profit?

At this point, it may take some guts for you to hit the go button on your live database.

Personally, I ran the script against a test (empty) database, then a copy of my live data, then a staging server before finally executing it on the live data.

Warning: Please be careful when using the script and test, test, test before committing to it!

The Script

The script can be found at Github: https://github.com/nicjansma/mysql-convert-latin1-to-utf8. If you find bugs or want to contribute changes, please head there.

Final Words

I hope what I’ve learned will be useful to others.

If you hit any problems with the conversion script, please let me know.

Android development and the Minifig Collector app

January 31st, 2011

About a year ago, I started taking a look at the Android OS as a platform for mobile application development.  I had recently purchased a Motorola Droid (on Verizon), and wanted to learn what it takes to develop an Android app.

My goal was to start out small, and learn the ins and outs of the platform.  I was also a bit rusty with Java, so I wanted to develop a small yet fun app.

Around the same time, LEGO came out with their Series 1 Collectible Minifigures (minifigs).  There are 18 minifigs to collect.  Inside each $3 bag is a single minifig, but the bags don’t say which minifig is inside.  You can feel around the contents of the bag and search for the known shapes of their accessories, but that is tedious and error-prone.  Luckily, someone noticed that each bag has two barcodes, and one of them is unique and consistent for each minifig.

So after a few weeks of coding, I released my first (free) app, called Minifig Collector on the Android marketplace:

It’s a pretty simple app, but it allows you to determine which minifig is in each bag by pointing your phone’s camera at the barcode, which is then scanned and decoded:

Once you’ve scanned the minifig, you can keep track of which ones you’ve found:

The app is pretty simple, but developing it gave me a good introduction to the platform and how the Android UI works.  I was able to utilize a free library called ZXing to handle the camera’s barcode scanning/decoding.

I’ve learned a lot about Android development along the way, and I am preparing a few follow-up posts about some of the things I’ve learned.  While the app is simple, I went a little overboard and explored unit testing, code style enforcement, continuous integration and debugging to see how they would work for future (paid) apps.  This led to a bit of head-scratching about how to do various things, so I want to share my findings with others who are interested in Android development.

The app has about 5,000 downloads in the Android marketplace, and it looks like it’s helped a few people along the way.  Considering how much I’ve learned through its development, I’m pretty happy with my time investment in it.  I’m now working on a paid app.

Unfortunately, LEGO learned from the community’s resourcefulness and removed the identifying barcodes in Series 3, though they left some identifying marks (‘bumpcodes’) on the bags so people can still (sometimes) determine which minifig is inside.  The phone’s camera can’t detect these codes, but Minifig Collector has been updated to show an image of the bumpcodes and still allows you to keep track of which Minifigs you have.

MySQL: TEXT vs. VARCHAR Performance

January 20th, 2011

Starting with MySQL 5.0.3, the maximum field length for VARCHAR fields was increased from 255 characters to 65,535 characters.  This is good news, as VARCHAR fields, as opposed to TEXT fields, are stored in-row for the MyISAM storage engine (InnoDB has different characteristics).  TEXT and BLOB fields are not stored in-row — they require a separate lookup (and a potential disk read) if their column is included in a SELECT clause.  Additionally, the inclusion of a TEXT or BLOB column in any sort will force the sort to use an on-disk temporary table, since the MEMORY (HEAP) storage engine used for in-memory temporary tables cannot store TEXT or BLOB columns.  Thus, the benefits of using a VARCHAR field instead of TEXT for columns between 255 and 65k characters seem obvious at first glance in some scenarios: potentially fewer disk reads (for queries including the column, as there is no out-of-row data) and fewer writes (for queries with sorts including the column).

Following a review of my MySQL server instance’s performance metrics (mysqltuner.sh and the MySQL tuning-primer.sh script point out interesting things), I found that approximately 33% of the server’s temporary tables used on-disk temporary tables (as opposed to in-memory tables).  On-disk temporary tables can be created for several reasons, most notably if the resulting table will be larger than the minimum of MySQL’s tmp_table_size/max_heap_table_size variables, OR when a TEXT/BLOB field is included in the sort.
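You can check the on-disk ratio on your own server from MySQL’s status counters (compare Created_tmp_disk_tables against Created_tmp_tables), along with the size limits that push a temporary table to disk:

# on-disk vs. total temporary tables created since the server started
mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'Created_tmp%tables'"

# the limits that determine when an in-memory temporary table spills to disk
mysql -u root -p -e "SHOW VARIABLES LIKE 'tmp_table_size'"
mysql -u root -p -e "SHOW VARIABLES LIKE 'max_heap_table_size'"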

Since any operation that uses the disk for sorting will be noticeably slower than using RAM, I started investigating why so many of my temporary tables went to disk.  One of my website’s most heavily used MyISAM tables contained two TEXT columns, which were textual descriptions of an object, submitted by the website’s visitors.  The table schema was created prior to MySQL v5.0.3, when VARCHAR fields were limited to 255 characters or less.  For this object, I wanted to allow descriptions larger than 255 characters, so the table utilized the two TEXT fields to store this data.

So my first thought in reducing the number of on-disk temporary tables was to convert these TEXT fields to VARCHAR.  I sampled the column’s values, and found that the two fields’ maximum input sizes were currently around 8KB.  Not thinking too much about what size I wanted to support, I decided that I could set both fields to VARCHAR(30000) instead of TEXT.  I changed the fields, verified through automated tests that everything still worked, and called it a night.

Over the next two days, I noticed that there were several alarming metrics trending the wrong way.  I utilize Cacti to monitor several hundred server health metrics — it is great for showing trends and alerting about changes.  Unfortunately, it was reporting that server load, page load time and disk activity were all up significantly — especially disk writes.  Wondering if it was a fluke, I left the server alone for another day to see if it would subside, but the high load and disk writes continued consistently.  The load was causing real user-perceived performance impacts for the website’s visitors, causing average page load time to increase from 70ms to 470ms.

Here’s what the Cacti graphs looked like:

Wouldn’t you be alarmed?

Not wanting to run an intensive performance review or diagnostics on the live website, I came up with a plan for how I would diagnose what the problem was:

  1. Enable temporary lightweight tracing on the server to try to determine the source of the increased disk activity (iotop or dstat).
  2. If MySQL is causing most of the disk activity, temporarily revert the VARCHAR(30000) columns back to TEXT, as I suspected they were somehow the cause of the slowdown.
  3. Perform a deeper analysis of the problem on a separate machine.

Running iotop on the server confirmed that a majority of the disk writes were coming from MySQL.  Additionally, after I reverted the columns to TEXT, the server load and page load times went back to normal.

So why did my seemingly obvious “performance win” end up as a performance dud?  Shame on me for not testing and verifying the changes before I pushed them live!

I didn’t want to do any more diagnosis of the problem on the live Linux server — there’s no point in punishing my visitors.  I have a separate Linux development server that I could have used to dig a little deeper, but I’m more comfortable doing performance analysis on Windows, and luckily, MySQL has Windows ports.

For almost all of my performance analysis work, I use the excellent Windows Performance Tools (WPT) Kit (using ETW and xperf).  If you haven’t used the WPT and xperf tools before, there are some good guides on trace capturing using ETW and visual analysis via xperfview on MSDN.  ETW is a lightweight run-time tracing environment for Windows.  ETW can trace both kernel (CPU, disk, network and other activity) and user mode (application) events, and saves them to a .ETL file that can later be processed.  The Windows Performance team and many other teams in Windows regularly use ETW/xperf for a majority of their performance analysis.

To figure out what’s going on with our VARCHAR columns, I first needed to ensure that I could replicate the problem on my Windows MySQL machine.  I installed MySQL 5 and loaded a snapshot of my live database into the instance.

I then looked at the MySQL Slow Query Log from the real server to see what queries were taking a long time (>1 second).  There were thousands of instances of a query that looked something like this:

SELECT   t2.*
FROM     table1 t1, table2 t2
WHERE    t2.t1id = t1.id
ORDER BY t1.id
LIMIT    0, 10

This looks innocent enough, but table2 is the table in which I changed the two TEXT fields to VARCHAR, and I’m querying all of the columns (SELECT *) from it.  Before, because of the TEXT columns, this query would’ve used an on-disk temporary table for the results (because the MySQL manual tells us this is the case for results that need temporary tables and have TEXT columns).  So why is this query appearing to be so much slower now?

First of all, I checked how this query responded on my Windows MySQL instance:

mysql> SELECT t2.*
FROM table1 t1, table2 t2
WHERE t2.t1id = t1.id
ORDER BY t1.id
LIMIT 0, 10;
...
10 rows in set (1.71 sec)

This confirmed the issue appeared on my development machine as well!  This query should be nearly instantaneous (less than 50ms), and 1,710 milliseconds is a long time to wait for a single query of many in a page load.

My guess at this point was it had something to do with temporary tables.  And disks.  Since that was what I was trying to improve with my TEXT to VARCHAR change, it only makes sense that I somehow made it worse.  So to validate this theory, I enabled a bit of lightweight ETW tracing to see how the OS handled the query.

I started ETW tracing to get basic CPU and disk IO information:

xperf -on base+fileio

Then I re-ran the above query in the MySQL command line, and saved the trace to disk (trace.etl) after the query finished:

xperf -d trace.etl

Loading the trace up in xperfview showed some interesting things:

xperfview trace.etl

From a CPU usage perspective, the CPU (on my quad-core system) seemed to be pretty active, but not 100% utilized.  Looking at the CPU Sampling Summary Table, I found that mysqld-nt.exe was using 1,830ms / 17% of my CPU (essentially 68% of one of the cores).  Not bad, but not maxed out either.  But what’s interesting here was the disk utilization graph.  For a period of ~700ms, we’re 100% utilized.  Highlighting that region and viewing the summary table showed where we spent our time:

#sql_1a00_0.MYD is a temporary table from MySQL (which can be confirmed from the File IO graph).  In this case, our single query caused 38MB of disk writes and took ~626ms to write it out and read it back in.

Huh?

At this point, I wanted to double-check that the TEXT to VARCHAR change caused this.  I updated the column back to TEXT, and re-ran the same query:

mysql> SELECT t2.*
FROM table1 t1, table2 t2
WHERE t2.t1id = t1.id
ORDER BY t1.id
LIMIT 0, 10;
...
10 rows in set (0.03 sec)

Well, 0.03 seconds is a lot faster than 1.71 seconds.  This is promising.  I took another ETW trace of the query with the TEXT field:

After switching back to TEXT fields, mysql used ~30ms of CPU and caused no disk activity.

Now that I knew what was causing the slowdown, I wanted to try to fully understand why this was the case.  Remember, I started down this path originally because I found that a high portion of my temporary tables were on-disk temporary tables.  In the interest of seeing less disk activity on my server, I attempted to change several TEXT columns (which can cause on-disk temporary tables) to VARCHAR(30000) columns.  However, I didn’t fully look into what was causing the on-disk temporary tables, and instead just guessed.  As a result, my server’s perf tanked!

Now’s a good time to review the first paragraph of this post.  There are several reasons MySQL may use an internal temporary table.  One interesting quote:

Such a [temporary] table can be held in memory and processed by the MEMORY storage engine, or stored on disk and processed by the MyISAM storage engine.

and

If an internal temporary table is created initially as an in-memory table but becomes too large, MySQL automatically converts it to an on-disk table. The maximum size for in-memory temporary tables is the minimum of the tmp_table_size and max_heap_table_size values. This differs from MEMORY tables explicitly created with CREATE TABLE: For such tables, the max_heap_table_size system variable determines how large the table is permitted to grow and there is no conversion to on-disk format.

So a temporary table can start out as a MEMORY table, then if MySQL realizes it’s too big for tmp_table_size and max_heap_table_size, it may convert it to a MyISAM table on the disk.  One caveat with the MEMORY engine is:

MEMORY tables use a fixed-length row-storage format. Variable-length types such as VARCHAR are stored using a fixed length.

How did this behavior affect me, and cause the 30ms query to take 1,710ms?  Well, let’s start with the two newly converted VARCHAR(30000) columns.  In a normal MyISAM table with dynamically sized rows, these two columns only take as much space as the data they contain (plus 1 byte).  That is, if I had a row and these two columns only had 10 bytes of data in them, the row size would be (10+1)*2+[other columns].  However, if I happened to convert this MyISAM table to use fixed-length rows, or I was using the MEMORY storage engine, the row size would be 30000*2+[other columns].  Currently, according to my dataset, these dynamically sized rows only required an average of 1,648 bytes per row.

And that’s the crux of the problem.  My query above, simple enough, requires a temporary table to do its work.  We can verify this via the MySQL EXPLAIN command:

mysql> EXPLAIN SELECT t2.*
FROM table1 t1, table2 t2
WHERE t2.t1id = t1.id
ORDER BY t1.id
LIMIT 0, 25;
+----+-------------+-------+---------+------+---------------------------------+
| id | select_type | table | key     | rows | Extra                           |
+----+-------------+-------+---------+------+---------------------------------+
|  1 | SIMPLE      | t2    | NULL    | 8813 | Using temporary; Using filesort |
|  1 | SIMPLE      | t1    | PRIMARY |    1 | Using index                     |
+----+-------------+-------+---------+------+---------------------------------+

(I trimmed a couple of columns to fit the page’s width.)

Here, we see our t2 table Using temporary.  MySQL converted the 8,813 dynamic-length rows to fixed-length rows, which expanded the VARCHARs to their full size: approximately 60,600 bytes per row.  That’s 8,813 rows * 60,600 bytes = 534,067,800 bytes to deal with!  The server’s tmp_table_size and max_heap_table_size settings decided this wasn’t good for an in-memory temporary table, so MySQL ended up moving a lot of this work to disk.  As a result, we had ~700ms of disk writes with this query when using VARCHAR(30000) columns.

There are a couple of ways to avoid this behavior in MyISAM tables:

  1. Use TEXT fields, with their known caveats.
  2. Use a smaller, more reasonable VARCHAR size.  These fields probably don’t need to hold more than 10k of data.  One could reduce their size to 10k or smaller, or even move them to another table and only retrieve them when necessary.
  3. Fiddle with the tmp_table_size and max_heap_table_size variables.  These two variables dictate which queries use on-disk temporary tables, as described here.  They are set to approximately 35MB/16MB by default (on my Windows MySQL 5.1 instance).

I made a couple of changes.  I changed one of the original 30k fields to 10k, and changed the other one to 1k.  This reduced the potential row size in MEMORY temporary tables tremendously.  I also upped the tmp_table_size and max_heap_table_size variables to 128MB on my server.  The combination of these two changes ensured that the specific query above was no longer causing all of the performance issues (for now).  I should probably move the 10k field to another table (or back to TEXT) to be sure.
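For reference, the variable half of that change is just two statements. SET GLOBAL only affects new connections, so add the same values to my.cnf to make them stick across restarts (128MB is what I used, but pick a size that fits your RAM):

# raise the in-memory temporary table limits to 128MB on the running server
mysql -u root -p -e "SET GLOBAL tmp_table_size = 134217728"
mysql -u root -p -e "SET GLOBAL max_heap_table_size = 134217728"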

I probably didn’t need to use ETW and xperf here to look into things.  Since I was aware that the changes I made to the database had a high correlation with the slow-downs I was seeing, and reverting these changes fixed the issue, I could have probably figured out what was going on by reading the MySQL manual a bit more.  But I like to dig into things, and I think xperf can help visually communicate what’s going on with your system for problems like this.  Seeing how much blocking IO a single query can cause really sheds light on things!

Admittedly, the server MySQL is running on also hosts a web server and multiple sites.  A dedicated SQL server with fast, dedicated disks would help minimize problems like this.

One interesting note is that MySQL version 5.1 on Windows doesn’t have the same IO patterns as version 5.0 did – I see the same File IO for temporary tables with the VARCHAR fields, but not the same amount of disk activity.  This could mean that 5.1 memory maps or caches a lot of the on-disk temporary file and doesn’t end up actually writing it out to disk.  I am unsure why.

And again, InnoDB has different performance characteristics.  I did not do any testing on InnoDB.

So at the end of the day, my server is back to normal, and I’ve learned a little bit more about how MySQL works.  I hope you did too.