ChecksumVerifier – A Windows Command-Line Tool to Verify the Integrity of your Files

February 4th, 2014

Several years ago I wrote a small tool called ChecksumVerifier. It maintains a database of files and their checksums, and helps you verify that the files have not changed. I use it on my external hard drive backups to validate that the files are not being corrupted due to bitrot or other disk corruption.  At the time I created it, there were several other simple and commercial Windows apps that would do the same thing, but nothing was free and command-line based.  I’ve been meaning to clean it up so I could open-source it, which I finally had the time to do last weekend.

A checksum is a short string (when written in hexadecimal, 32 characters for MD5 and up to 128 for SHA-512) that is calculated by reading the input file and applying a mathematical algorithm to its contents. Even a file as large as 1TB has a checksum of that same small size, so checksums are an efficient way of saving the file’s state without saving its entire contents. ChecksumVerifier uses the MD5, SHA-1, SHA-256 and SHA-512 algorithms, which are generally collision resistant enough for detecting accidental changes to file contents (MD5 and SHA-1 are no longer considered safe against deliberately crafted collisions).
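To make the idea concrete, here is a minimal Python sketch of calculating a file checksum. It is not how ChecksumVerifier itself is implemented; the function name file_checksum and the 1 MB chunk size are assumptions made for illustration. It reads the file in chunks, so even a very large file never has to fit in memory.

```python
import hashlib


def file_checksum(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hexadecimal checksum of the file at `path`.

    `algorithm` is any name hashlib accepts, e.g. "md5", "sha1",
    "sha256" or "sha512". The file is read in chunks of `chunk_size`
    bytes so memory use stays constant regardless of file size.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling file_checksum("photo.jpg", "sha256") on a hypothetical photo.jpg would return a 64-character hexadecimal string, no matter how large the file is.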

One example usage of ChecksumVerifier is to verify the integrity of external hard drive backups. After saving files to an external disk, you can run ChecksumVerifier -update to calculate the checksums of all of the files on the external disk. At a later date, if you want to validate that the files on the disk have not been added, removed or changed, you can run ChecksumVerifier -verify and it will re-calculate the checksums of all of the files on the disk and compare them to the original database to see if any files have been changed in any way.
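The update-then-verify workflow can be sketched in a few lines of Python. This is only an illustration of the idea, assuming MD5 and an in-memory dictionary in place of the tool's XML database; the names scan and verify are hypothetical and are not ChecksumVerifier's API.

```python
import hashlib
import os


def checksum(path):
    """Return the MD5 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(base_path):
    """The 'update' step: map each relative file path to its checksum."""
    database = {}
    for folder, _dirs, files in os.walk(base_path):
        for name in files:
            full_path = os.path.join(folder, name)
            database[os.path.relpath(full_path, base_path)] = checksum(full_path)
    return database


def verify(database, base_path):
    """The 'verify' step: re-scan and report missing, new and changed files."""
    current = scan(base_path)
    return {
        "missing": sorted(set(database) - set(current)),
        "new": sorted(set(current) - set(database)),
        "changed": sorted(
            path for path in database
            if path in current and database[path] != current[path]
        ),
    }
```

Saving the result of scan() right after a backup and passing it to verify() months later gives three lists; all three are empty if nothing on the disk has been added, removed or changed.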

ChecksumVerifier is pretty flexible and has several command line options:

Usage: ChecksumVerifier.exe [-update | -verify] -db [xml file] [options]

actions:
     -update:                Update checksum database
     -verify:                Verify checksum database

required:
     -db [xml file]          XML database file

options:
     -match [match]          Files to match (glob pattern such as * or *.jpg or ??.foo) (default: *)
     -exclude [match]        Files to exclude (glob pattern such as * or *.jpg or ??.foo) (default: empty)
     -basePath [path]        Base path for matching (default: current directory)
     -r, -recurse            Recurse (directories only, default: off)

path storage options:
     -relativePath           Relative path (default)
     -fullPath               Full path
     -fullPathNodrive        Full path - no drive letter

checksum options:
     -md5                    MD5 (default)
     -sha1                   SHA-1
     -sha256                 SHA-2 256 bits
     -sha512                 SHA-2 512 bits

-verify options:
     -ignoreMissing          Ignore missing files (default: off)
     -showNew                Show new files (default: off)
     -ignoreChecksum         Don't calculate checksum (default: off)

-update options:
     -removeMissing          Remove missing files (default: off)
     -ignoreNew              Don't add new files (default: off)
     -pretend                Show what would happen - don't write out XML (default: off)
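For readers unfamiliar with glob patterns, the sketch below shows how patterns such as *.jpg or ??.foo select file names. It uses Python's fnmatch module and assumes that an exclude pattern takes priority over a match pattern; that ordering, and the function name select_files, are assumptions for illustration rather than documented ChecksumVerifier behavior.

```python
from fnmatch import fnmatch


def select_files(names, match="*", exclude=None):
    """Return the names that fit the `match` glob but not the `exclude` glob.

    `*` matches any run of characters and `?` matches exactly one.
    Note that fnmatch follows the operating system's case rules.
    """
    selected = []
    for name in names:
        if not fnmatch(name, match):
            continue  # does not fit the match pattern
        if exclude and fnmatch(name, exclude):
            continue  # explicitly excluded
        selected.append(name)
    return selected
```

With this sketch, the pattern ??.foo selects ab.foo but not abc.foo, because each ? stands for exactly one character.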

ChecksumVerifier is free, open-source and available on GitHub.

Minifig Collector v11.0

October 8th, 2013

My original Minifig Collector app (the first Android app I ever created), which has seen over 150,000 installs, just got a major facelift and some new features in version 11.0. It now has a more modern-looking UI, can import/export your figures to Brickset, and lets you finger-swipe back and forth. Check it out!

Screenshots:

main list
browse brickset

Unofficial LEGO® Minifigure Catalog v2.0

September 13th, 2013

Over the past few weeks I’ve been working on a new version 2.0 of the Unofficial LEGO® Minifigure Catalog app. We’ve just released version 2.0 to the Apple iTunes and Google Play app stores.

Version 2.0 introduces tablet support along with a complete visual facelift. In addition, there are several performance improvements that make the app much faster when browsing, and I’ve added the ability to browse minifigures, sets and heads by name (in addition to by year and theme).

Check it out!

Note: LEGO® is a trademark of the LEGO Group of companies which does not sponsor, authorize or endorse this app.

SaltThePass mobile app now available on iTunes, Google Play and Amazon

July 31st, 2013

A few months ago I released SaltThePass.com, which is a password generator that will help you generate unique, secure passwords for all of the websites you visit based on a single Master Password that you remember.

I’ve been working on a mobile / offline iOS and Android app that gives you all of the features of the saltthepass.com website.  The apps are now in the Apple iTunes App Store, Google Play App Store and the Amazon Appstore.

Let me know if you use them!
iPad, iPhone and Android apps available

How to deal with a WordPress wp-comments-post.php SPAM attack

May 9th, 2013

This morning I woke up to several website monitoring alarms going off.  My websites were becoming intermittently unavailable due to extremely high server load (>190).  It appears nicj.net had been under a WordPress comment-SPAM attack from thousands of IP addresses overnight.  After a few hours of investigation, configuration changes and cleanup, I think I’ve resolved the issue.  I’m still under attack, but the changes I’ve made have removed all of the comment SPAM and have reduced the server load back to normal.

Below is a chronicle of how I investigated the problem, how I cleaned up the SPAM, and how I’m preventing it from happening again.

Investigation

The first thing I do when website monitoring alarms are going off (I use Pingdom and Cacti) is to log into the server and check its load.  Load is an indicator of how busy your server is.  Anything greater than the number of CPUs on your server is cause for alarm.  My load is usually around 2.0 — when I logged in, it was 196:

[nicjansma@server3 ~]$ uptime
06:09:48 up 104 days, 11:25,  1 user,  load average: 196.32, 167.75, 156.40
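The rule of thumb above (load higher than the number of CPUs is cause for alarm) is easy to script. Here is a minimal Python sketch; the function name load_is_alarming is hypothetical, and os.getloadavg() is only available on Unix-like systems.

```python
import os


def load_is_alarming(one_minute_load, cpu_count):
    """Apply the rule of thumb: load above the CPU count is cause for alarm."""
    return one_minute_load > cpu_count


if __name__ == "__main__":
    # os.getloadavg() returns the same 1, 5 and 15 minute averages as uptime.
    load1, load5, load15 = os.getloadavg()
    cpus = os.cpu_count() or 1  # cpu_count() can return None
    status = "ALARM" if load_is_alarming(load1, cpus) else "ok"
    print("load {:.2f} on {} CPUs: {}".format(load1, cpus, status))
```

A load of 196 trips this check on any ordinary server, while a load of 2.0 on a machine with more than two CPUs does not.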

Next, I checked top and found that mysqld was likely the cause of the high load because it was using 200-1000% of the CPU:

top - 06:16:45 up 104 days, 11:32, 2 users, load average: 97.69, 162.31, 161.74
Tasks: 597 total, 1 running, 596 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.8%us, 19.1%sy, 0.0%ni, 10.7%id, 66.2%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 12186928k total, 12069408k used, 117520k free, 5868k buffers
Swap: 4194296k total, 2691868k used, 1502428k free, 3894808k cached

PID   USER  PR NI VIRT RES  SHR  S %CPU  %MEM TIME+ COMMAND
24846 mysql 20 0 26.6g 6.0g 2.6g S 260.6 51.8 18285:17 mysqld

Using SHOW PROCESSLIST in MySQL (via phpMyAdmin), I saw about 100 processes working on the wp_comments table in the nicj.net WordPress database.

I was already starting to guess that I was under some sort of WordPress comment SPAM attack, so I checked out my Apache access_log and found nearly 800,000 POSTs to wp-comments-post.php since yesterday.  They all looked a bit like this:

[nicjansma@server3 ~]$ grep POST access_log
36.248.44.7 - - [09/May/2013:06:07:29 -0700] "POST /wp-comments-post.php HTTP/1.1" 302 20 "http://nicj.net/2009/04/01/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;)"

What’s worse, the SPAMs were coming from over 3,000 unique IP addresses.  Essentially, it was a distributed denial of service (DDoS) attack:

[nicjansma@server3 ~]$ grep POST access_log | awk '{print $1}' | sort | uniq -c | wc -l
3105
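The same count can be produced in Python, which also makes it easy to see how many POSTs each IP sent. This is a sketch that assumes Apache's combined log format, as in the sample line above; the function name count_comment_posts is hypothetical. Unlike grep POST, it only counts requests whose method is POST and whose path is wp-comments-post.php.

```python
from collections import Counter


def count_comment_posts(lines, target="/wp-comments-post.php"):
    """Count POSTs to `target` per client IP in Apache access log lines."""
    counts = Counter()
    for line in lines:
        # The request ("POST /path HTTP/1.1") is the first quoted field.
        parts = line.split('"')
        if len(parts) < 2:
            continue
        request = parts[1].split()
        if len(request) >= 2 and request[0] == "POST" and request[1].startswith(target):
            client_ip = line.split(None, 1)[0]  # first field on the line
            counts[client_ip] += 1
    return counts


if __name__ == "__main__":
    with open("access_log") as log:
        counts = count_comment_posts(log)
    print(len(counts), "unique IPs")
    for ip, hits in counts.most_common(10):
        print(hits, ip)
```

The length of the returned Counter is the number of unique IP addresses, and most_common() lists the worst offenders first.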

NicJ.net was getting hundreds of thousands of POSTs to wp-comments-post.php, which was causing Apache and MySQL to do a whole lot of work checking them against Akismet for SPAM and saving them in the WordPress database.  I logged into the WordPress Admin interface, which verified the problem as well:

There are 809,345 comments in your spam queue right now.

Yikes!

Stopping the Attack

First things first: if you’re under an attack like this, the quickest way to stop it is to disable comments on your WordPress site.  There are a few ways of doing this.

One way is to go into Settings > Discussion and un-check Allow people to post comments on new articles.

The second way is to rename wp-comments-post.php, which is the file spammers POST to directly to add comments to your blog.  I temporarily renamed my file to wp-comments-post.php.bak so I could change it back later.  In addition, I created a 0-byte placeholder file called wp-comments-post.php so the POSTs will look to the spammers like they succeeded, and because serving the 0-byte file takes up fewer server resources than serving a 404 page:

[nicjansma@server3 ~]$ mv wp-comments-post.php wp-comments-post.php.bak && touch wp-comments-post.php

Either of these methods should stop the SPAM attack immediately.  5 minutes after I did this, my server load was back down to ~2.0.

Now that the spammers are essentially POSTing data to your blank wp-comments-post.php file, new comments shouldn’t be appearing in your blog.  While this reduces the overhead of the SPAM attack, the spammers are still consuming your bandwidth and web server connections with their POSTs.  To stop their packets from ever reaching your web server, you can automatically drop packets from IPs that post repeatedly to wp-comments-post.php.  This is easily done with a simple script like the one in my Autoban Website Spammers via the Apache Access log post.  Change THRESHOLD to something small like 10 and SEARCHTERM to wp-comments-post.php, and you will be automatically dropping packets from IPs that try to post more than 10 comments a day.
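The thresholding idea can be sketched in Python as follows. This is not the script from the Autoban post; the function names are hypothetical, and the iptables rule shown is an assumption about how the packets get dropped. The sketch only builds the commands as strings and never runs them, so you can review the list before applying anything.

```python
import ipaddress
from collections import Counter


def ips_over_threshold(ips, threshold=10):
    """Return the IPs that appear more than `threshold` times.

    `ips` holds one entry per POST, e.g. the first field of each
    matching access log line.
    """
    counts = Counter(ips)
    return sorted(ip for ip, hits in counts.items() if hits > threshold)


def drop_commands(ips):
    """Build (but do not run) one iptables DROP rule per offending IP."""
    commands = []
    for ip in ips:
        # ip_address() raises ValueError on anything that is not an IP,
        # so log garbage can never end up inside a shell command.
        address = ipaddress.ip_address(ip)
        commands.append("iptables -I INPUT -s {} -j DROP".format(address))
    return commands
```

With a threshold of 10, an IP that has posted 11 times is returned and an IP that has posted exactly 10 times is not, which matches "more than 10 comments a day".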

Cleaning up the Mess

At this point, I still had 800,000+ SPAMs in my WordPress moderation queue.  I feel bad for Akismet: it actually classified them all!

I tried removing the SPAM comments by going to Comments > Spam > Empty Spam, but I think it was too much for Apache to handle and it crashed.  Time to remove them from MySQL instead!

Via phpMyAdmin, I found that not only were there 800,000+ SPAMs in the database, but the wp_comments table was over 3.6 GB and the wp_commentmeta table was at 8.1 GB!

Here’s how to remove all comments marked as SPAM from the wp_comments table:

DELETE FROM wp_comments WHERE comment_approved = 'spam';

OPTIMIZE TABLE wp_comments;

In addition to the wp_comments table, the wp_commentmeta table has metadata about all of the comments. You can safely remove any comment metadata for comments that are no longer there:

DELETE FROM wp_commentmeta WHERE comment_id NOT IN (SELECT comment_id FROM wp_comments);

OPTIMIZE TABLE wp_commentmeta;

For me, this removed 800,000+ rows of wp_comments (bringing it down from 3.6 GB to just 207 KB) and 2,395,512 rows of wp_commentmeta (bringing it down from 8.1 GB to just 136 KB).
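If you want to convince yourself of what the two DELETE statements do before running them on a live database, you can try them against a throwaway copy. The Python sketch below uses an in-memory SQLite database as a stand-in for MySQL, with a cut-down two-column version of each table; the sample rows and the function names are made up for illustration. OPTIMIZE TABLE is MySQL-specific and is left out.

```python
import sqlite3


def purge_spam(conn):
    """Delete spam comments, then any metadata whose comment no longer exists."""
    conn.execute("DELETE FROM wp_comments WHERE comment_approved = 'spam'")
    conn.execute(
        "DELETE FROM wp_commentmeta "
        "WHERE comment_id NOT IN (SELECT comment_id FROM wp_comments)"
    )
    conn.commit()


def demo():
    """Build a tiny stand-in database, purge it, and return remaining row counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wp_comments (comment_ID INTEGER PRIMARY KEY, comment_approved TEXT)"
    )
    conn.execute(
        "CREATE TABLE wp_commentmeta (meta_id INTEGER PRIMARY KEY, comment_id INTEGER)"
    )
    # One approved comment (status '1') and two spam comments.
    conn.executemany(
        "INSERT INTO wp_comments VALUES (?, ?)", [(1, "1"), (2, "spam"), (3, "spam")]
    )
    # One metadata row for the approved comment, three for the spam.
    conn.executemany(
        "INSERT INTO wp_commentmeta VALUES (?, ?)", [(10, 1), (11, 2), (12, 3), (13, 3)]
    )
    purge_spam(conn)
    comments = conn.execute("SELECT COUNT(*) FROM wp_comments").fetchone()[0]
    metadata = conn.execute("SELECT COUNT(*) FROM wp_commentmeta").fetchone()[0]
    conn.close()
    return comments, metadata
```

After the purge, only the approved comment and its single metadata row remain; the orphaned metadata rows are removed along with the spam.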

Preventing Future Attacks

There are a few preventative measures you can take to stop SPAM attacks like these.

NOTE: Remember to rename wp-comments-post.php.bak back to wp-comments-post.php (or turn Comments back on) once you’re happy with the prevention techniques you’re using.

  1. Disable Comments on your blog entirely (Settings > Discussion, then un-check Allow people to post comments on new articles). This is probably not desirable for most people.
  2. Turn off Comments for older posts (spammers seem to target older posts that rank higher in search results). Here’s a way to disable comments automatically after 30 days.
  3. Rename wp-comments-post.php to something else, such as my-comments-post.php. Comment spammers often just assume your code is at the wp-comments-post.php URL and won’t check your site’s HTML to verify this is the case. If you rename wp-comments-post.php and change all occurrences of that URL in your theme, your site should continue to work while the spammers hit a bogus URL. You can follow this renaming guide for more details.
  4. Enable a Captcha for your comments so automated bots are less likely to be able to SPAM your blog. I’ve had great success with Are You A Human.
  5. The Autoban Website Spammers via the Apache Access log post describes my method for automatically dropping packets from bad citizen IP addresses.

After all of these changes, my server load is back to normal and I’m not getting any new SPAM comments.  The DDoS is still hitting my server, but my script, which runs every 10 minutes, is gradually dropping packets from more and more of the attacking IP addresses.

Hopefully these steps can help others out there.  Good luck! Fighting spammers is a never-ending battle!