Several years ago I wrote a small tool called ChecksumVerifier. It maintains a database of files and their checksums, and helps you verify that the files have not changed. I use it on my external hard drive backups to validate that the files are not being corrupted by bitrot or other disk corruption. At the time I created it, there were several other simple and commercial Windows apps that would do the same thing, but none that was both free and command-line based. I’ve been meaning to clean it up so I could open-source it, which I finally had the time to do last weekend.
A checksum is a short string of roughly 20-200 characters (depending on the algorithm and how the result is encoded) that is calculated by reading the input file and applying a mathematical algorithm to its contents. Even a file as large as 1TB has a checksum of only that size, so checksums are an efficient way of saving the file’s state without saving its entire contents. ChecksumVerifier uses the MD5, SHA-1, SHA-256 and SHA-512 algorithms. MD5 and SHA-1 are no longer considered collision resistant against deliberate attacks, but all four are more than adequate for detecting accidental corruption of file contents; choose SHA-256 or SHA-512 if tampering is a concern.
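To make the idea concrete, here is a minimal Python sketch of computing a file checksum. It is not ChecksumVerifier's own code; the function name `file_checksum` and the chunk size are assumptions made for illustration. It reads the file in chunks, so the memory used stays small no matter how large the file is.

```python
import hashlib


def file_checksum(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks.

    Hypothetical helper for illustration; not part of ChecksumVerifier.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Length of the hexadecimal digest for each algorithm the post mentions.
DIGEST_LENGTHS = {
    name: len(hashlib.new(name).hexdigest())
    for name in ("md5", "sha1", "sha256", "sha512")
}
```

Written in hexadecimal, the digests are 32 characters for MD5, 40 for SHA-1, 64 for SHA-256 and 128 for SHA-512, regardless of the size of the input file.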
One example usage of ChecksumVerifier is to verify the integrity of external hard drive backups. After saving files to an external disk, you can run ChecksumVerifier -update
to calculate the checksums of all of the files on the external disk. At a later date, if you want to validate that the files on the disk have not been added, removed or changed, you can run ChecksumVerifier -verify
and it will re-calculate the checksums of all of the files on the disk and compare them to the original database to see if any files have been changed in any way.
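The update-then-verify workflow can be sketched in a few lines of Python. This is only an illustration of the idea, not how ChecksumVerifier is implemented: the names `scan` and `verify` are hypothetical, the "database" is a plain in-memory dictionary rather than the tool's XML file, and MD5 with relative paths is assumed because those are the tool's documented defaults.

```python
import hashlib
import os


def checksum(path):
    """MD5 hex digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(base_path):
    """The 'update' step: map each relative file path to its checksum."""
    database = {}
    for folder, _dirs, files in os.walk(base_path):
        for name in files:
            full_path = os.path.join(folder, name)
            database[os.path.relpath(full_path, base_path)] = checksum(full_path)
    return database


def verify(database, base_path):
    """The 'verify' step: re-scan and report (changed, missing, new) paths."""
    current = scan(base_path)
    changed = sorted(p for p in database if p in current and current[p] != database[p])
    missing = sorted(p for p in database if p not in current)
    new = sorted(p for p in current if p not in database)
    return changed, missing, new
```

Calling `scan` right after a backup and `verify` at a later date gives three lists: files whose contents differ, files that have disappeared, and files that were not there before.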
ChecksumVerifier is pretty flexible and has several command line options:
Usage: ChecksumVerifier.exe [-update | -verify] -db [xml file] [options]

actions:
    -update:             Update checksum database
    -verify:             Verify checksum database

required:
    -db [xml file]       XML database file

options:
    -match [match]       Files to match (glob pattern such as * or *.jpg or ??.foo) (default: *)
    -exclude [match]     Files to exclude (glob pattern such as * or *.jpg or ??.foo) (default: empty)
    -basePath [path]     Base path for matching (default: current directory)
    -r, -recurse         Recurse (directories only, default: off)

path storage options:
    -relativePath        Relative path (default)
    -fullPath            Full path
    -fullPathNodrive     Full path - no drive letter

checksum options:
    -md5                 MD5 (default)
    -sha1                SHA-1
    -sha256              SHA-2 256 bits
    -sha512              SHA-2 512 bits

-verify options:
    -ignoreMissing       Ignore missing files (default: off)
    -showNew             Show new files (default: off)
    -ignoreChecksum      Don't calculate checksum (default: off)

-update options:
    -removeMissing       Remove missing files (default: off)
    -ignoreNew           Don't add new files (default: off)
    -pretend             Show what would happen - don't write out XML (default: off)
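As a rough illustration of how the -match and -exclude glob patterns might combine, here is a small Python sketch. The function `select_files` is hypothetical, and it relies on Python's `fnmatch` rules (which also accept `[abc]` character sets), so ChecksumVerifier's own matching may differ in its details.

```python
from fnmatch import fnmatchcase


def select_files(names, match="*", exclude=""):
    """Keep names that fit the match pattern and do not fit the exclude pattern.

    Assumed semantics for illustration: an empty exclude pattern excludes nothing.
    """
    selected = []
    for name in names:
        if not fnmatchcase(name, match):
            continue  # does not fit the -match style pattern
        if exclude and fnmatchcase(name, exclude):
            continue  # fits the -exclude style pattern
        selected.append(name)
    return selected
```

Under these assumptions, `*.jpg` keeps only names ending in .jpg, and `??.foo` keeps `ab.foo` but not `abc.foo`, since each `?` stands for exactly one character.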
ChecksumVerifier is free, open-source and available on GitHub.