Archive

Archive for the ‘Tech’ Category

adblock-detector.js

March 22nd, 2014

I run advertising on several of my websites, mostly through Google AdSense. My sites are free communities that don’t otherwise sell products, so advertising is the main way I cover operational expenses. AdSense has been a great partner over the years and the ads they serve aren’t too obtrusive.

However, I realize that many people see all advertising as annoying, and some run ad-blockers in their browser to filter out ads. AdBlock Plus and others are becoming more popular every year.

Since advertising is such an important part of my business, I wanted to try to quantify what percentage of ads were being hidden by my visitor’s ad-blockers. I did a bit of testing to determine how to detect if my ads were being blocked, then ran an experiment on two of my sites. The first site, with a travel focus, saw approximately 9.4% of ads being blocked by visitors. The second site, with a gaming focus, had over 26% of ads blocked.  The industry average is around 23%.

While the ad-block rates are fairly high, I’m honestly not upset or surprised by the results. Generally, people that have an ad-blocker installed won’t be the kind of audience that is likely to click on an ad. In fact, I often run an ad-blocker myself. However, knowing which visitors have blocked the ads gives me an important metric to track in my analytics. It also offers me the opportunity to give one last plea to the visitor by subtly asking them to support the site via donations if they visit often.

What I don’t want to do is annoy any visitors that are using ad-blockers with my plea, but I do think there’s an opportunity, if you’re respectful with your request, to gently suggest to the visitor an alternate method of supporting the site. Below are screenshots of what sarna.net looks like if you visit with an ad-blocker installed.

sarna-ad-blocker-full

Zoomed in, you can see I provide the visitor alternate means of supporting the site, as well as a way to disable the message for 100 days if they find it annoying:

sarna-ad-blocker-zoomed

Since this prompt is text-only and a muted color, I feel that it is an unobtrusive, respectful way of reaching out to the visitor.  So far, I haven’t had any complaints about the new prompt — and I’ve had a few donations as well.  A very small percentage click on the “hide this message…” link.

The logic to detect ad-blocking is fairly straightforward, though there are a few caveats when detecting cross-browser.  Other sites might find it useful, so I’ve packaged it up into a new module called adblock-detector.js.  I’ve only tested it in a limited environment (IE, Chrome and Firefox with AdBlock Plus), so I’m looking for help from others that can test other browsers, browser versions, ad-blockers and ad publishers.

You can use adblock-detector.js to collect metrics on your ad-block rate, or to appeal to your visitors as I’m doing.  I provide examples for both in the repository.

Please use the knowledge gained for good (eg. analytics, subtle prompts), not evil (eg. more ads).

If you want a fully-baked solution, I would also recommend PageFair, which can help you track your ad-block rate, and more.

adblock-detector.js is free, open-source, and available on Github

Using Phing for Fun and Profit

February 12th, 2014

I gave a small presentation on using Phing as a PHP build system for GrPhpDev on 2014-02-11. It’s available on SlideShare:

Using Phing for Fun and Profit

The presentation and examples are also available on Github.

ChecksumVerifier – A Windows Command-Line Tool to Verify the Integrity of your Files

February 4th, 2014

Several years ago I wrote a small tool called ChecksumVerifier. It maintains a database of files and their checksums, and helps you verify that the files have not changed. I use it on my external hard drive backups to validate that the files are not being corrupted due to bitrot or other disk corruption.  At the time I created it, there were several other simple and commercial Windows apps that would do the same thing, but nothing was free and command-line based.  I’ve been meaning to clean it up so I could open-source it, which I finally had the time to do last weekend.

A checksum is a small sequence of 20-200 characters (depending on the specific algorithm used) that is calculated by reading the input file and applying a mathematical algorithm to its contents. Even a file as large as 1TB will only have a small 20-200 character checksum, so checksums are an efficient way of saving the file’s state without saving it’s entire contents. ChecksumVerifier uses the MD5, SHA-1, SHA-256 and SHA-512 algorithms, which are generally collision resistant enough for validating the integrity of file contents.

One example usage of ChecksumVerifier is to verify the integrity of external hard drive backups. After saving files to an external disk, you can run ChecksumVerifier -update to calculate the checksums of all of the files on the external disk. At a later date, if you want to validate that the files on the disk have not been added, removed or changed, you can run ChecksumVerifier -verify and it will re-calculate all of the disks’ checksums and compare them to the original database to see if any files have been changed in any way.

ChecksumVerifier is pretty flexible and has several command line options:

Usage: ChecksumVerifier.exe [-update | -verify] -db [xml file] [options]

actions:
     -update:                Update checksum database
     -verify:                Verify checksum database

required:
     -db [xml file]          XML database file

options:
     -match [match]          Files to match (glob pattern such as * or *.jpg or ??.foo) (default: *)
     -exclude [match]        Files to exclude (glob pattern such as * or *.jpg or ??.foo) (default: empty)
     -basePath [path]        Base path for matching (default: current directory)
     -r, -recurse            Recurse (directories only, default: off)

path storage options:
     -relativePath           Relative path (default)
     -fullPath               Full path
     -fullPathNodrive        Full path - no drive letter

checksum options:
     -md5                    MD5 (default)
     -sha1                   SHA-1
     -sha256                 SHA-2 256 bits
     -sha512                 SHA-2 512 bits

-verify options:
     -ignoreMissing          Ignore missing files (default: off)
     -showNew                Show new files (default: off)
     -ignoreChecksum         Don't calculate checksum (default: off)

-update options:
     -removeMissing          Remove missing files (default: off)
     -ignoreNew              Don't add new files (default: off)
     -pretend                Show what would happen - don't write out XML (default: off)

ChecksumVerifier is free, open-source and available on github.

breakup.js

April 14th, 2013

It’s not you, it’s me.

A few months ago I released a small JavaScript micro-framework: breakup.js

Serially enumerating over a collection (such as using async.forEachSeries()in Node.js or jQuery.each() in the browser) can lead to performance and responsiveness issues if processing or looping through the collection takes too long. In some browsers, enumerating over a large number of elements (or doing a lot of work on each element) may cause the browser to become unresponsive, and possibly prompt the user to stop running the script.

breakup.js helps solve this problem by breaking up the enumeration into time-based chunks, and yielding to the environment if a threshold of time has passed before continuing.  This will help avoid a Long Running Script dialog in browsers as they are given a chance to update their UI.  It is meant to be a simple, drop-in replacement for async.forEachSeries().  It also provides breakup.each() as a replacement for jQuery.each() (though the developer may have to modify code-flow to deal with the asynchronous nature of breakup.js).

breakup.js does this by keeping track of how much time the enumeration has taken after processing each item.  If the enumeration time has passed a threshold (the default is 50ms, but this can be customized), the enumeration will yield before resuming.  Yielding can be done immediately in environments that support it (such as process.nextTick() in Node.js and setImmediate() in modern browsers), and will fallback to a setTimeout(..., 4) in older browsers.  This yield will allow the environment to do any UI and other processing work it wants to do.  In browsers, this will help reduce the chance of a Long Running Script dialog.

breakup.js is primarily meant to be used in a browser environment, as Node.js code is already asynchronously driven. You won’t see a Long Running Script dialog in Node.js. However, you’re welcome to use the breakup Node.js module if you want have more control over how much  time your enumerations take.  For example, if you have thousands of items to enumerate and you want to process them lazily, you could set the threshold to 100ms with a 10000ms wait time and specify the forceYield parameter, so other work is prioritized.

Check it out on Github or via npm.

SaltThePass.com

March 26th, 2013

As many geeks do, I have a collection of about 30-odd domain names that I’ve purchased over the past few years for awesome-at-the-time ideas that I just never found the time to work on.

Last month, I resolved stop collecting these domains and instead make some visible progress on them, one at a time.

SaltThePass is my first project.  Do you have an account on LinkedIn, Evernote, or Yahoo?  All of these sites had password breaches in the last year that compromised their user’s logins and passwords.  One big problem people face today is managing all of the passwords they use for all of the sites that they visit.  People often re-use the same password on many sites because it would be impossible to remember hundreds of different passwords.  Unfortunately, this means that if a single site is hacked and your password is revealed, the attacker may have access to your account on all of the other sites you visit.

To help solve this problem, I created SaltThePass.com.  Salt The Pass is a password generator that will help you generate unique, secure passwords for all of the websites you visit based on a single Master Password that you remember.  You don’t need to install any additional software, and you can access your passwords from anywhere you have internet access.

Check it out at https://saltthepass.com and let me know what you think!

Using Modern Browser APIs to Improve the Performance of Your Web Applications

February 26th, 2013

Last night I gave a short presentation on Using Modern Browser APIs to Improve the Performance of Your Web Applications at GrWebDev.

It’s available on SlideShare:

Usinng Modern Browser APIs SlideShare deck

Two other presentations I gave late last year are available here as well:

Switch your HTPC back to Media Center after logging out of Remote Desktop

May 23rd, 2012

I have a Windows 7 Media Center PC hooked up to the TV in our living room.  It’s paired to a 4-stream Ceton CableCard adapter and is great for watching both TV and movies.

Sometimes I need to Remote Desktop (RDP) into the machine to install updates or make other changes.  During this, and after logging out, the Media Center PC is left at the login-screen on the TV.  So the next time I sit down to watch TV, I have to find the wireless keyboard and enter my password to log back in.

Since this can get annoying, I’ve created a small script on the desktop that automatically switches the console session (what’s shown on the TV) back to the primary user and re-starts Media Center.  This way, the next person that uses the TV doesn’t have to log back in.  When I’m done in the RDP session, I simply start the batch script and it logs me out of RDP and logs the TV back in.

Here’s the simple script:

call %windir%\system32\tscon.exe 1 /dest:console
start "Media Center" /max %windir%\ehome\ehshell.exe
exit /b 1

PngOutBatch: Optimize your PNGs by running PngOut multiple times

May 15th, 2012

PngOut is a command-line tool that can losslessly reduce the file size of your PNGs. In many cases, it can reduce the size of a PNG by 10-15%. I’ve even seen some cases where it was able to reduce the file size by over 50%.

There are several other PNG compression utilties out there, such as pngcrush and AdvanceCOMP, but I’ve found PngOut to be the best optimizer most of the time.

There’s an excellent tutorial on PngOut for first-timers.  Running PngOut is pretty easy, simply run it once agaist your PNG:

PngOut.exe [image.png]

However, to get the best optimization of your images, you can run PngOut multiple times with different block sizes (eg, /b1024) and randomized initial tables (/r).

There’s a commercial program, PngOutWin that can run through all of the block sizes using multiple CPU cores, but I wanted something free that I could run from the command line.

To aid in this, I created a simple DOS batch script that runs PngOut through 9 different block sizes (from 0 to 8192), with each block size run multiple times with random initial tables.

While the first iteration of PngOut does all of the heavy lifting, I’ve sometimes found that using the different block sizes can eek out a few extra bytes (sometimes 100-bytes or more than the initial pass).  You may not care about optimizing your PNG to the absolute last byte possible, but I try to run any new PNGs ready for production in my websites and mobile apps through this batch script before they’re committed to the wild.

Running PngOutBatch is as easy as running PngOut:

PngOutBatch.cmd [image.png] [number of iterations per block size - defaults to 5]

PngOutBatch will show progress as it reduces the file size.  Here’s a sample compressing the PNG logo from libpng.org:

Blocksize: 0
Iteration #1: Saved 2529 bytes
Iteration #2: No savings
Iteration #3: No savings
Iteration #4: No savings
Iteration #5: No savings
Blocksize: 128
Iteration #1: Saved 606 bytes
Iteration #2: Saved 10 bytes
Iteration #3: No savings
Iteration #4: Saved 2 bytes
Iteration #5: No savings
Blocksize: 192
Iteration #1: No savings
Iteration #2: No savings
Iteration #3: No savings
Iteration #4: No savings
Iteration #5: No savings
Blocksize: 256
Iteration #1: Saved 1 bytes
Iteration #2: No savings
Iteration #3: Saved 5 bytes
Iteration #4: Saved 11 bytes
Iteration #5: No savings
Blocksize: 512
Iteration #1: No savings
Iteration #2: No savings
Iteration #3: No savings
Iteration #4: No savings
Iteration #5: No savings
Blocksize: 1024
Iteration #1: No savings
Iteration #2: No savings
Iteration #3: No savings
Iteration #4: No savings
Iteration #5: No savings
Blocksize: 2048
Iteration #1: No savings
Iteration #2: No savings
Iteration #3: No savings
Iteration #4: No savings
Iteration #5: No savings
Blocksize: 4096
Iteration #1: No savings
Iteration #2: No savings
Iteration #3: No savings
Iteration #4: No savings
Iteration #5: No savings
Blocksize: 8192
Iteration #1: No savings
Iteration #2: No savings
Iteration #3: No savings
Iteration #4: No savings
Iteration #5: No savings
D:\temp\test.png: SUCCESS: 17260 bytes originally, 14096 bytes final: 3164 bytes saved

The first block size (0) reduced the file by 2529 bytes, then the 128-byte block size further reduced it by 606, 10 then 2 bytes. The 192-byte block size didn’t help, but a 256-byte block size reduced the file size by 1, 5 then 11 more bytes.  Larger block sizes didn’t help, but at the end of the day we reduced the PNG by 3164 bytes (18%), and 635 bytes (25% more) than if we had only run it once.

The PngOutBatch.cmd script is hosted at Gist.Github if you want to use it or contribute changes.

DIY Cloud Backup using Amazon EC2 and EBS

February 20th, 2012

I’ve created a small set of scripts that allows you to use Amazon Web Services to backup files to your own personal “cloud”. It’s available at GitHub for you to download or fork.

Features

  • Uses rsync over ssh to securely backup your Windows machines to Amazon’s EC2 (Elastic Compute Cloud) cloud, with persistent storage provided by Amazon EBS (Elastic Block Store)
  • Rsync efficiently mirrors your data to the cloud by only transmitting changed deltas, not entire files
  • An Amazon EC2 instance is used as a temporary server inside Amazon’s data center to backup your files, and it is only running while you are actively performing the rsync
  • An Amazon EBS volume holds your backup and is only attached during the rsync, though you could attach it to any other EC2 instance later for data retrieval, or snapshot it to S3 for point-in-time backup

Introduction

There are several online backup services available, from Mozy to Carbonite to Dropbox. They all provide various levels of backup services for little or no cost. They usually require you to run one of their apps on your machine, which backs up your files periodically to their “cloud” of storage.

While these services may suffice for the majority of people, you may wish to take a little more control of your backup process. For example, you are trusting their client app to do the right thing, and for your files to be stored securely in their data centers. They may also put limits on the rate they upload your backups, change their cost, or even go out of business.

On the other hand, one of the simplest tools to backup files is a program called rsync, which has been around for a long time. It efficiently transfers files over a network, and can be used to only transfer the parts of a file that have changed since the last sync. Rsync can be run on Linux or Windows machines through Cygwin. It can be run over SSH, so backups are performed with encryption. The problem is you need a Linux rsync server somewhere as the remote backup destination.

Instead of relying on one of the commercial backup services, I wanted to create a DIY backup “cloud” that I had complete control of. This script uses Amazon Web Services, a service from Amazon that offers on-demand compute instances (EC2) and storage volumes (EBS). It uses the amazingly simple, reliable and efficient rsync protocol to back up your documents quickly to Amazon’s data centers, only using an EC2 instance for the duration of the rsync. Your backups are stored on EBS volumes in Amazon’s data center, and you have complete control over them. By using this DIY method of backup, you get complete control of your backup experience. No upload rate-limiting, no client program constantly running on your computer. You can even do things like encrypt the volume you’re backing up to. NOTE: As of 2014-05-21, EBS volumes can be encrypted automatically.

The only service you’re paying for is Amazon EC2 and EBS, which is pretty cheap, and not likely to disappear any time soon. For example, my monthly EC2 costs for perfoming a weekly backup are less than a dollar, and EBS costs at this time are as cheap as $0.10/GB/mo.

These scripts are provided to give you a simple way to backup your files via rsync to Amazon’s infrastructure, and can be easily adapted to your needs.

How It Works

This script is a simple DOS batch script that can be run to launch an EC2 instance, perform the rsync, stop the instance, and check on the status of your instances.

After you’ve created your personal backup “cloud” (see Amazon Cloud Setup), and have the Required Tools, you simply run the amazon-cloud-backup.cmd -start to startup a new EC2 instance. Internally, this uses the Amazon API Developer Tools to start the instance via ec2-run-instances. There’s a custom bootscript for the instance, amazon-cloud-backup.bootscript.sh that works well with the Amazon Linux AMIs to enable root access to the machine over SSH (they initially only offer the user ec2-user SSH access). We need root access to perform the mount of the volume.

After the instance is started, the script attaches your personal EBS volume to the device. Its remote address is queried viaec2-describe-instances and SSH is used to mount the EBS volume to a backup point (eg, /backup). Once this is completed, your remote EC2 instance and EBS volume are ready for you to rsync.

To start the rsync, you simply need to run amazon-cloud-backup.cmd -rsync [options]. Rsync is started over SSH, and your files are backed up to the remote volume.

Once the backup is complete, you can stop the EC2 instance at any time by running amazon-cloud-backup.cmd -stop, or get the status of the instance by running amazon-cloud-backup.cmd -status. You can also check on the free space on the volume by running amazon-cloud-backup.cmd -volumestatus.

There are a couple things you will need to configure to set this all up. First you need to sign up for Amazon Web Services and generate the appropriate keys and certificates. Then you need a few helper programs on your machine, for example rsync.exe and ssh.exe. Finally, you need to set a few settings in amazon-cloud-backup.cmd so the backup is tailored to your keys and requirements.

Amazon “Cloud” Setup

To use this script, you need to have an Amazon Web Services account. You can sign up for one at https://aws.amazon.com/. Once you have an Amazon Web Services account, you will also need to sign up for Amazon EC2.

Once you have access to EC2, you will need to do the following.

  1. Create a X.509 Certificate so we can enable API access to the Amazon Web Service API. You can get this in your Security Credentials page. Click on the X.509 Certificates tab, then Create a new Certificate. Download both the X.509 Private Key and Certificate files (pk-xyz.pem and cert-xyz.pem).
  2. Determine which Amazon Region you want to work out of. See their Reference page for details. For example, I’m in the Pacific Northwest so I chose us-west-2 (Oregon) as the Region.
  3. Create an EC2 Key Pair so you can log into your EC2 instance via SSH. You can do this in the AWS Management Console. Click on Create a Key Pair, name it (for example, “amazon-cloud-backup-rsync”) and download the .pem file.
  4. Create an EBS Volume in the AWS Management Console. Click on Volumes and then Create Volume. You can create whatever size volume you want, though you should note that you will pay monthly charges for the volume size, not the size of your backed up files.
  5. Determine which EC2 AMI (Amazon Machine Image) you want to use. I’m using the Amazon Linux AMI: EBS Backed 32-bit image. This is a Linux image provided and maintained by Amazon. You’ll need to pick the appropriate AMI ID for your region. If you do not use one of the Amazon-provided AMIs, you may need to modify amazon-cloud-backup.bootscript.sh for the backup to work.
  6. Create a new EC2 Security Group that allows SSH access. In the AWS Management Console, under EC2, open the Security Groups pane. Select Create Security Group and name it “ssh” or something similar. Once added, edit its Inbound rules to allow port 22 from all sources “0.0.0.0/0″. If you know what your remote IP address is ahead of time, you could limit the source to that IP.
  7. Launch an EC2 instance with the “ssh” Security Group. After you launch the instance, you can use the Attach Volume button in theVolumes pane to attach your new volume as /dev/sdb.
  8. Log-in to your EC2 instance using ssh (see Required Toolsbelow) and fdisk the volume and create a filesystem. For example:
    ssh -i my-rsync-key.pem ec2-user@ec2-1-2-3-4.us-west-1.compute.amazonaws.com
    [ec2-user] sudo fdisk /dev/sdb
    ...
    [ec2-user] sudo mkfs.ext4 /dev/sdb1
  9. Your Amazon personal “Cloud” is now setup.

Many of the choices you’ve made in this section will need to be set as configuration options in the amazon-cloud-backup.cmd script.

Required Tools

You will need a couple tools on your Windows machine to perform the rsync backup and query the Amazon Web Services API.

  1. First, you’ll need a few binaries (rsync.exe, ssh.exe) on your system to facilitate the ssh/rsync transfer. Cygwin can be used to accomplish this. You can easily install Cygwin from http://www.cygwin.com/. After installing, pluck a couple files from the bin/folder and put them into this directory. The binaries you need are:
    rsync.exe
    ssh.exe
    sleep.exe

    You may also need a couple libraries to ensure those binaries run:

    cygcrypto-0.9.8.dll
    cyggcc_s-1.dll
    cygiconv-2.dll
    cygintl-8.dll
    cygpopt-0.dll
    cygspp-0.dll
    cygwin1.dll
    cygz.dll
  2. You will need the Amazon API Developer Tools, downloaded from http://aws.amazon.com/developertools/. Place them in a sub-directory called amazon-tools\

Script Configuration

Now you simply have to configure amazon-cloud-backup.cmd.

Most of the settings can be left at their defaults, but you will likely need to change the locations and name of your X.509 Certificate and EC2 Key Pair.

Usage

Once you’ve done the steps in Amazon “Cloud” Setup, Required Tools and Script Configuration, you just need to run the amazon-cloud-backup.cmd script.

These simple steps will launch your EC2 instance, perform the rsync, and then stop the instance.

amazon-cloud-backup.cmd -launch
amazon-cloud-backup.cmd -rsync
amazon-cloud-backup.cmd -stop

After -stop, your EC2 instance will stop and the EBS volume will be un-attached.

Source

The source code is available at GitHub. Feel free to send pull requests for improvements!

Windows command-line regular expression renaming tool: RenameRegex

January 30th, 2012

Every once in a while, I need to rename a bunch of files.  Instead of hand-typing all of the new names, sometimes a nice regular expression would get the job done a lot faster.  While there are a couple Windows GUI regular expression file renamers, I enjoy doing as much as I can from the command-line.

Since .NET exposes an easy to use library for regular expressions, I created a small C# command-line app that can rename files via any regular expression.

Usage:

RR.exe file-match search replace [/p]
  /p: pretend (show what will be renamed)

You can use .NET regular expressions for the search and replacement strings, including substitutions (for example, “$1″ is the 1st capture group in the search term).

Examples:

Simple rename without a regular expression:

RR.exe * .ext1 .ext2

Renaming with a replacement of all “-” characters to “_”:

RR.exe * "-" "_"

Remove all numbers from the file names:

RR.exe * "[0-9]+" ""

Rename files in the pattern of “123_xyz.txt” to “xyz_123.txt”:

RR.exe *.txt "([0-9]+)_([a-z]+)" "$2_$1"

Download

You can download RenameRegex (RR.exe) from here.  The full source of RenameRegex is also available at GitHub if you want to fork or modify it. If you make changes, let me know!