
Archive for the ‘Tech’ Category

b2evolution 0.9 to Wordpress 2.7 Migration Script

April 1st, 2009

For the previous 5 years, nicj.net ran on a blogging platform called b2evolution, one of the hundreds of CMSs (content-management systems) available.  b2evolution was pretty revolutionary for its time, and worked well as a platform in 2003.  Over the years, the version I had been running (0.9) became rather old and outdated.  There were probably a few unpatched security holes in the version I was running, but I was getting frustrated at paying the “upgrade cost” to keep my website up-to-date with the latest releases.  Each new release took time to upgrade and invariably caused problems that I would have to spend time debugging.  Lately, there was a big problem with comment spam, which can be annoying to keep on top of.  b2evolution doesn’t have a good solution to reduce comment spam, whereas several newer blogging platforms use Akismet to help fight it.

So over the last year, I’ve been looking for a new blogging platform.  Wordpress is the successor I chose.  Wordpress is similar to b2evolution (both spawned from the same parent), but what impresses me the most about Wordpress is how clean the interface feels and how powerful the blogging platform is with all of the community plug-ins available.  Upgrades are seamless as well (as simple as a Subversion ‘update’ command).

The problem was what to do with all of the old posts on nicj.net in the b2evolution MySQL database.  Not that anyone cares what I wrote during my junior year of college, but I was hoping to at least be able to archive some of those old posts as ‘private’ so I could have a journal that I could look back to at a later date.  I was sure someone had done a b2evolution to Wordpress posts/comments/users migration before, and sure enough, there are several scripts out there.

The problem is, each script has its caveats: each migrates from a specific b2evolution version to a specific Wordpress version, and each has its own incompatibilities with my situation.

I finally settled on one from tumahler.com, and modified it a bit to suit my own migration needs.  Changes are:

  1. Supports b2evolution 0.9 to Wordpress 2.7 migration
  2. Script can read from one b2evolution MySQL DB and write to a different Wordpress MySQL DB
  3. Migrates posts marked as b2evolution private to Wordpress private
  4. Sets categories for posts
  5. Only posts and comments are migrated (not users, metadata or categories)

So I’m adding my script to the mix of publicly available b2evolution -> Wordpress migration scripts, in case it will help anyone:

migrate_b2evolution_to_wordpress.php (b2evo 0.9 to WP 2.7 tested)

Please note: Only use this on a clean WordPress install, as it will delete any WP posts, comments and metadata before the migration of b2evo data.
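To illustrate the private-post handling from the list of changes above, here is a minimal sketch in Python. It is an illustration only, not an excerpt from the PHP script. The status strings on both sides are assumptions about the b2evolution 0.9 and WordPress 2.7 schemas, and `map_post_status` is a hypothetical helper, so check the values against your own databases first.

```python
# Hypothetical mapping from b2evolution post statuses to WordPress ones.
# Assumption: these strings match the post status columns in your
# b2evolution 0.9 and WordPress 2.7 databases; verify before relying on it.
B2EVO_TO_WP_STATUS = {
    "published": "publish",
    "private": "private",
    "draft": "draft",
}


def map_post_status(b2evo_status: str) -> str:
    """Return the WordPress status for a b2evolution status.

    Anything unrecognized falls back to "private" so that no post is
    published by accident.
    """
    return B2EVO_TO_WP_STATUS.get(b2evo_status, "private")


if __name__ == "__main__":
    for status in ("published", "private", "protected"):
        print(status, "->", map_post_status(status))
```

Defaulting unknown statuses to private is a design choice for this sketch: it errs on the side of hiding old posts rather than exposing them.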

Ship It

November 9th, 2006

We’re done. Windows Vista is now RTM’d (released to manufacturing).


Today we released Windows Vista to manufacturing. It’s out of our hands now. The Microsoft Office team also RTM’d Office 2007 on Monday. Both will be available to consumers on January 30.

Time to celebrate!

The Code Book

October 18th, 2006

Just finished reading The Code Book, by Simon Singh. I loved it. Great history of ciphers and cryptography through the ages.

Some favorite quotes:

Phil Zimmermann, author of PGP:

Cryptography used to be an obscure science, of little relevance to everyday life. Historically, it always had a special role in the military and diplomatic communications. But in the Information Age, cryptography is about political power and in particular, about the power relationship between a government and its people. It is about the right to privacy, freedom of speech, freedom of political association, freedom of the press, freedom from unreasonable search and seizure, freedom to be left alone.

In the past, if the government wanted to violate the privacy of ordinary citizens, it had to expend a certain amount of effort to intercept and steam open and read paper mail, or listen to and possibly transcribe spoken telephone conversations. This is analogous to catching fish with a hook and a line, one fish at a time. Fortunately for freedom and democracy, this kind of labour-intensive monitoring is not practical on a large scale. Today, electronic mail is gradually replacing conventional paper mail, and is soon to be the norm for everyone, not the novelty it is today. Unlike paper mail, e-mail messages are just too easy to intercept and scan for interesting keywords. This can be done easily, routinely, automatically, and undetectably on a grand scale. This is analogous to driftnet fishing – making a quantitative and qualitative Orwellian difference to the health of democracy.

Whitfield Diffie, pioneer of public-key cryptography:

In the 1790s, when the Bill of Rights was ratified, any two people could have a private conversation – with a certainty no one in the world enjoys today – by walking a few meters down the road and looking to see no one was hiding in the bushes. There were no recording devices, parabolic microphones, or laser interferometers bouncing off their eyeglasses. You will note that civilization survived. Many of us regard that period as a golden age in American political culture.

Ronald L. Rivest, of RSA fame. The Case against Regulating Encryption Technology:

But it is poor policy to clamp down indiscriminately on a technology merely because some criminals might be able to use it to their advantage. For example, any U.S. citizen can freely buy a pair of gloves, even though some criminal might use them to commit a crime without leaving fingerprints. Anyone can freely buy a personal computer too, even though a burglar might use them to ransack a house without leaving fingerprints.

I rather like the glove analogy; let me expand on it a bit. Cryptography is a data protection technology just as gloves are a hand protection technology. Cryptography protects data from hackers, corporate spies and con artists, whereas gloves protect hands from cuts, scrapes, heat, cold and infection. The former can frustrate FBI wiretapping, and the latter can thwart FBI fingerprint analysis. Cryptography and gloves are both dirt-cheap and widely available. In fact, you can download good cryptographic software from the Internet for less than the price of a good pair of gloves!

Let’s talk about Wiki

May 12th, 2006

I like Wikis. What is a Wiki? Well, according to Wikipedia, it is “a website that allows users to easily add, remove, or otherwise edit all content, very quickly and easily, sometimes without the need for registration”. For example, you log onto a website, see that there’s a spelling error, hit the “Edit” button to correct it, and it’s instantly updated for everyone else. It’s a simple idea, but very powerful.

The website Wikipedia is a prime example of a Wiki: it is an online encyclopedia created by, and edited by, anyone and everyone. To date, there are over a million English articles on Wikipedia, and there are articles in hundreds of other languages as well. Anonymous internet users create and edit these articles, keeping them up to date and accurate. I use Wikipedia almost as much as Google these days when I’m doing research. See below for other sites based around the idea of a Wiki.

A wiki allows for a community of people (often the internet community as a whole) to keep an up-to-date record of their knowledge.

So anyone, anonymously, can edit a wiki. Doesn’t this make Wikipedia and similar sites prone to abuse, defacing, or inaccurate data? You bet’cha. But the great thing about a wiki is that as easy as it is to make changes, it is just as easy to revert these changes to a previous version (all old versions are saved, and there is good revision control). A wiki is only as strong as its community, but in the case of places like Wikipedia, there is a large global community of people who want to keep the content up to date, unbiased, and accurate.

There are also several other projects based around the idea of a Wiki. The Wikimedia Foundation has several free projects such as:
* Wikipedia – encyclopedia
* Wikibooks – textbooks and manuals
* Wikiquote – quotes database
* Wikisource – source documents
* Wiktionary – dictionary
* Wikinews – news
* Wikispecies – directory of species

Very cool stuff.

I’ve also begun using a Wiki as a personal organizer. I’ve converted much of My Documents into a wiki on a website that I own (that only I can read and edit). Why do this? Well, it provides me:
* A place to jot down notes on random things, like stuff that I might need to remember later. For example, notes on how to use X or Y, problems I’m working on at work (and how I solved them), how I did this or that, what I need to do later, ideas for projects, etc.
* Access from anywhere (home, work or my phone)
* Revision control (every change is saved)
* A simple interface to edit documents

I started using my wiki (I call it NiciWiki!) two weeks ago and I’m still using it quite a bit. It’s a bit slower than editing a document on my computer, but it provides the advantages above so I think it’s worth it.

There’s even a cool implementation of a wiki called TiddlyWiki that you can save on your hard drive — it doesn’t require any web server, and you can edit it any time (think storing it on a USB thumb drive in your pocket).

Fire drill!!!

April 8th, 2006

So there I am, sitting at my desk at work, and my computer explodes with the sound of BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP BEEP. Rats! I had to quickly smack the power button on my computer before my officemate got annoyed (losing all my work!).

Welcome to the world of backwards compatibility: the Bell Character. A remnant of MS-DOS and the teletype terminals before it, many years ago. See, back in the days when there were just characters on the screen, no fancy-schmancy graphical interfaces, programmers wanted to be able to add a little spice to their programs. So they decided that whenever this one particular character, ASCII code 7 (^G or if you hit control-G) was displayed on the console, it would also make the computer beep. Kind of cool, right? You could alert the user to something important if you needed. Your computer doesn’t even need to have speakers attached; all computers come with a small buzzer to make that ‘beep’ sound.

This ‘feature’ still exists today. I was searching through files on my computer (with ‘findstr’ on the command prompt), and it accidentally searched a non-text file. And this non-text file had the character 7 in it, hundreds of times. So when findstr printed it out, it decided to make a hundred loud beeping sounds, in a row. There is no way to stop this. You can try it for yourself. Hit Win-R, type ‘cmd’, hit enter, hit ctrl-G, then enter. Your computer will beep.
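If you would rather poke at the bell character from a script, here is a small Python sketch. The helper names `count_bells` and `strip_bell` are made up for this example; the only fixed fact is that `"\a"` is ASCII code 7. Filtering output like this before printing it is one way to keep a search through binary files quiet, though it is not something findstr does for you.

```python
BEL = "\a"  # ASCII code 7, the same character as Ctrl-G


def count_bells(text: str) -> int:
    """Count how many times the text would ask the console to beep."""
    return text.count(BEL)


def strip_bell(text: str) -> str:
    """Remove every bell character so printing the text stays silent."""
    return text.replace(BEL, "")


if __name__ == "__main__":
    noisy = "match" + BEL * 3 + " found"
    print(ord(BEL))            # the character code of the bell
    print(count_bells(noisy))  # beeps hiding in the string
    print(strip_bell(noisy))   # safe to print
```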

I don’t believe there is any way to turn it off.

What’s your backup plan?

March 15th, 2006

This sounds very cliche, but my life is very… digital. So many parts of my life revolve around bits of information stored on disks and drives in various places. Documents, pictures, music, emails, projects — the list goes on.

We live in an age where this is very common. But have you ever thought about what would happen if you lost it all? How would you feel? Sure, losing some of your MP3s isn’t the end of the world. But what about the other stuff? Maybe I’m an outlier on the bell curve for this, but there is so much data that I have created and I want to have forever.

How could you lose your data? Unfortunately, it’s pretty easy:

Problems:

  1. Media failure: CDs get scratched, and will often only store your data for a few years. DVDs tell a similar story.
  2. Hardware failure: Hard drives crash (the average lifetime is probably around the same as CDs).
  3. User failure: “oops! I hit delete!”
  4. Catastrophic events (flood, fire), but those are a bit harder to plan for.

I’ve always backed up bits and pieces to various places, but a few weeks ago I decided I needed to take a serious look at protecting all of this information. The problem will only get worse — I will continually produce data for the rest of my life, and most of it I will want to save. My “My Documents” folder is already 40GB and growing. Other data/media is nearly 500GB.

So what’s my backup plan? Let’s keep it simple — reduce the risks of Problems #1, #2 (media and hardware failure) and #3 (user error). Problem #4 is a bit harder, but there is an easy solution if you just want to plan for worst-case scenario.

But first, you have to decide what to back up. I can think of three categories of data:

Categories of Data:

  1. My personal documents. This includes school work, source code, emails, financial information, etc. I want to be able to save this stuff forever. Additionally, data from the websites needs to be backed up. Losing it would be catastrophic. This is priority 1.
  2. Media. Movies, music, pictures, TV shows. If I lost this stuff, not a big worry; I just lose time re-acquiring. I can imagine that in the future this won’t even be a problem (bandwidth will be irrelevant). So not mission-critical, but helpful if I had some sort of redundancy. Priority 2.
  3. Operating systems. When an OS crashes, it usually just costs time, but that can be one of the most frustrating experiences and ruin a whole day. There are two classes of machines (for me): home and servers. I don’t mind losing my home machines for a day to do an OS rebuild, but losing my web server for a day costs revenue. Priority 3 for home machines, Priority 2 for servers.

So a backup plan needs to be multi-tiered. One solution won’t work for everything. So how do you protect yourself from losing data? What options are available?

Solutions:

  1. Media. CDs and DVDs.

    Advantages: Cheap (both media and burners). Portable.
    Negatives: Unreliable due to scratching and life expectancy. Easy to misplace. Low capacity (about 4.7GB for a single-layer DVD). Staleness of data (it’s a pain to do backups each week).

    This solution should work for most people if they don’t need a lot of stuff backed up.

  2. Hard Drives (External hard drive). A USB or Firewire drive that is connected when you want to back up your data.

    Advantages: Portable. High-capacity (500GB and growing). Fast.
    Negatives: Costly. Same unreliability as media (3 years?). Potential to break (dropping). Same staleness problem as media unless you have it automated.

    Probably the best solution for most people who have more data than media will allow.

  3. RAID. RAID basically trades storage on one of your drives for redundancy (a backup). The best part is, the redundancy is handled automatically — if one drive crashes, the other drives have enough information that they will continue to work with no service interruptions.

    Advantages: High-capacity (multiple drives can work together). Fast (depending on implementation). High availability.
    Negatives: Costs more than drives alone (you need RAID controller cards). Not portable (hard to move from computer to computer). Lose capacity (trading space for redundancy).

    Probably overkill for many people, but excellent when you have to protect a lot of data in an environment where you need high availability.

  4. Drives on other machines. Using free space on other computers (on a network) to back up data.

    Advantages: Cheap (utilize space that isn’t used elsewhere). Fast. Can be automated to provide daily (or better) backups.
    Negatives: You need extra computers to make this work.

  5. Revision control (Keeping older versions of documents in case you need to revert to an earlier version).

    This solution is mainly geared toward Problem #3, User error.

    Advantages: Changes can be removed.
    Negatives: Extra space is needed for each version.

    I utilize this for source code where I may be required to back-out a change if I find problems.
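To make the RAID trade-off from Solution 3 more concrete, here is a minimal Python sketch of single-parity redundancy, the idea behind RAID-5. It is a toy model, not how any particular controller works (RAID-1, for instance, mirrors data instead), and `xor_blocks` and `usable_capacity_gb` are names invented for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def usable_capacity_gb(drive_count, drive_size_gb):
    """Single-parity array: one drive's worth of space goes to redundancy."""
    return (drive_count - 1) * drive_size_gb


if __name__ == "__main__":
    # Three "data drives", each holding one small block.
    data = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(data)  # what the extra drive's space stores

    # Pretend the second drive died: rebuild it from the survivors + parity.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])

    # Four 250GB drives leave three drives' worth of usable space.
    print(usable_capacity_gb(4, 250))
```

XOR works here because XOR-ing a block with itself cancels out, so combining the parity with the surviving blocks leaves exactly the missing one.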

My Backup Plan

So what do I do?

My personal documents: Since this is Priority 1 for me, I have four levels of backup.

  1. My Documents resides on a 4 disk, 750GB RAID-5 array (4x 250GB Western Digital WD2500YD disks). These drives come with a 5-year warranty (most consumer drives carry 3 years). I can lose any one of 4 drives, and all of my data will still be safe.
  2. Bi-weekly, I back up to an external hard drive. Afterward, this is placed in a fire-proof safe.
  3. I’ve burned my most essential documents onto DVD and have sent them to my parents in another state. This protects against some catastrophic events (fire), and if both my home and my parents’ go up in fire on the same day, I have more important things to worry about (the end of the world perhaps?).
  4. Revision control for my source code and some other documents, keeping all old versions.

My web server data, which contains things such as the web pages, databases and server configuration files. Also Priority 1. This data has three levels of backup.

  1. Important data from the web server is backed up nightly with a 7-day history.
  2. This data is backed up onto a second hard drive on the same machine nightly as well. The second drive is essentially a mirror of the first drive.
  3. Weekly, this data is backed up to my home server.
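A nightly backup with a 7-day history boils down to "write tonight's archive, then delete anything older than the newest seven". Here is a minimal Python sketch of the pruning half. The `backup-YYYY-MM-DD.tar.gz` naming scheme and the `prune_backups` helper are assumptions for illustration, not a description of the actual jobs on the web server.

```python
from pathlib import Path


def prune_backups(backup_dir: Path, keep: int = 7) -> list:
    """Delete all but the newest `keep` backups and return the removed paths.

    Assumes files are named backup-YYYY-MM-DD.tar.gz, so sorting by name
    also sorts by date (oldest first).
    """
    backups = sorted(backup_dir.glob("backup-*.tar.gz"))
    stale = backups[:-keep] if keep > 0 else backups
    for path in stale:
        path.unlink()
    return stale
```

Run it right after the nightly archive is written, for example `prune_backups(Path("/some/backup/dir"))` with your own directory, and the history never grows past seven files.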

My media, which I would probably go “eh” if I lost it, but I’m utilizing free space from the above solutions to provide backup.

  1. I’ve copied much of the data I had on CD and DVD onto the RAID-5 array (about 300GB total).
  2. The rest is still stored on CDs and DVDs where I won’t touch them unless need-be.

My machine OSs. I have a home machine, a home server (which does backup jobs and TiVo stuff), and my web server (which powers this and other websites).

  1. My home machine’s OS is not backed up. I don’t mind having to reinstall Windows if need-be. If I were worried, I could switch to RAID-1.
  2. My home server backs up important configuration files to another drive nightly.
  3. My web server has two drives. The 2nd drive does a nightly mirror of the first drive, so it can replace the first drive with minimal downtime.
  4. My home server also downloads the web server’s important data weekly to its own backup drive.

The home machines are all backed by UPS (uninterruptible power supply) units. A UPS is very important for my home computer, which has 6 hard drives spinning constantly. When you lose power, some disks might have trouble “slowing down” and could break. The UPS connects to the home machine and provides ~5 minutes of backup power (complete with the LCD monitor and networking if need-be), then automatically shuts Windows down gracefully.

Oh, I also back up my parents’ documents over the internet weekly. I don’t think they care all too much, but I’d like to make sure they’re safe, regardless.

So how could I lose it all? I’m sure there are flaws in my plan, but I’m protected on many levels against multiple problems. Nothing is perfect, but I do feel a lot safer knowing my data is safe.

Phew.

What do you do?