"Poo-tee-weet?"

URL Shorteners Suck. Roll Your Own.

URL shorteners are quite the topic of conversation following the announcement of the DiggBar last week—marking Digg’s impressive entry into this crowded and contentious field.

Back in the day, there was TinyURL, which turned that interesting web page you found into something you could easily slip into conversation, or confidently paste into an e-mail. But with the advent of Twitter, a whole ecosystem has sprung up to make sure you’re able to milk every last character, whether by clever uses of exotic TLDs or, more recently, Unicode characters.

Yet, for the convenience these services provide, there is a hidden cost to this transaction, as Joshua Schachter points out in his recent article:

The worst problem is that shortening services add another layer of indirection to an already creaky system. A regular hyperlink implicates a browser, its DNS resolver, the publisher’s DNS server, and the publisher’s website. With a shortening service, you’re adding something that acts like a third DNS resolver, except one that is assembled out of unvetted PHP and MySQL, without the benevolent oversight of luminaries like Dan Kaminsky and St. Postel.

If these shortener services were to suddenly vanish, their links would go with them. Sure, a majority of those links are of no greater significance than your average lolcat, but 5, 10 years down the line, we face losing a lot of meaningful content. It poses enough of a problem that there is now a project to systematically crawl all of these services to archive and preserve their links.

As users, we are left without much recourse: most SEO junkies and REST purists are pretty set in their ways of descriptive URLs, and Twitter’s core philosophy revolves around 140 characters—so that’s not changing anytime soon (short of Kottke’s proposed solutions). Fortunately, for those of us who make websites, we have the opportunity to roll our own solution, and reclaim our URLs.

DNS Diet

So you’re all gung-ho about preventing the link-rot apocalypse of the internet. Sweet! Now what?

For now, home-grown, site-specific short URLs seem to be the most sustainable solution. Not only do they reinforce your domain in the minds of your end users, but they keep a closed loop on valuable analytics. And as long as you make sure to redirect intelligently, you can build it right on top of your existing site without sacrificing SEO.
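One common reading of "redirect intelligently" is to answer a short URL with a permanent (301) redirect to the canonical page, so that the long URL remains the one that gets indexed. Here is a minimal sketch of that idea, written against the plain Rack calling convention (an object whose call(env) returns status, headers, and body) so it needs no gems. The class name, the slug table, and both paths are hypothetical; a real site would look slugs up in its own data store.

```ruby
# Hypothetical Rack-style app: maps short slugs to canonical paths.
class ShortLinkRedirector
  # Illustrative table only; these slugs and paths are made up.
  DEFAULT_LINKS = {
    '/sale' => '/catalog/seasonal/spring-sale',
    '/faq'  => '/support/frequently-asked-questions'
  }.freeze

  def initialize(links = DEFAULT_LINKS)
    @links = links
  end

  # env is the Rack environment hash; PATH_INFO holds the request path.
  def call(env)
    target = @links[env['PATH_INFO']]

    if target
      # 301 marks the move as permanent and names the canonical URL.
      [301,
       { 'location' => target, 'content-type' => 'text/plain' },
       ["Moved permanently to #{target}\n"]]
    else
      [404, { 'content-type' => 'text/plain' }, ["Not found\n"]]
    end
  end
end
```

Because the short URL only ever points at the long one, the descriptive URL stays the single canonical address, and every hit on the short one passes through your own server where you can count it.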

Every website has its fair share of quirks, but if Amazon, a website rife with long, impenetrable URLs, can make amazon.com/wii work as expected, surely you can implement something for your own blog or web application. For a starting point, you might want to check out something I made recently.

Midgt: Well, Here’s One Way of Doing It.

Midgt is a Ruby implementation of a simple, reversible algorithm that encodes numeric IDs into mixed-case alpha-numeric sequences that can be used as a drop-in component in web applications.

With some creative routing, you could change this:

http://ecommer.ce/catalog/products/94573

Into something like this:


http://ecommer.ce/nBo

The ID 94573 can alternatively be encoded as the shorter string nBo. Custom routing can then trim the verbose hierarchy built around the ID down to a short alphanumeric ID, which redirects back to its proper listing.
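For a feel of how a reversible encoding like this can work, here is a rough base-62 sketch. It is not Midgt's code: the module name, the alphabet ordering (digits, then lowercase, then uppercase), and the least-significant-digit-first output are all assumptions, chosen because together they happen to reproduce the 94573 and nBo pair above.

```ruby
# Hypothetical module, not Midgt's actual API.
module ShortId
  # Assumed alphabet: 0-9, then a-z, then A-Z (62 symbols).
  ALPHABET = (('0'..'9').to_a + ('a'..'z').to_a + ('A'..'Z').to_a).freeze
  BASE = ALPHABET.length

  # Encode a non-negative integer ID as a mixed-case alphanumeric string.
  def self.encode(id)
    unless id.is_a?(Integer) && id >= 0
      raise ArgumentError, 'id must be a non-negative Integer'
    end

    # Integer#digits returns the base-62 digits, least significant first;
    # the string is emitted in that same order (an assumption).
    id.digits(BASE).map { |digit| ALPHABET[digit] }.join
  end

  # Decode a string produced by encode back into the integer ID.
  def self.decode(string)
    string.each_char.with_index.sum do |char, position|
      value = ALPHABET.index(char)
      raise ArgumentError, "invalid character: #{char.inspect}" if value.nil?

      value * BASE**position
    end
  end
end

ShortId.encode(94573) # => "nBo"
ShortId.decode('nBo') # => 94573
```

A route handler could then call something like ShortId.decode on the path segment and redirect to the matching record, with no lookup table needed because the short string is just the ID in another base.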

This example is admittedly contrived, but the utility of such an approach increases dramatically as the range of legal characters used to encode is expanded, and the number of unique IDs to represent continues to grow. For applications operating in terms of millions of items, this could mean three or four fewer characters. Since every character matters in this 140-count economy, why not? Besides, transparent numeric IDs are so 2006.
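The arithmetic is easy to check. The sketch below assumes a 62-character alphabet (mixed-case letters plus digits) and compares how many characters an ID takes in decimal versus base 62; the method names and sample IDs are just illustrations.

```ruby
# Number of characters needed to write a positive ID in the given base.
def encoded_length(id, base)
  id.digits(base).length
end

# Characters saved by switching from decimal to base 62 (assumed alphabet size).
def characters_saved(id, base = 62)
  encoded_length(id, 10) - encoded_length(id, base)
end

[1_000_000, 10_000_000, 99_999_999].each do |id|
  puts "#{id}: #{encoded_length(id, 10)} digits, " \
       "#{encoded_length(id, 62)} characters in base 62, " \
       "#{characters_saved(id)} saved"
end
# 1000000: 7 digits, 4 characters in base 62, 3 saved
# 10000000: 8 digits, 4 characters in base 62, 4 saved
# 99999999: 8 digits, 5 characters in base 62, 3 saved
```

Four base-62 characters cover just under 14.8 million IDs (62 to the fourth power is 14,776,336), which is why the savings dip back to three once IDs pass that mark.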

Check out Midgt for yourself on Github:

http://github.com/mattt/midgt/

Nothing too revolutionary, but I hope it’s a bit of inspiration for those of you now wanting to roll your own solution.