Beautification by Dirification?

Related: Read the resolution to this article here: Beautification Revisited

Although the subject of clean URLs has passed through the blogosphere plenty of times over the last year or two, I don’t feel there has been a definitive answer as to a) how important they are, and b) what the best way to implement them is. I plan on making a naming-convention change at Mike Industries this week and am soliciting user feedback as to the best way to go about it.

Are clean URLs even necessary?

Clearly the most compelling use of intelligent page naming is what is known in the industry as “vanity URLs”. If we are calling out a special Lance Armstrong section on ESPN.com via a television, radio, or print campaign, it is a huge advantage to point readers to something like “espn.com/lance”. A nice, clean, easy-to-remember URL is the best chance we have of planting something in our audience’s heads that might actually stick.

But what about the recent trend towards Googlefying URLs? In this case, you have a URL like:

http://www.mikeindustries.com/blog/archives/2004/04/20/what_is_wrong_with_the_cottage_cheese_industry/

… which is neither memorable nor particularly vain. The idea, as I understand it, is mainly to pack the URL with keywords relevant to the subject of the page, so as to coax Google into awarding a higher page rank.

There appear to be advantages and disadvantages of naming pages in this fashion.

Conventional wisdom

Although I don’t claim to be an engineer, all the database classes I took during business school told me that database entries are to be identified by their keyfields. A keyfield contains a unique value (usually an incrementing number) which can never be duplicated across rows. Part of the value of a keyfield is that it acts as a persistent ID and never needs to be changed since it contains no information about the entry. Anything about the entry can be changed and it won’t affect the ID.

Traditionally, I have viewed CMS-generated pages the same way I have viewed entries in a database. The URL is generated from the unique keyfield and all of the content is contained within the page itself. This is evident in the default naming conventions of many flavors of blogging software, such as Movable Type. A Movable Type URL (up until version 3.0) contained an incrementing number, a file extension, and nothing more. Something like 00000038.php or 00000045.html. This makes for a nice, always-unique incrementing URL for each entry.
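
To make the keyfield idea concrete, here is a minimal Python sketch of how an old-style archive filename could be derived from nothing but the entry ID. The function name, the eight-digit padding width, and the sample entry are assumptions for illustration; this is not Movable Type's actual code.

```python
def numeric_filename(entry_id: int, extension: str = "php", width: int = 8) -> str:
    """Build an old-style archive filename from an entry's numeric ID only."""
    # Zero-pad the ID so filenames look like 00000038.php.
    return f"{entry_id:0{width}d}.{extension}"


# Hypothetical entry; only the ID feeds the filename.
entry = {"id": 38, "title": "First draft of a title"}
before = numeric_filename(entry["id"])

# Retitling the entry leaves the filename (and every inbound link) untouched.
entry["title"] = "A better title"
after = numeric_filename(entry["id"])

print(before, after)
```

Because the title never enters the function, no amount of post-publishing editing can change the URL.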

When I first began developing Mike Industries, I jumped right out of the gate with Movable Type 3.0. To my surprise, the CMS automatically began naming pages based on their page titles (dirified). I thought this was great, since naming URLs this way seemed to be in fashion at the time.

No sooner had I made my first entry, though, than I realized the potential downside of dirified URLs. A few minutes after clicking “Publish”, I wanted to change my page title. As I reconsidered it, I began wondering what would happen down the road if I did the same thing after people had begun linking to one of my dirified URLs. URLs are probably the only thing on the web which must remain constant. You can change practically anything you want about a page at any point in time, but once you change its URL, you sever its ties to the world.

Another disadvantage of using Movable Type’s new URL naming convention (or any other automatic dirification mechanism) is that often the URL will get truncated into something less than optimal. For instance, “how_to_lose_the_fatherhood_blues.php” can easily become “how_to_lose_the_fat.php”. Because the URL is automatically generated and truncated, these things tend to happen a lot.
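
The truncation problem is easy to reproduce. Below is a rough Python approximation of a dirify-style function; it is a sketch under stated assumptions, not Movable Type's real implementation. The `max_len`, `sep`, and `whole_words` parameters are hypothetical knobs added for illustration, and the 19-character limit is chosen only because it reproduces the example above.

```python
import re
from typing import Optional


def dirify(title: str, max_len: Optional[int] = None,
           sep: str = "_", whole_words: bool = False) -> str:
    """Approximate a dirified slug: lowercase words joined by `sep`, optionally cut short."""
    # Keep only runs of letters and digits; everything else acts as a word break.
    full = sep.join(re.findall(r"[a-z0-9]+", title.lower()))
    if max_len is None or len(full) <= max_len:
        return full
    cut = full[:max_len]
    # If the cut landed in the middle of a word, optionally back up to the previous separator.
    if whole_words and not full.startswith(sep, max_len) and sep in cut:
        cut = cut.rsplit(sep, 1)[0]
    return cut.rstrip(sep)


title = "How to Lose the Fatherhood Blues"
print(dirify(title))                                 # full slug
print(dirify(title, max_len=19))                     # blind cut, mid-word
print(dirify(title, max_len=19, whole_words=True))   # cut at a word boundary
print(dirify(title, sep="-"))                        # hyphens instead of underscores
```

A blind character cut is what produces the unfortunate “fat” ending; backing up to the last whole word avoids it at the cost of a slightly shorter URL.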

A further disadvantage of dirified URLs is that using them as a tool to butter up Google is just that. Google has smartened up to keyword packing and all sorts of other schemes, so I’m sure they will eventually smarten up to URL packing. Additionally, it is unclear to me how much help a dirified URL really provides to one’s search engine ranking.

Given these annoyances, I decided to check the “Use Old-Style Archive Links” option in Movable Type and keep my URLs as incrementing numbers.

A vanity affair

So here I am, chugging along with the site, and a month into it, things are working great. No archiving problems, a few post-publishing title changes, and a general good feeling about the naming convention I chose. But then yesterday, I got an e-mail from Sean Madden telling me that I could create better URLs by simply adding “dirify=1” to my archive template. I knew this, of course, because I had hastily hashed out the situation in my head during the pre-launch stage, but the e-mail exchange which followed prompted a revisiting of the topic.

Following are some of the reasons I’m looking at dirifying my URLs again:

  1. People seem to have a nasty habit of linking to my stories and not including any text between the anchors. Some of this is probably because blog authors have URL autolinking turned on. This results in me seeing links on the web to my stuff which aren’t identifiable until I mouseover or click them.
  2. If I am trying to type one of my URLs into the location bar of my browser, the browser will suggest past pages I’ve visited within my domain but I can’t identify any of them because they are merely numeric.
  3. I’m starting to just get really vain about what appears in the address bar when I’m on a page. I realize this is silly, but sometimes it seems like part of the page, and if I’m a designer, I should be able to make it look better, right?

While we’re on the subject…

This isn’t directly related to the issue at hand, but it involves URL naming schemes so I’ll mention it anyway. Both on Mike Industries and certain major commercial sites I work on, I was thinking it would be nice to set up something like this: What if you could type in a URL like —

http://www.mikeindustries.com/thoughts on validation

… and the server would look for that exact URL. If no URL was found, you would automatically be redirected to not just a 404 page, not just a search page, but a search results page which was prepopulated with results from the terms in the address bar? If there was only one search result, maybe you’d even be automatically redirected to that page (kind of like an “I’m Feeling Lucky” for lazy people).
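
The flow described above can be sketched in a few lines of Python. Everything here is hypothetical: the `PAGES` index, its paths, the `/search?q=` URL, and the whole-word title matching are stand-ins for whatever a real site and search engine would provide.

```python
import re
from typing import Dict, List, Tuple
from urllib.parse import unquote, urlencode

# Hypothetical content index: URL path -> page title (paths invented for illustration).
PAGES: Dict[str, str] = {
    "/archives/thoughts-on-validation": "Thoughts on Validation",
    "/archives/beautification-by-dirification": "Beautification by Dirification?",
    "/archives/beautification-revisited": "Beautification Revisited",
}


def words(text: str) -> List[str]:
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(terms: List[str], pages: Dict[str, str]) -> List[str]:
    """Return paths whose title contains every term as a whole word."""
    return [path for path, title in pages.items() if set(terms) <= set(words(title))]


def resolve(request_path: str, pages: Dict[str, str] = PAGES) -> Tuple[str, str]:
    """Decide what to do with a request: serve it, redirect to the lone match, or show search results."""
    if request_path in pages:
        return ("serve", request_path)
    # Browsers send spaces as %20, so decode before extracting terms.
    terms = words(unquote(request_path))
    hits = search(terms, pages) if terms else []
    if len(hits) == 1:
        return ("redirect", hits[0])
    return ("search", "/search?" + urlencode({"q": " ".join(terms)}))


print(resolve("/thoughts on validation"))
print(resolve("/beautification"))
```

With this sample index, the first request has exactly one match and is redirected straight to it, while the second matches two titles and falls through to the prepopulated search results page.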

Anyway, perhaps this has been done before, but if it hasn’t, I wouldn’t be surprised if The Wolf has something cooked up by tomorrow morning. He’s kinda good.

Back to the program

Unless anyone can tell me that my initial instinct to use incrementing numbers was right, I think I’m going to give dirified URLs a shot. Especially since I can group entries into directories organized by year and month, the chances of having two identical titles in the same month are nil. I just need to stop the annoying habit of changing page titles after I publish.
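
As a quick illustration of why date-based directories make collisions unlikely, here is a small Python sketch. The `/blog/archives/` prefix mirrors the example URL earlier in this post; the function names and sample entries are assumptions.

```python
from datetime import date
from typing import Iterable, List, Tuple


def archive_path(published: date, slug: str) -> str:
    """Place a slug inside a year/month directory, e.g. /blog/archives/2004/04/slug/."""
    return f"/blog/archives/{published.strftime('%Y/%m')}/{slug}/"


def find_collisions(entries: Iterable[Tuple[date, str]]) -> List[str]:
    """Return every path that would be generated more than once."""
    seen = set()
    collisions = []
    for published, slug in entries:
        path = archive_path(published, slug)
        if path in seen:
            collisions.append(path)
        seen.add(path)
    return collisions


# Hypothetical entries: the same slug in different months is fine; within one month it clashes.
entries = [
    (date(2004, 4, 20), "site-news"),
    (date(2004, 5, 3), "site-news"),
    (date(2004, 5, 28), "site-news"),
]
print(find_collisions(entries))
```

Two entries only clash if they share a slug and a publication month, which is why the date directories do most of the work of keeping URLs unique.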

Here is what I need feedback on:

  1. I use PHP on all of my pages. I’d like to hide the PHP extension from viewers. The two ways I know of doing this are a) creating a directory structure where each entry has its own directory and the entry itself is stored in “index.php” inside that directory, or b) using .htaccess and a mod_rewrite rule to serve up “whatever.php” when “whatever” (no extension) is called. I don’t like A because it seems extraneous… I don’t want to be creating directories for every file I create. But I don’t like B because I’m not sure exactly how to implement it in a seamless way both on my server and in Movable Type. With B, I can get the extension hiding working, but MT still wants to create links with the extensions on them. Also with B, I need to be able to identify when a URL without an extension is actually a directory, so the index.php file can be served up appropriately instead of serving up directoryname.php.
  2. I_don’t_like_underscores_and_I_never_will. They remind me of when I had to take Mac files and throw underscores in them just so my less-able Windows comrades could use them. They also aren’t visible when viewed as a hyperlink because the underline occupies the same place the underscore does. I’d like to use hyphens instead. Are there any easy ways to do this? Hacks to MT templates maybe?
  3. Is there any way to limit the amount of truncation in a URL? It seems like Movable Type tends to make them quite short. Is there a maximum recommended length of a filename on the web in the first place?
  4. Is there any functional difference between unchecking the “Use Old-Style Links” box in MT and just adding “dirify=1” to the archive template?
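
For the first question, the decision that option B has to make can be modeled in Python. This is only a sketch of the lookup order (directory first, then file plus extension), not an actual .htaccess rule; the function names are hypothetical, and it does not guard against “..” in the path.

```python
from pathlib import Path
from typing import Optional


def resolve_extensionless(docroot: Path, url_path: str, ext: str = ".php") -> Optional[Path]:
    """Map an extensionless URL path to the file that should be served, or None if nothing fits."""
    target = docroot / url_path.strip("/")
    # A real directory wins: serve its index file rather than looking for directoryname.php.
    if target.is_dir():
        index = target / f"index{ext}"
        return index if index.is_file() else None
    # Otherwise try the same name with the hidden extension appended.
    candidate = target.with_name(target.name + ext)
    return candidate if candidate.is_file() else None


def public_url(file_url: str, ext: str = ".php") -> str:
    """Strip the extension (or a trailing index file) from a generated link."""
    index_name = "index" + ext
    if file_url.endswith("/" + index_name):
        return file_url[:-len(index_name)]
    if file_url.endswith(ext):
        return file_url[:-len(ext)]
    return file_url
```

The second helper covers the other half of the problem: links generated with extensions would need to be rewritten the same way so that visitors never see “.php” in the first place.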

And so there you have it. I appreciate any advice anyone is willing to provide, even if it is to stick with the old school style of URL naming.


