[Reader-list] The URL as User Interface

Thu Feb 24 01:13:14 IST 2005

Hello all,

Continuing from last month on how user interface affects online  
community, I look now at the URL, the Uniform Resource Locator, as a
critical element of user interface. Jakob Nielsen wrote an excellent  
summary in 1999 [1]. His essay is dated but still relevant. What I have  
here is a collection of examples, showcasing both good and bad use.

[1] http://www.useit.com/alertbox/990321.html

- - - - -

First, a somewhat technical explanation. Here's an example URL:

http://www.sarai.net/community/fellow.htm

What looks like a single line is actually several parts:

The "http:" prefix refers to the protocol used to access the page.
The "//" sequence indicates that what follows is a server name.
"www.sarai.net" is the server where the resource (page) may be located.
"/community/fellow.htm" is the path to the resource on the server.
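
The breakdown above can be checked with a few lines of Python. This is only a sketch using the standard library's urllib.parse module; the variable names are mine.

```python
from urllib.parse import urlsplit

url = "http://www.sarai.net/community/fellow.htm"
parts = urlsplit(url)

print(parts.scheme)  # the protocol, without the trailing ":"
print(parts.netloc)  # the server name that follows "//"
print(parts.path)    # the path to the resource on that server
```

Note that the parser reports the protocol as "http" and leaves out the "//" marker altogether, since it only separates the protocol from the server name.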

W3C's URI (Universal Resource Identifier) specification [2] explains  
this in better detail. For all practical purposes, URIs are the same as  
URLs (all URLs are URIs, and URIs that are not URLs are hard to come  
by).

Windows users will notice the striking similarity between the URI  
syntax's "//servername" and Windows Networking's "\\servername"  
notation. Windows paths are URL-syntax-derived, but not valid URLs  
because the "protocol:" prefix is missing. Some applications compensate  
by requiring a prefix of "file:" (Windows) or "smb:" (Linux, Mac).  
Browsers typically prefix "http:" if none is specified. The use of  
backslashes instead of forward slashes in Windows dates to a  
conflicting use in MS-DOS 1.0 [3]. Windows internally supports use of  
either type of slash, but presents only backslashes in the UI.
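
To make the similarity concrete, here is a rough Python sketch that turns a Windows network path into a "file:" URL by flipping the slashes and adding the prefix. The function name, the share name and the file name are all made up for illustration, and the sketch does not percent-encode spaces or other special characters, which a real converter would need to do.

```python
def unc_to_file_url(unc_path: str) -> str:
    # A Windows network path starts with two backslashes and a server name.
    if not unc_path.startswith("\\\\"):
        raise ValueError("expected a path starting with two backslashes")
    # Flip the slashes and add the missing "protocol:" prefix.
    return "file:" + unc_path.replace("\\", "/")

# "share" and "notes.txt" are hypothetical names.
print(unc_to_file_url(r"\\servername\share\notes.txt"))
# prints: file://servername/share/notes.txt
```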

[2] http://www.w3.org/Addressing/URL/uri-spec.html
[3] http://en.wikipedia.org/wiki/Slash_(punctuation)#Computing

- - - - -

URLs play several roles, sometimes conflicting. We will look at these  
today:

1. URLs as brand identifiers.
2. URLs as permanent archival paths.
3. URLs exposing server architecture.

It is important to realise that a URL is not a hyperlink. A hyperlink  
is a reference from one (usually HTML) resource to another, where the  
other resource is identified by its URL. The hyperlink itself is not  
the URL. Hyperlinks have their own influence on user interface, but  
that is for discussion another day.

- -

1. URLs as brand identifiers. Consider these example URLs:

http://www.apple.com/ipod
http://www.microsoft.com/office
http://www.boingboing.net/2005/02/23/fake_astronaut_scams.html

Contrast:

http://www.plusthought.org/article.php3?story_id=58
http://timesofindia.indiatimes.com/articleshow/1021545.cms
http://www.linuxjournal.com/article/3882
http://news.postnuke.com/modules.php?op=modload&name=News&file=article&sid=2666

Notice that the first set of URLs gives you a fairly good idea of what  
each is about, while the second doesn't.

Ideally, URLs would be hidden from users, masked by the page's title
and content, with pages reached only via links from other pages. In
practice this is not how it works. Browsers display URLs
prominently. Passing links via email requires users to cut and paste  
the URL, and a missing or changed character can mean a broken link.  
Recent phishing scams make it even more important to be aware of the  
current page's URL.

Reality is, users read URLs, and site administrators who care about  
their sites being accessible should use readable URLs. The ideal URL is  
one you can read out on the phone to another person, who should be able  
to type it in without errors. Let's look closely at some of the above  
examples.

http://www.apple.com/ipod
Think of any Apple brand. QuickTime, Mac OS X, iMac, iPod, iTunes. Any  
brand. Write that brand name in lower case, remove spaces, and stick it  
to the end of apple.com/. Note that the page you expect comes up. Apple  
is legendary for their attention to user interface, and it extends to  
their website.

http://www.microsoft.com/office
This one is a googly. It looks like a clean URL, but click on it and  
you are redirected to "http://office.microsoft.com/en-us/default.aspx",  
which is no longer a URL you can remember off the top of your head.  
Unlike with the Apple site, it's not obvious that you can find the
right page by going to microsoft.com/brandname (it works, but you find  
out only by testing for it).

http://www.plusthought.org/article.php3?story_id=58
This URL tells you nothing of what to expect when you click. Imagine if  
you had a site with URLs like this and you were replying to email  
asking for details about some programme.

"Yes, the programme is still open. We have details at our website.  
Please go to http://www.plusthought.org/article.php ..." umm, php3 or  
php4 or just php? ... umm, what is the story id number? ... open  
browser ... realise you are offline, wait several seconds for dialup to  
complete, open front page of site, realise you can't see the link  
because it's an image, curse at how slow dialup is, wait for all images  
to load, and there it is! Copy and paste in email. Or if you are in a  
hurry, you'll just say "please go to our website and click on the  
Wanderer link," which is sub-optimal. Imagine if you could just say,  
"yes, please go to plusthought.org/wanderer".

Disclosure: I have previously worked with the fine folks at Synapse and  
they are fully aware of this problem and intend to fix it. Despite  
their poor URLs, they do excellent information design; by far the best  
I've seen anywhere.

http://timesofindia.indiatimes.com/articleshow/1021545.cms
http://www.linuxjournal.com/article/3882
Unlike Synapse's site, these two are news sites with regularly updated  
content, making it hard to have short URLs for everything. At least  
they're not as bad as the following:

http://news.postnuke.com/modules.php?op=modload&name=News&file=article&sid=2666
This is inexcusable. Not only is it lacking context, it is long enough  
to be unreliably reproduced in email. Several mail clients will wrap  
text at 72 or 80 columns, and even if yours doesn't, the mailing list  
software (notably Yahoo! Groups) or the recipient's mail client may. A  
wrapped URL is an unusable URL. Not everyone knows how to join the
lines back together and open the link, or cares enough to try.

http://www.boingboing.net/2005/02/23/fake_astronaut_scams.html
This URL is an example of how even a regularly updated news site can  
have meaningful URLs. The numbers are clearly a date, which tells you  
how old this page is, and the filename is a fragment of the headline.  
It's enough to (a) let you decide if you want to open it, and (b) make
it easier to identify if you have already seen this (if someone  
forwards you a link you have already seen, it's likely to be recent and  
interesting enough that the headline appears familiar). This URL scheme  
is standard with the Movable Type blogging software. Contrast with the  
URLs generated by LiveJournal and MSN Spaces [4].
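
A date-plus-headline path of this kind is easy to generate. The Python sketch below is not Movable Type's actual algorithm, only my guess at the general idea; the function name, the three-word limit and the headline passed in are assumptions chosen to reproduce the BoingBoing path above.

```python
import re
from datetime import date

def dated_path(published: date, headline: str, max_words: int = 3) -> str:
    # Lower-case the headline and keep only runs of letters and digits.
    words = re.findall(r"[a-z0-9]+", headline.lower())
    # Join the first few words with underscores to form the filename.
    slug = "_".join(words[:max_words])
    return f"/{published:%Y/%m/%d}/{slug}.html"

# The headline here is a stand-in, not necessarily the real one.
print(dated_path(date(2005, 2, 23), "Fake astronaut scams"))
# prints: /2005/02/23/fake_astronaut_scams.html
```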

<http://sify.com/news/fullstory.php?id=13672097&headline=Groom~runs~away,~guest~marries~bride>
Sometimes when you are not in a position to fix your server software, a  
temporary kludge like this helps. It's ugly, but it's better than a  
meaningless number. Also notice that I encased this URL in angle  
brackets. Most mail clients understand that to indicate that the URL  
must not be wrapped, or if it arrived wrapped, to piece it back into a  
single line.
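
For the curious, piecing a wrapped URL back together is simple once the angle brackets mark where it starts and ends. Here is a minimal Python sketch of the idea; it is not taken from any particular mail client, and the function name is my own.

```python
import re

def unwrap_bracketed_urls(text: str) -> str:
    # Rebuild each <http...> span with all whitespace inside it removed.
    def join(match: re.Match) -> str:
        return "<" + re.sub(r"\s+", "", match.group(1)) + ">"

    # [^<>]+ also matches line breaks, so a wrapped URL is found whole.
    return re.sub(r"<(https?://[^<>]+)>", join, text)

wrapped = (
    "<http://sify.com/news/fullstory.php? \n"
    "id=13672097&headline=Groom~runs~away,~guest~marries~bride>"
)
print(unwrap_bracketed_urls(wrapped))
```

Without the brackets there is no reliable way to tell where a wrapped URL ends and the next line of text begins, which is why the brackets matter.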

[4] http://www.livejournal.com/users/evan_tech/86155.html

- -

2. URLs as permanent archival paths. Web founder Tim Berners-Lee argues that this
role is by far the most important. URLs should not change. A URL  
pointing to a particular resource should continue pointing to the same  
resource 2, 20 or 200 years from now [5]. Take, for example, another  
page from Apple's site:

http://www.apple.com/imac

You'll see there Apple's marketing pitch for their LCD-based iMac G5.  
The same page a year ago would have shown the lampshade-like iMac G4,  
and even earlier, the CRT-based iMac G3, all of which are entirely  
different computers. By emphasising the branding role, Apple's URLs  
fail to serve the archival role. Sometimes this is a conscious  
decision. Perhaps Apple wants to keep simple URLs for its most
current products, and isn't concerned about whether discontinued
products stay reachable at their old URLs.

More often, however, this is the result of poor planning. For example,
my own photo album. (Apologies for the personal plug here, but I didn't
have a better example at hand.) Last December I visited the Tibetan
settlements in Bylakuppe in southern Karnataka and posted pictures  
here:

http://jace.seacrow.com/pics/places/bylakuppe

Earlier in the year, a friend and I drove to Madurai. I took pictures  
along the way, and now had two sets: pictures taken in Madurai, and  
pictures taken in Tamil Nadu outside Madurai. A hierarchical  
organisation made sense:

http://jace.seacrow.com/pics/places/tn
http://jace.seacrow.com/pics/places/tn/madurai

Before this, I visited Mysore and Karwar, both in Karnataka:

http://jace.seacrow.com/pics/places/mysore
http://jace.seacrow.com/pics/places/karwar

Notice the hierarchy is no longer consistent. Tamil Nadu pictures are  
in their own folder, while Karnataka pictures are scattered in the  
upper level. Ideally I'd place Bylakuppe, Karwar and Mysore in a  
Karnataka folder. In practice, this would mean changing URLs,  
undermining their permanency. In previous situations like this, I've
set up redirects so links don't break, but this is tedious work.
Hierarchical systems inevitably change as the library grows. A URL that  
reflects hierarchy is friendly but not guaranteed permanent. A URL that  
simply shows a database id conveys little information, and is still at  
risk of impermanency if the database system is upgraded in future and  
all ids change. Berners-Lee recommends that URLs should be date-stamped  
in such situations [5], which incidentally is the method adopted by  
Movable Type, as shown in the example from BoingBoing.net earlier.
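
The redirect bookkeeping mentioned above amounts to a lookup table from old paths to new ones. Here is a Python sketch of what moving the Karnataka albums might involve. The "karnataka" folder, the function name and the table are hypothetical; this is not how my site is actually set up.

```python
from typing import Optional

# Old path -> new path. The "karnataka" folder is hypothetical.
MOVED = {
    "/pics/places/bylakuppe": "/pics/places/karnataka/bylakuppe",
    "/pics/places/mysore": "/pics/places/karnataka/mysore",
    "/pics/places/karwar": "/pics/places/karnataka/karwar",
}

def redirect_target(path: str) -> Optional[str]:
    # Ignore a trailing slash so "/pics/places/mysore/" matches too.
    key = path.rstrip("/") or "/"
    # Try longer old paths first, so the most specific entry wins.
    for old in sorted(MOVED, key=len, reverse=True):
        if key == old or key.startswith(old + "/"):
            # Keep whatever followed the moved folder in the old URL.
            return MOVED[old] + key[len(old):]
    return None  # not a moved path; serve it as usual

print(redirect_target("/pics/places/mysore"))
print(redirect_target("/pics/places/tn/madurai"))
```

A web server would answer a matching request with a permanent redirect (HTTP status 301) pointing at the returned path. The tedious part is that every reorganisation adds entries to the table, and none can ever be removed without breaking old links.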

[5] http://www.w3.org/Provider/Style/URI

- -

3. URLs exposing server architecture.

(Coming soon. It's way past midnight and I'm sleepy.)

-- 
Kiran Jonnalagadda
http://www.pobox.com/~jace