Sunday, April 8, 2007

404-letter words

The time has come to celebrate yet another of those all-too-unrecognized geek-centric holidays (which I may have just made up): 404 Day! Every April 4th, Web surfers of every persuasion should take time out to celebrate that one universal experience of all Internet consumers and professionals -- the 404 Page Not Found error. No matter which sites you frequent, which ISP you use, or which end of the browser zealot spectrum you fall on, we've all had our share of 404s.

So, where did the 404 come from (besides the server, of course)? Like pretty much everything World Wide Web-related, the 404 is an official component of the Hypertext Transfer Protocol (HTTP) specification, standardized by the Internet Engineering Task Force (IETF) in collaboration with the World Wide Web Consortium (W3C).

It first appeared in the version 0.9 HTTP spec, adopted in 1992. If you track down that document, you'll notice a rather telling signature: TimBL. That's the byline of one Tim Berners-Lee, he of the "I invented the World Wide Web and the first Web browser" fame. The same guy who made the modern Web page possible also invented the Page Not Found.

Genius though he was, Berners-Lee didn't spin the HTTP status codes out of whole cloth but based them on the preexisting File Transfer Protocol (FTP) status codes. If you compare the two code listings, you'll find only 10 overlapping codes: 100, 200, 202, 425, 426, 500, 501, 502, 503, and 504.

Only 100 and 200 have similar meanings under both standards -- Continue and OK, respectively -- so it's clear Berners-Lee didn't copy FTP wholesale into HTTP. For the record, there is no code 404 in FTP, so that infamous error message is original to the Hypertext Transfer Protocol by way of TimBL.
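
If you'd like to check the HTTP side of that comparison yourself, here is a minimal sketch that looks up the standard reason phrases using the `HTTPStatus` enum from Python's standard library. The `describe` helper is a name made up for this illustration, and the sketch covers HTTP only; it says nothing about the FTP code list.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return "<code> <reason phrase>" for a status code Python knows about.

    `describe` is a hypothetical helper for this post, not part of any spec.
    HTTPStatus raises ValueError for codes it does not define.
    """
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


# The two codes discussed above, plus the guest of honor.
for code in (100, 200, 404):
    print(describe(code))
```

Run as written, this prints "100 Continue", "200 OK", and "404 Not Found", one per line.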

Rumor has it that, whether or not Berners-Lee suspected that code 404 would become famous by virtue of link rot and lazy sysadmins, he intended that particular number to include a sly inside joke. You see, the HTTP status code system bears a striking resemblance to the CERN laboratory building numbering system. CERN, the Swiss techno-mecca, is the birthplace of the World Wide Web, leading some to infer that code 404 is a subtle reference to room 404 at CERN.

The only problem with that theory -- or, rather, that urban legend -- is that there is no room 404 at CERN, and there never has been. The real meaning and origin of the 404 code is far more mundane, with each digit having a specific significance.

WHAT DO THE NUMBERS IN STATUS CODE 404 SIGNIFY UNDER THE FORMAL HTTP SPEC?
