Thursday, November 15, 2007

Add meaning to Web pages with microformats

One thing that makes the Web so interesting is that it’s constantly changing, as new technologies and techniques are introduced on an almost daily basis. One technology that has gathered steam over the past couple of years is microformats. Microformats allow you to add context to existing information contained within a Web page.

The semantic Web

The semantic Web is an evolving extension of the Web. The idea is that Web content can be expressed not only in natural language, but also in a format that can be read and used by software agents. This allows the software agents to more easily find, share, and integrate information.

While the semantic Web is designed for machines first, microformats are designed for humans first. The goal of microformats is to create a web of data that anyone can publish and consume. The barrier to entry is low, so anyone with an understanding of XHTML can easily publish data marked up with microformats.

What are microformats?

Microformats provide a more formalized technology for adding commonly used semantics to today’s Web. Microformats are a set of open data formats that use existing technology and standards, most notably XHTML. The microformats site makes the following assertions about the technology:

  • Microformats are a way of thinking about data.
  • Microformats are design principles for formats.
  • Microformats are adapted to current behaviors and usage patterns.
  • Microformats are highly correlated with semantic XHTML.
  • Microformats are a set of simple data formats that many are actively developing and implementing.

A key concept is the usage of existing technologies (such as XHTML) that have been well tested. This allows developers to focus on the data as opposed to the technology.

Microformats in practice

A common application of microformats is providing contact or event data. The hCard microformats specification provides a guideline for including contact data within a Web page.

The hCard standard is a simple, open, and distributed format for representing people, companies, organizations, and places. It closely follows the vCard standard and defines a specific name for each piece of contact data.

The different data elements are specified using the class attribute (all class names are lowercase). The complete contact card is identified by the vcard class, so this class is applied to a DIV element that contains the complete contact information. Individual data elements on the card are designated with the appropriate class name. For example, a person’s state is designated by the region class.

The following listing provides a look at a possible hCard for myself. It lists my name, organization (TechRepublic.com), city (Louisville), state (Kentucky), and country (USA).


Tony Patton
TechRepublic.com
Louisville, KY USA
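
The listing above shows only the text of the card. As a sketch of the markup behind such a card, the following assumes the class names defined by the hCard specification (fn, org, adr, locality, region, and country-name); the choice of DIV, SPAN, and ABBR elements is an illustration and not necessarily the markup of the original listing.

```html
<!-- Sketch of an hCard. The class names come from the hCard specification;
     the specific elements and nesting are assumptions for illustration. -->
<div class="vcard">
  <!-- fn: the formatted name of the person -->
  <span class="fn">Tony Patton</span>
  <!-- org: the organization -->
  <div class="org">TechRepublic.com</div>
  <!-- adr: groups the parts of the address -->
  <div class="adr">
    <span class="locality">Louisville</span>,
    <!-- region: the state; the title attribute carries the full name -->
    <abbr class="region" title="Kentucky">KY</abbr>
    <span class="country-name">USA</span>
  </div>
</div>
```

A browser renders this as ordinary text, while a tool that understands hCard can pick out each field by its class name.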

Since this is standard XHTML, I could easily include it in a Web page. Other applications that understand the hCard format could read the data, and it could be formatted for presentation using standard CSS.
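
To illustrate the styling point, here is a minimal sketch of a style sheet applied to an hCard. The selectors rely only on hCard class names; the particular borders, widths, and fonts are arbitrary choices made for this example, and the element choices in the card are assumptions.

```html
<!-- Illustrative only: these CSS rules are not part of the hCard format. -->
<style type="text/css">
  /* Frame the whole card */
  .vcard { border: 1px solid #999; padding: 0.5em; width: 15em; }
  /* Emphasize the name and put it on its own line */
  .vcard .fn { display: block; font-weight: bold; }
  /* Show the organization in italics on its own line */
  .vcard .org { display: block; font-style: italic; }
</style>

<div class="vcard">
  <span class="fn">Tony Patton</span>
  <span class="org">TechRepublic.com</span>
  <span class="adr">
    <span class="locality">Louisville</span>,
    <span class="region">KY</span>
    <span class="country-name">USA</span>
  </span>
</div>
```

Because the same class names serve both the style sheet and hCard-aware tools, the presentation can change without affecting the data.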

The hCard Creator tool provides an easy way to assemble the appropriate hCard for a contact. Another common use of microformats is for providing information about events. This is accomplished with the hCalendar format.

The hCalendar specification is an open standard based on the iCalendar standard. The hCalendar format follows the approach used by the hCard standard; that is, class names are used to tag data elements.

The complete event is contained within a DIV element and assigned the vevent class name. Individual aspects of the calendar entry are contained within this DIV element. The start and end dates are marked by the dtstart and dtend class names, with the title attribute of each element containing the full date in machine-readable (ISO 8601) form. The following illustrates a sample event for last week’s Web Development column.



November 6th, 2007
Web Development
Weekly Web Development column.
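
The listing above shows only the text of the event. As a sketch of the markup behind it, the following assumes the vevent, dtstart, summary, and description class names from the hCalendar specification; the element choices are an illustration and not necessarily the markup of the original listing. The end date is left out here, since the text gives only one date; a dtend element would be marked up the same way as dtstart.

```html
<!-- Sketch of an hCalendar event. The class names come from the hCalendar
     specification; the elements and nesting are assumptions. -->
<div class="vevent">
  <!-- dtstart: readers see the friendly date, while the title
       attribute carries the machine-readable form -->
  <abbr class="dtstart" title="2007-11-06">November 6th, 2007</abbr>
  <!-- summary: the short title of the event -->
  <span class="summary">Web Development</span>
  <!-- description: a longer explanation of the event -->
  <div class="description">Weekly Web Development column.</div>
</div>
```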

The hCalendar Creator is available for marking up your own calendar entries. (Note: I could not get it to work in Internet Explorer 7, but it worked fine in Firefox.) Like hCard data, you can easily present the data on a Web page and style it with CSS — while the data is still available in the hCalendar format for use by other applications.

Industry support

I’m happy to say that the IT industry is finally starting to embrace microformats. Yahoo! has been a big proponent of microformats since their inception. In addition, the Eventful site uses them, as does the photo-sharing site Flickr. Even Microsoft recognizes the technology, as shown in this blog post about using microformats with SharePoint. The Twitter site also embraces the hCard standard. Firefox offers the Operator add-on to provide microformats support within the browser.

There are various tools for working with microformats in numerous development languages. A good example is Sumo, which offers a microformats parser for the JavaScript language. For Perl, the Text::Microformat module offers a microformats parser.

Adding context

A key concept of microformats is that they are designed for humans first and machines second. Their sole purpose is to create larger, more reliable webs of data, published by more people, and the microformats approach is a low-cost, efficient way to build such a web of data. You can learn more about the microformats covered in this article, as well as the others currently available, on the microformats site.

