16 April 2011

HTML5 – Microdata: The Rise of the Semantic Web

Amidst all the enthusiasm surrounding HTML5, media attention tends to focus on new elements like <video> for displaying video and <canvas> for scriptable rendering. When looking for information about HTML5, you will stumble upon loads of websites explaining how it will improve the overall user experience (without the use of Flash), and how you can use tools like Modernizr to unleash the power of HTML5 right now. Then there are the articles and blogs that focus on the new elements web designers can use to improve the semantic structure of their markup, like <nav> and <article>. This is all perfectly understandable, and we have long awaited these updates to the HTML standard. Personally, however, I am equally interested to see how the adoption of the Microdata specification into HTML5 plays out. Though the new tags already add some semantic meaning to the structure of a site, HTML5 actually has more to offer in that area! Fortunately, a (small) number of articles and blog posts have sprung up that do a good job of explaining how you can use Microdata to embrace the semantic web and further enhance your users’ experience on your website. At the end of this post I have compiled a small list of really good tutorials you should read if you feel ready to get your hands dirty. However, if you’ve never heard of Microdata, how it came to be or what it is supposed to do, do read on!

No one is served by a bloated specification, and once an element is added to a specification, it tends to linger even if it is removed in a later version. For this reason, it is the job of the World Wide Web Consortium (W3C) to keep the number of elements in the HTML specification small. Web design isn’t going to become any easier if the number of elements doubles all of a sudden. On the other hand, the limited number of elements forces web designers to be very unspecific when they wrap a piece of data in an element. And despite the introduction of elements like <header>, <nav> and <details> into the HTML5 specification, most HTML structures still lack clear semantic meaning. There is no <person>, <car> or <phonenumber> element, for example, and the existing <address> element only marks contact information for the author of a page or article, not postal addresses in general. This means that by default the computer has no way to distinguish persons from cars when it draws the information onto your screen. Does this bother you? Probably not, since more often than not you can tell the difference quite easily anyway. We recognize a phone number when we see one, and we know “Peter Jackson” is a name, whereas “the new Toyota Prius” probably refers to a car.

However, you might be more appreciative if your browser recognized an address for what it is and offered you the option of navigating to that address. Likewise, it could on occasion be handy if your smartphone browser recognized a phone number and allowed you to dial it by merely touching it with your finger. Adding semantic meaning to an HTML structure is not directly beneficial to us, but if we can get computers to understand what they’re showing us, a whole new world of possibilities opens up! Now this idea isn’t revolutionary; people have been trying to get additional semantic elements into the HTML specification for years. It has even been declared one of the key properties of Web 3.0 by the founder of the W3C, Sir Tim Berners-Lee. But over 7 years ago, when the W3C made no attempt to add elements or attributes to HTML to allow for such semantic annotations, a small group of people decided to take matters into their own hands and created Microformats.

The Microformats specification was developed completely outside of W3C control and relied heavily on the class attribute to add semantic meaning. This attribute of course already existed and was originally added to the HTML specification with CSS in mind, not semantics. Using it this way didn’t invalidate the HTML, however, and Microformats was the only real format available for quite some time, which is why sites like Google and LinkedIn adopted it early on. Its competitor, Resource Description Framework in attributes (RDFa), arrived on the scene later and is a completely different story. Development of RDFa started in 2004, when it was officially included in the W3C XHTML 2 specification. When the W3C pulled the plug on XHTML 2 in 2009, RDFa was saved and copied into a separate specification for use with HTML, called HTML+RDFa. It quickly rose to fame in early 2010, when Facebook adopted it to power their Open Graph.

Both Microformats and RDFa annotate HTML elements with semantic meaning, though they take completely different approaches. Microformats (ab)uses the class attribute, whereas RDFa relies on XML constructs like namespaces. Both have managed to gain some traction and are currently in use throughout the World Wide Web. With RDFa now officially connected to HTML, the W3C had finally managed to introduce its own specification for annotating HTML elements. So why on earth would the W3C invest time in creating another annotation specification? In short: they didn’t.
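
To make the difference a bit more tangible, here is a minimal Python sketch of how a program might read both kinds of annotation. Everything in it is an assumption made for illustration: the two markup snippets, the tiny subset of hCard class names (fn and tel), the FOAF prefix used on the RDFa side, and the AnnotationParser helper itself. It is not a conforming Microformats or RDFa parser; it does no namespace resolution and does not cope with nested properties or void elements like <br>.

```python
from html.parser import HTMLParser

# Illustrative markup only: the same person annotated in two styles.
MICROFORMAT_HTML = """
<div class="vcard">
  <span class="fn">Peter Jackson</span>
  <span class="tel">+1 555 0100</span>
</div>
"""

RDFA_HTML = """
<div xmlns:foaf="http://xmlns.com/foaf/0.1/" typeof="foaf:Person">
  <span property="foaf:name">Peter Jackson</span>
</div>
"""

# A tiny, hand-picked subset of hCard class names (assumption for this sketch).
HCARD_CLASSES = {"fn", "tel"}


def microformat_name(attrs):
    """Microformats: the property name hides among ordinary class names."""
    classes = (attrs.get("class") or "").split()
    return next((c for c in classes if c in HCARD_CLASSES), None)


def rdfa_name(attrs):
    """RDFa: a dedicated attribute holds a namespace-prefixed name."""
    return attrs.get("property")


class AnnotationParser(HTMLParser):
    """Hypothetical helper: maps each annotated element to its text."""

    def __init__(self, name_of):
        super().__init__()
        self.name_of = name_of
        self.found = {}
        self._open = []  # property name (or None) for every open element

    def handle_starttag(self, tag, attrs):
        self._open.append(self.name_of(dict(attrs)))

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text counts only if the innermost open element names a property.
        text = data.strip()
        if text and self._open and self._open[-1]:
            self.found[self._open[-1]] = text


def extract(html, name_of):
    parser = AnnotationParser(name_of)
    parser.feed(html)
    return parser.found


print(extract(MICROFORMAT_HTML, microformat_name))
print(extract(RDFA_HTML, rdfa_name))
```

Note that the Microformats side has to know in advance which class names carry meaning, since a class like fn looks no different from a class used purely for styling. The RDFa side reads a dedicated attribute instead, but its values only make sense once the foaf prefix has been resolved against the namespace declaration, a step this sketch skips.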

Enter the WHATWG (Web Hypertext Application Technology Working Group). In 2004, when the W3C decided to focus on XHTML instead of HTML, a few individuals from Apple, Mozilla and Opera started their own working group to make sure HTML would continue to evolve. It was they who came up with the new elements you have all fallen in love with, like <video>, <canvas> and <header>. It took until 2006 for the W3C to realize it had made a mistake and come knocking on the WHATWG’s door. They had seen the light and wanted to cooperate with the WHATWG on a new HTML specification. They adopted nearly everything the WHATWG had been working on and continued from there. Now you might expect that the WHATWG had already developed the Microdata spec before the W3C contacted them, but this wasn’t the case. In fact, it was not until 2009 that the Microdata specification came into existence. At the WHATWG, Ian Hickson was in charge of HTML5, and in early 2009 he made adding semantics to HTML5 one of his priorities. It was he who came up with the Microdata specification.

Unlike RDFa, Microdata was meant for HTML from the start. It had, however, been developed by just one man, whereas RDFa was created by a group of people. With all the effort that had gone into the creation of the RDFa spec, some people weren’t very happy to see Microdata become part of the official HTML5 specification just like that. After much debate, the W3C decided to move Microdata out of the official HTML5 specification and into its own spec. However, it is still part of the WHATWG HTML5 specification. This might sound strange, but we’ll have to learn to live with the fact that the opinions of the WHATWG and the W3C differ from time to time.
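
For a rough idea of what Microdata looks like in practice, here is a small Python sketch that reads a single annotated item. Microdata uses its own attributes, of which this example touches three: itemscope, itemtype and itemprop. The markup, the example.org vocabulary URL, the property names and the MicrodataParser helper are all made up for illustration, and the parser handles only one flat item, whereas the actual specification also covers nested items and several other ways of supplying a property value.

```python
from html.parser import HTMLParser

# Hypothetical markup; the vocabulary URL and property names are assumptions.
MICRODATA_HTML = """
<div itemscope itemtype="http://example.org/Person">
  <span itemprop="name">Peter Jackson</span>
  <span itemprop="telephone">+1 555 0100</span>
</div>
"""


class MicrodataParser(HTMLParser):
    """Hypothetical helper: reads one flat item and its text properties."""

    def __init__(self):
        super().__init__()
        self.item = {"type": None, "properties": {}}
        self._open = []  # itemprop name (or None) for every open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope is a boolean attribute: present as a key, with value None.
        if "itemscope" in attrs:
            self.item["type"] = attrs.get("itemtype")
        self._open.append(attrs.get("itemprop"))

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text counts only if the innermost open element carries an itemprop.
        text = data.strip()
        if text and self._open and self._open[-1]:
            self.item["properties"][self._open[-1]] = text


def parse_microdata(html):
    parser = MicrodataParser()
    parser.feed(html)
    return parser.item


print(parse_microdata(MICRODATA_HTML))
```

Because the annotations live in attributes that exist only for this purpose, the reader needs neither a list of magic class names nor any namespace handling: every itemprop it encounters is a property by definition.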

So, now you know where Microdata came from and what it has to offer. Check out the following links if you want to learn more, and to see how you can use it in your own projects:
