HTML5 blockquotes & microdata, like two peas in a pod

With HTML’s academic roots, it’s no surprise that there are multiple elements for referencing another work: q, cite and blockquote have been around for ages. What is surprising is how weak they are semantically.

There was no structured way to add semantic metadata to, for example, the blockquote, except for the rarely used cite attribute. The cite attribute can only contain a URL pointing to an online reference to a work, and it isn’t really used by any modern user agent, nor often by authors.
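
For reference, this is roughly what the cite attribute looks like in practice. The URL is the one of the 24 Ways article quoted throughout this post; as far as I know, no mainstream browser shows it to the reader in any visible way.

```html
<!-- The cite attribute holds a URL for the source; it is not rendered visibly -->
<blockquote cite="http://24ways.org/2009/incite-a-riot/">
	<p>This usage of a definition list is proof that writing W3C specifications and smoking crack are not mutually exclusive activities.</p>
</blockquote>
```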

Blockquotes, back in the day

So what people did, back in the old HTML4/XHTML1 days, was put metadata, often using the cite element, around or inside the blockquote element.

<blockquote>
	<p>This usage of a definition list is proof that writing W3C specifications and smoking crack are not mutually exclusive activities.</p>
</blockquote>
<cite>Jeremy Keith, Incite a riot - 24 Ways</cite>

or

<blockquote>
	<p>This usage of a definition list is proof that writing W3C specifications and smoking crack are not mutually exclusive activities.</p>
	<p><cite>Jeremy Keith, Incite a riot - 24 Ways</cite></p>
</blockquote>

The paragraph tags were necessary, because the older HTML specs mentioned that only block-level elements can be direct descendants of a blockquote.

Blockquotes, the HTML5 way

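Both the examples above are invalid in HTML5: according to the spec, cite may only contain the title of a work, not a person’s name, and the attribution has to live outside the blockquote.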

This, at first glance, is insane. The new spec is very different from how quotes were traditionally marked up. These changes fly right in the face of two of the design principles of HTML5: graceful degradation & paving the cowpaths.

Jeremy Keith even asked us to in<cite> a riot and keep using the old structures, regardless of the specs.

HTML5Doctor came up with a more modern, but still not correct, solution: using the footer element inside the blockquote for citations.
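
Roughly, and reconstructed from that description rather than copied from their article, the footer pattern looks like this:

```html
<!-- Sketch of the footer-inside-blockquote pattern; not valid per the spec discussed here -->
<blockquote>
	<p>This usage of a definition list is proof that writing W3C specifications and smoking crack are not mutually exclusive activities.</p>
	<footer>Jeremy Keith, <cite>Incite a riot</cite> - 24 Ways</footer>
</blockquote>
```

The attribution still sits inside the quote, which is exactly what the spec objects to.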

But, when you think about it, semantically the changes make sense: you cite a work, not a person & the citation itself is not part of the quote.

Hixie to the rescue: using figure and figcaption

So how can we semantically connect the blockquote, the citation and other meta data?

Ian Hickson came up with a solution on the WHATWG mailing list: wrap the blockquote in a figure and the metadata in a figcaption element. At first glance, I thought it was a forced solution, but after rereading the specs its elegance slowly dawned on me.

So then it’s this:

<figure>
	<blockquote>
		This usage of a definition list is proof that writing W3C specifications and smoking crack are not mutually exclusive activities.
	</blockquote>
	<figcaption>
		Jeremy Keith, <cite>Incite a riot</cite> - 24 Ways
	</figcaption>
</figure>

Much better.

Note that with the current HTML5 spec we don’t need to wrap our quote in block-level elements.

Lately this solution has gained traction: an example has been added on WHATWG, and an article about it has even appeared on A List Apart, beating me to the punch. In any case, it looks like this markup pattern is here to stay.

Semantic richness with added microdata

But looking at the example above, there are still some ambiguous elements. We (“we” being a parser) know there is quoted text, we know which work it originates from and we know there’s some other vague stuff related to the quote. The vague stuff being the name (Jeremy Keith) & the publisher (24 Ways). This stuff has no semantic context.

Luckily there’s a rarely used microdata property, which is perfect for using with quotes: mentions. This property is available for all CreativeWorks, including Article and BlogPosting.

Even better, mentions references a Thing and a Thing can be anything: a person, a movie, an article, etc. So we can nest the referenced Thing scope in the figure. Here’s an example:

<figure itemprop="mentions" itemscope itemtype="http://schema.org/Article">
	<blockquote>
		This usage of a definition list is proof that writing W3C specifications and smoking crack are not mutually exclusive activities.
	</blockquote>

	<figcaption>
		<span itemprop="author">Jeremy Keith</span>,
		<cite itemprop="name">
			<a itemprop="url" href="http://24ways.org/2009/incite-a-riot/">Incite a riot</a>
		</cite>,
		<span itemprop="publisher">24 Ways</span>
	</figcaption>
</figure>

100% semantic goodness.
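
One thing the snippet above leaves implicit: mentions is a property of the citing work, so the figure needs to sit inside an item that represents the page doing the quoting. Here is a minimal sketch, assuming the surrounding post is marked up as a BlogPosting on an article element. The article wrapper and its h1 are my own illustration and not part of the original snippet.

```html
<!-- Assumed wrapper: the post that contains the quote -->
<article itemscope itemtype="http://schema.org/BlogPosting">
	<h1 itemprop="name">HTML5 blockquotes &amp; microdata</h1>

	<!-- The quoted work is a nested item, tied to the post through mentions -->
	<figure itemprop="mentions" itemscope itemtype="http://schema.org/Article">
		<blockquote>
			This usage of a definition list is proof that writing W3C specifications and smoking crack are not mutually exclusive activities.
		</blockquote>
		<figcaption>
			<span itemprop="author">Jeremy Keith</span>,
			<cite itemprop="name"><a itemprop="url" href="http://24ways.org/2009/incite-a-riot/">Incite a riot</a></cite>,
			<span itemprop="publisher">24 Ways</span>
		</figcaption>
	</figure>
</article>
```

Without an enclosing item, the mentions property has nothing to attach to, although lenient parsers may still pick up the nested Article on its own.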

Well, almost: the blockquote content itself has no microdata attached. Unfortunately there is no excerpt or similar property in the microdata spec, and neither text, about nor description seems to fit the bill.

This is what Google’s Rich Snippets tool makes of it:

Results of the Google rich snippets tool of a blockquote wrapped in a figure element with microdata

View gist of the code snippet.

Pretty decent, right?

And then there is the microdata citation property, but that one is currently only available for MedicalScholarlyArticle.

Damn doctors have it all: money, prestige, surrounded by hot nurses & a sexy microdata citation property in their scholarly articles. I must have made some wrong career choices along the way. And you can quote me on that.