There are two widely used formats for structured data: microdata and JSON-LD. With JSON-LD, a JavaScript object is inserted into the HTML of your page to define data, whereas microdata uses HTML tags and attributes to define data.
Microdata is part of the WHATWG HTML Standard and is used to nest metadata within existing content on web pages. Search engines and web crawlers can extract and process microdata from a web page and use it to provide a richer browsing experience for users.
RDFa (Resource Description Framework in Attributes) is a type of data format recommended by the World Wide Web Consortium (W3C) for embedding RDF statements in HTML, XHTML, and various XML dialects. Programmers use the framework (RDF) to further specify web content with metadata.
Microdata is a specification to embed machine-readable data in HTML documents. Microdata consists of name-value pairs (known as items ) defined according to a vocabulary. A collection of commonly used markup vocabularies are provided by schema.org.
While there are many (technical, smaller) differences, here’s a selection of those I consider important (used my answer on Webmasters as a base).
As W3C’s HTML WG found no volunteer to edit the Microdata specification, it is now merely a W3C Group Note (see history), which means that there are no plans for any further work on it.
So the Microdata section in WHATWG’s "HTML Living Standard" is the only place where Microdata may evolve. Depending on what gets changed, it may happen that their Microdata becomes incompatible to W3C’s HTML5.
Update: In 2017, work started again, with the aim to publish Microdata as W3C Recommendation.
RDFa is published as W3C Recommendation.
Microdata can only be used in (X)HTML5 (resp. HTML as defined by the WHATWG).
RDFa can be used in various host languages, i.e. several (X)HTML variants and XML (thus also in SVG, MathML, Atom etc.).
And new host languages can be supported, as RDFa Core "is a specification for attributes to express structured data in any markup language".
In Microdata, it’s harder, and sometimes impossible, to use several vocabularies for the same content.
Thanks to its use of prefixes, RDFa allows to mix vocabularies.
Microdata doesn’t provide a way to use reverse properties. You need this for vocabularies that don’t define inverse properties (e.g., they only define parent
instead of parent
& child
). The popular Schema.org is such a vocabulary (with only a few older exceptions).
While the W3C Note Microdata to RDF defines the experimental itemprop-reverse
, this attribute is not part of W3C’s nor WHATWG’s Microdata.
RDFa supports the use of reverse properties (with the rev
attribute).
By using Microdata, you are not directly playing part in the Semantic Web (and AFAIK Microdata doesn’t intend to), mostly because it’s not defined as RDF serialization (although there are ways to extract RDF from Microdata).
RDFa is an RDF serialization, and RDF is the foundation of W3C’s Semantic Web.
The specifications RDFa Core and HTML+RDFa may be more complex than HTML Microdata, but it’s not a "fair" comparison because they offer more features.
Similar to Microdata would be RDFa Lite (which "does work for most day-to-day needs"), and this spec, at least in my opinion, is way less complex than Microdata.
If you want to support specific consumers (for example, a search engine and a browser add-on), you should check their documentation about supported syntaxes.
If you want to learn only one syntax and have no specific consumers in mind, (attention, subjective opinion!) go with RDFa. Why?
Note that you can also use several syntaxes for the same content, so you could have Microdata and RDFa (and Microformats, and JSON-LD, and …) for maximum compatibility.
Here’s a simple Microdata snippet:
<p itemscope itemtype="http://schema.org/Person">
<span itemprop="name">John Doe</span> is his name.
</p>
Here’s the same snippet using RDFa (Lite):
<p typeof="schema:Person">
<span property="schema:name">John Doe</span> is his name.
</p>
And here both syntaxes are used together:
<p itemscope itemtype="http://schema.org/Person" typeof="schema:Person">
<span itemprop="name" property="schema:name">John Doe</span> is his name.
</p>
But it’s typically not necessary/recommended to go down this route.
The main advantage you get from any semantic format is the ability for consumers to reuse your data.
For example, search engines like Google are consumers that reuse your data to display Rich Snippets, such as this one:
In order to decide which format is best, you need to know which consumers you want to target. For example, Google says in their FAQ that they will only process microdata (though the testing tool does now work with RDFa, so it is possible that they accept RDFa).
Unless you know that your target consumer only accepts RDFa, you are probably best going with microdata. While many RDFa-consuming services (such as the semantic search engine Sindice) also accept microdata, microdata-consuming services are less likely to accept RDFa.
It's long, but one of the most comprehensive answers you'll get to this question is this blog post by Jeni Tennison: Microdata and RDFa Living Together in Harmony
I'm not certain if unor's suggestion to use both Microdata and RDFa is a good idea. If you use Google's Structured Data Testing Tool (or other similar tools) on his example it shows duplicate data which seems to imply that the Google bot would pick up two people named John Doe on the webpage instead of one which was the original intention.
I'm assuming therefore that using one syntax for a given item is a better idea (you should still be able to mix syntaxes as long as they describe separate entities).
Though I would be happy to be proven wrong on this.
I would say it largely depends on the use case: For Scientific use cases RDF is common and used in different aspects.
For enriching Websites JSON-LD is now recommended, at leas by Google.
A JavaScript notation embedded in a tag in the page head or body. The markup is not interleaved with the user-visible text, which makes nested data items easier to express, such as the Country of a PostalAddress of a MusicVenue of an Event. Also, Google can read JSON-LD data when it is dynamically injected into the page's contents, such as by JavaScript code or embedded widgets in your content management system.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With