Just What is Metadata, Anyway?

By Deane Barker

Content management authors and consultants are obsessed with “metadata.” You “add metadata” to your content, apparently to describe it and make it easier to organize, or something. You have “metadata” this and “metadata” that.

But here’s the thing: what is “metadata”? And how is it different from “data”?

When you first start learning about content management and start reading books written by people like Rockley and Boiko, you keep hearing about “metadata.” You get confused about this. What is it? How is it different from just good old data?

I examined this question in the middle of a session at Web Content 2009. I don’t remember the session, but I remember asking a question which launched into a discussion of just what metadata was. After this, I exchanged a couple of Skype messages with Seth about it, and I think I’ve come to a distinction.

Metadata is data added around content that cannot otherwise be structured.

Content management systems have their roots in document management systems. For these systems, the concept of “metadata” makes sense. The binary file – Word document, or whatever – was the “data.” You couldn’t add any structured information to this, so you tacked on “metadata” around it to help describe it.

I find another example of this with Ektron. Way back when, Ektron didn’t do structured content (let me emphasize that this was a long time ago). So, even today, when you edit content, you have a “metadata” tab that you can configure to contain certain fields of structured information. You did this because you couldn’t add structured information to the HTML – HTML just doesn’t hold that kind of information. So you have the “data” as the HTML, and the “metadata” as the structured information.

So, in these cases, we have content that cannot be structured – a binary file or raw HTML. To add specific, granular information to this, we have to have some other framework for it. Enter metadata.
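To make that concrete, here is a minimal sketch in Python of the document management model. The names (`StoredDocument`, `find_by_author`, the `author` and `department` fields) are made up for illustration and don't come from any particular system. The point is only that the file is an opaque blob, so anything you want to query has to live in a separate set of fields around it.

```python
from dataclasses import dataclass, field


@dataclass
class StoredDocument:
    """Hypothetical record: an opaque file plus fields tacked on around it."""

    data: bytes  # the binary file; the system cannot look inside it
    metadata: dict[str, str] = field(default_factory=dict)  # structured fields kept alongside


def find_by_author(documents: list[StoredDocument], author: str) -> list[StoredDocument]:
    """Filter on metadata only, because the bytes themselves are not queryable."""
    return [doc for doc in documents if doc.metadata.get("author") == author]


# Placeholder bytes stand in for real file contents.
report = StoredDocument(
    data=b"(opaque file contents)",
    metadata={"author": "Pat", "department": "Finance"},
)
memo = StoredDocument(data=b"(opaque file contents)", metadata={"author": "Lee"})

matches = find_by_author([report, memo], "Pat")
```

Here `matches` contains only `report`. Nothing in the lookup ever touches `data`, which is exactly why the "metadata" had to exist as a separate thing.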

So, is metadata still relevant? Not in most cases. In mainstream Web content management these days, there’s no need to claim to have separate “metadata” because you can usually structure your content directly. Data is data. Content is designed to be structured, and what used to be called metadata is just more structured data, no different from your page title or your page body.
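As a rough sketch of what that looks like, assume a hypothetical `Article` content type in Python (the type and its field names are invented for this example, not taken from any real CMS). Once the content itself is structured, the fields you would once have called "metadata" sit right next to the title and body, and nothing in the model tells them apart.

```python
from dataclasses import dataclass, fields


@dataclass
class Article:
    """Hypothetical structured content type: every field is a peer."""

    title: str
    body: str
    author: str  # would have been "metadata" in a document management system
    department: str  # likewise


article = Article(
    title="Quarterly Results",
    body="Revenue was up this quarter.",
    author="Pat",
    department="Finance",
)

# Listing the fields shows no data/metadata split; they are all just data.
field_names = [f.name for f in fields(article)]
```

`field_names` comes back as `title`, `body`, `author`, `department`, in declaration order. There's no separate bucket for the descriptive fields.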

Why is this important? Because the concept of “metadata” is confusing for people who have never had to use it. There are people who have never been involved in pure document management situations where it made semantic sense to say “data” was one thing and “metadata” was another, so this concept doesn’t make sense. In 99% of WCMS situations, there’s no difference. Data is data.

I remember way back when I was reading Rockley and Boiko for the first time, I kept thinking, “What is this ‘metadata’ they keep talking about, and why haven’t I ever used this?” And with Ektron, I never understood what the “metadata” tab was all about since I started using Ektron with structured XML right from the get-go.

In most cases, metadata is just data. If someone disagrees, I’d love to hear an argument to the contrary.
