Postscript: Thoughts on Model Interoperability

I started writing this guide as a follow-up to a blog post where I had decried the lack of a “content modeling standard.” I’ve also been banging this drum at conferences for the last couple of years.

I said this:

[…] we need to abstract and standardize the very idea of content. We need to come up with a common lens with which to view content types, content objects, properties, datatypes, values, and relationships in the ways they relate to WCM.

I still think this is missing from our industry, but I’ve come to understand and embrace the idea that there are three “levels” involved in sharing content with some other consuming system.

Level 1 is the set of tools you have to model your own content (that’s what this entire guide has been concerned with).
Level 2 is the representation of that model which is presented to the public.
Level 3 is the serialization and transmission method used to get the data from System A to System B.

As it turns out, how you model your content (Level 1) is no one’s business but your own.

This is akin to the programming concepts of “interface” and “abstraction.” The content model interface of a CMS is what it exposes to the world, like a signature – the input and output parameters of a function. The content model implementation is how you actually model your content: the inner guts of that function, which are no one’s business but the function’s own.
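To make the analogy concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration (the `ContentModel` contract, the two storage classes, the sample data); it is not drawn from any real CMS. The point is only that two very different internal models can sit behind the same public signature.

```python
from abc import ABC, abstractmethod


class ContentModel(ABC):
    """Hypothetical public contract: the "signature" a CMS exposes."""

    @abstractmethod
    def get_title(self, content_id: str) -> str:
        """Return the title of the content object with the given ID."""


class FlatTableModel(ContentModel):
    """One possible implementation: content stored as flat rows."""

    def __init__(self) -> None:
        # Assumed sample data: id -> (title, content type)
        self._rows = {"a1": ("Hello World", "article")}

    def get_title(self, content_id: str) -> str:
        return self._rows[content_id][0]


class NestedDocumentModel(ContentModel):
    """Another possible implementation: content stored as nested documents."""

    def __init__(self) -> None:
        # Same assumed content, modeled in a completely different shape
        self._docs = {"a1": {"type": "article", "fields": {"title": "Hello World"}}}

    def get_title(self, content_id: str) -> str:
        return self._docs[content_id]["fields"]["title"]


def describe(model: ContentModel, content_id: str) -> str:
    """A consumer that only knows the contract, never the internals."""
    return f"Title: {model.get_title(content_id)}"
```

Calling `describe(FlatTableModel(), "a1")` and `describe(NestedDocumentModel(), "a1")` gives the consumer the same answer, even though the two classes model the content in unrelated ways.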

So, Level 1 is not for external consumption.

Level 2 is how we frame the content model for the outside world to consume. This is the public face of our content, regardless of how we actually modeled it using the features of our CMS.

We could still use some standards here. We need to agree on some basic concepts. For instance, do we expose the idea of parent-child relationships? Is an object’s status as a “child” of some other object a concept that an external system should know about? Is this a paradigm with enough utility we should all agree on it? If we retrieve a content object, should it be known that we can always ask for the children of it? Or ask for its parent?
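As a rough illustration of what agreeing on the parent-child question might look like, here is a small Python sketch. The `ContentObject` shape and the `Repository` methods are assumptions made up for this example; they do not come from JCR, CMIS, or any existing standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContentObject:
    """Hypothetical Level 2 view of a content object."""
    id: str
    content_type: str
    properties: Dict[str, str] = field(default_factory=dict)
    parent_id: Optional[str] = None  # None means "no parent" (a root object)


class Repository:
    """Hypothetical contract: every object can be asked for its parent and children."""

    def __init__(self, objects: List[ContentObject]) -> None:
        self._objects = {obj.id: obj for obj in objects}

    def get(self, object_id: str) -> ContentObject:
        return self._objects[object_id]

    def parent_of(self, object_id: str) -> Optional[ContentObject]:
        parent_id = self._objects[object_id].parent_id
        return self._objects[parent_id] if parent_id is not None else None

    def children_of(self, object_id: str) -> List[ContentObject]:
        return [o for o in self._objects.values() if o.parent_id == object_id]


# Assumed sample content: a section with two child articles
repo = Repository([
    ContentObject("news", "section", {"title": "News"}),
    ContentObject("a1", "article", {"title": "First"}, parent_id="news"),
    ContentObject("a2", "article", {"title": "Second"}, parent_id="news"),
])
```

If an industry agreed on something like this, a consuming system could call `repo.children_of("news")` or `repo.parent_of("a1")` against any CMS without knowing how that CMS stores the relationship internally.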

This is what standards like the Java Content Repository (JCR) and Content Management Interoperability Services (CMIS) have tried to do. Why those standards have never really taken off in the broader CMS industry is not something I’m qualified to answer (though I wish someone would…).

Level 2 is essentially what we all decide to agree on, in terms of how content can exist, both discretely and relationally.

Level 3 is a communication detail. Lots of people feel very strongly here about their language of choice, but my feeling is that so long as we agree on Level 2, it doesn’t matter how we serialize and transmit – JSON, XML, rhyming verse, whatever. The concept of communicating a piece of data from one system to another is a problem solved multiple ways, and we should just let a thousand flowers bloom here.
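A small Python sketch of why the format matters so little: the same agreed-upon object survives a round trip through either JSON or XML. The `content` dictionary and the four helper functions are hypothetical, and the XML mapping (one child element per property) is just one simple convention I picked for the example.

```python
import json
import xml.etree.ElementTree as ET

# Assumed Level 2 representation of a single content object
content = {"id": "a1", "type": "article", "title": "Hello World"}


def to_json(obj: dict) -> str:
    """Serialize the object as JSON."""
    return json.dumps(obj, sort_keys=True)


def from_json(text: str) -> dict:
    """Rebuild the object from JSON."""
    return json.loads(text)


def to_xml(obj: dict) -> str:
    """Serialize the object as XML, one child element per property."""
    root = ET.Element("content")
    for key, value in obj.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> dict:
    """Rebuild the object from XML (all values come back as strings)."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}
```

Both `from_json(to_json(content))` and `from_xml(to_xml(content))` hand System B the same object, which is the sense in which Level 3 is only a communication detail once Level 2 is settled.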

So, to be clear, this guide has been about Level 1. Where I think we need to standardize as an industry is Level 2. Level 3 is another argument entirely, and I don’t think it’s a particularly necessary or productive one.