The Necessity of Subcontent

By Deane Barker • Posted on May 20, 2007

Here’s something I don’t see nearly enough in content management systems: subcontent. This is when a content object contains other content objects as children. I don’t think I’ve ever built a content-managed site where I haven’t (1) used this when it was available, or (2) wished for it when it wasn’t.

It’s so common for a conceptual object in a content model to be composed of other content. This mirrors a relational database – the most common relationship between tables in a database is a one-to-many relationship where one thing in one table is related to many things in another table.

The same concept is at work here. Having a set of content objects existing within another is like an implicit one-to-many relationship. The parent is related to many children, and all of the children are related to one parent.

Most content management systems have some form of content tree, but often they fall into the “folder trap,” where they use the folder-and-file metaphor from operating systems. You can create folders, which then contain content.

I maintain that this is a mistake for a very important reason: the folder is not a content object itself. It’s a folder – some arbitrary structure thought up by the architects of the CMS, and it can’t carry content-based data or be managed like content (subject to workflow, versioning, etc.).

In practice, a content tree should effectively be made up of nothing but content objects stacked on top of one another. There are no “folders,” unless you want to create a “folder-like” content object. Each node in the tree is a content object, which has a content object as a parent and can have multiple content objects as children.

I was recently working with a magazine publisher. We discussed their content model, and it worked out like this:

A publication has multiple issues which each have multiple sections which each have multiple articles which each may have multiple subarticles. It was both highly relational and purely hierarchical – a perfect example of the need for subcontent.

With subcontent, this is simple. You have a publication content object which contains multiple issue content objects, etc. Since the tree is made up of content itself, representing these relationships is trivial. An article belongs to the issue which is its parent. Relationships are derived from their position in the tree.

If not for subcontent, then I have to maintain an explicit reference in each content object to define what other object it belongs to. Far from the simplicity of just putting content somewhere, now I have to have a content selector interface widget to go find the parent piece of content. The user experience also suffers because a piece of content could be the “child” of something way across the tree. The concept of “parentage” suffers because the parent and the child are not “geographically” related in any way in the administration interface.

Additionally, referential integrity is tightened with subcontent. Since you can’t delete a parent without deleting all of its children, you ensure there are no orphan records floating around.

Why does this all make so much sense? Because it’s correct from an object oriented perspective, with which we’re all familiar. The Composite Pattern says that an object can contain another object. Using this, you put together complex programming objects to represent complex concepts. The same is true of content modeling – you start gluing together content to create more complicated content.

To really do this right, there are some best practices:

You can either make these limitations global, or build them into the permissions model. Maybe a normal editor can’t put a downloadable file under an article, but the webmaster can.

(For the record, you need to allow arbitrary ordering even if you’re using file-and-folder organization. Content often needs to be ordered in non-derivable ways. I’m convinced content management systems try to sweep this fact under the rug simply because the interface issues are a pain.)

Ideally, your content tree is not made up of actual content. Instead it’s made up of “nodes” which are then linked to content objects. What’s a node? Nothing, really – they’re just things you can stack on top of each other to make a tree. Each node has a link to a content object, and one of those links is the “main” link. (Why do you need a main node? See this post.)

Why do this? Because now a content object can exist in more than one place in the tree. This is because a node is linked to a content object, and multiple nodes can link to the same object. One location is considered the “main” location, but it can appear in hundreds of places, if you like. This just means that hundreds of nodes are linked to the same content object.

This is a very powerful content modeling feature, and it lets you model content as accurately as possible.

So, what system does the content tree right? Well, eZ publish, of course. I’ve also seen hints of this in Zope. Are there others? I’d like to know – comments are open.

This is item #207 in a sequence of 356 items.

You can use your left/right arrow keys or swipe left/right to navigate