The Content Tree

By Deane Barker

A while back, I mentioned the concept of a “content tree” with regard to content management. I cited this as a “functional pattern” and promised to talk about it more, but I never did.

So, here goes –

With every content management system (CMS) I’ve written, I always get back to the concept of a content tree. Additionally, every really good CMS I’ve seen has a content tree as the core structure: Documentum, Ektron, Interwoven, Zope, eZ publish, etc. It has become a pattern of content management if ever there was one.

Simply put, a content tree is a taxonomy – a parent-child structure – of content. You start at a root “folder” or “node” and build down from there. You might have a “folder” full of “article” objects, each of which might contain one or more “image” objects, etc. The idea is that a content object can be the child of another object, and the parent of one or more objects.
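To make the parent-child idea concrete, here is a minimal sketch in Python. The class name, field names, and sample content are assumptions made for illustration, not taken from any particular CMS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by walking the tree
class Node:
    """One content object in the tree (hypothetical names throughout)."""
    name: str
    kind: str  # e.g. "folder", "article", "image"
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach child under this node and return it for chaining."""
        child.parent = self
        self.children.append(child)
        return child


# Build down from a root folder: folder -> article -> image.
root = Node("root", "folder")
articles = root.add(Node("articles", "folder"))
story = articles.add(Node("launch-story", "article"))
photo = story.add(Node("photo-1", "image"))
```

Every node has at most one parent and any number of children, which is all the structure the rest of this story relies on.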

Obviously, this idea isn’t new, but I’m going to explain how I grasped this pattern one day. Perhaps my story will help you understand why this is such an important pattern –

I was building perhaps my 20th CMS (I think – they all kind of run together…). The CMS I was developing was “type-centric,” meaning content was grouped into “types” or “classes” (another pattern), and I used different types to organize the content. I would click a menu link for “articles” and get all the articles, for instance.

Type-centrism is really common in CMSs, and pretty much where everyone starts. You want articles, click the menu link and you get a list of them. Same for pages, authors, etc. It seems perfectly simple on the surface.

Then one day I added a type for “movie review.” I had to do a little programming, but I eventually had a shiny new type.

It turned out, however, that my “movie review” type was almost identical to my “article” type. They both had a title, a preview, an author, a body of text, an image, etc. But since my CMS was type-centric, they needed to be of different types to be separated in the system – I couldn’t very well click the “articles” link and get “movie reviews” mixed in, now could I?

Sometime later, I added a type for “book review.” Not surprisingly, this looked a lot like my “article” type too – it had all the same fields. With this, it became obvious that I was going to be doing this for a while.

So I decided I would create a “page” type instead, and add a property for “class” which would tell me what it was. This way I could recycle one type for many different uses. Brilliance!

So I had to hack away at my system for a while to separate content objects not only by “type” (page), but by “class” (article, movie review, etc.). It was done, and it worked.

But later, I decided that I wanted to separate “non-fiction” book reviews from “fiction” book reviews. How could I do this? I could create a new property on the “book review” class for “genre” …but I didn’t have a book review class anymore – I had lumped everything into “page.” And if I added “genre” to the “page” type (class? I was confused…), it would affect everything of that type.

Then it hit me – a content object is defined by two things: (1) its format (the properties it exposes), and (2) where it is located relative to other content. Two pieces of content may look the same (have the same properties), yet still belong to different groups (articles, movie reviews, etc.) depending on where they sit.

So I quickly implemented a content tree – a recursive table of “folders” – and started assigning my content to these folders. My page class could now work just fine everywhere: if a “page” was in the folder for “articles,” then it was an article; if it was in the folder for “fiction” which was in the folder for “book reviews,” then it was a review of a work of fiction, etc.
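A rough sketch of what a recursive folder table can look like, using Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions for illustration; the original system's schema is not described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folder (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES folder(id),  -- NULL for the root
        name      TEXT NOT NULL
    );
    CREATE TABLE page (
        id        INTEGER PRIMARY KEY,
        folder_id INTEGER NOT NULL REFERENCES folder(id),
        title     TEXT NOT NULL
    );
    INSERT INTO folder VALUES
        (1, NULL, 'root'),
        (2, 1, 'articles'),
        (3, 1, 'book reviews'),
        (4, 3, 'fiction'),
        (5, 3, 'non-fiction');
    INSERT INTO page VALUES
        (1, 2, 'Site launch'),
        (2, 4, 'A novel, reviewed');
""")


def folder_path(page_id):
    """Return folder names from the root down to the page's folder."""
    rows = conn.execute("""
        WITH RECURSIVE up(id, parent_id, name, depth) AS (
            -- start at the folder that holds the page
            SELECT f.id, f.parent_id, f.name, 0
              FROM folder f JOIN page p ON p.folder_id = f.id
             WHERE p.id = ?
            UNION ALL
            -- then climb one parent at a time until the root
            SELECT f.id, f.parent_id, f.name, up.depth + 1
              FROM folder f JOIN up ON f.id = up.parent_id
        )
        SELECT name FROM up ORDER BY depth DESC
    """, (page_id,)).fetchall()
    return [name for (name,) in rows]


print(folder_path(2))  # ['root', 'book reviews', 'fiction']
```

The page row itself carries no “class” or “genre” column. What the page is falls out of the path of folders above it.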

This suddenly opened up other features, since any content object could be “assigned” to another one by virtue of being its child. Instead of having a single property for “image,” a content object could have as many images as it wanted.

I just needed to create an “image” object that could exist in the tree. If an article needed to be supported with 17 images, then the images could fit into the tree as children of the article. It became quite simple for an object to query the tree to collect all its children, so the organization and assignment of content objects suddenly became a breeze.
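Collecting an object's children can be sketched in a few lines. This is an in-memory illustration with hypothetical names (Node, children_of); a real system would more likely run a query against its tree table.

```python
class Node:
    """A content object; all names here are hypothetical."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind
        self.children = []

    def add(self, child):
        """Attach child under this node and return it."""
        self.children.append(child)
        return child

    def children_of(self, kind):
        """Direct children of the given kind, in insertion order."""
        return [c for c in self.children if c.kind == kind]


# An article supported by 17 images, plus one unrelated child.
article = Node("harbor-story", "article")
for i in range(1, 18):
    article.add(Node(f"image-{i}", "image"))
article.add(Node("pull-quote", "quote"))

images = article.children_of("image")
print(len(images))  # 17
```

The article needs no “image” property at all. It asks the tree for its image children and takes however many it finds.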

Additionally, it became easy to address different branches of the tree uniquely. I could decide that all the book reviews would be formatted a certain way, for example. If the system was rendering an object along that branch – however deep it might be – it could apply different images and styles. If I wanted non-fiction book reviews to look different from fiction book reviews, I could further subdivide the branches.
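One way to get this per-branch behavior is to let each node optionally carry a style and have rendering walk up the tree until it finds one. The sketch below assumes a hypothetical template attribute and a fallback named "default"; neither comes from the original system.

```python
class Node:
    """A content object with an optional per-branch template (hypothetical)."""

    def __init__(self, name, parent=None, template=None):
        self.name = name
        self.parent = parent
        self.template = template

    def effective_template(self):
        """Nearest template set on this node or any ancestor."""
        node = self
        while node is not None:
            if node.template is not None:
                return node.template
            node = node.parent
        return "default"


root = Node("root")
articles = Node("articles", parent=root)
reviews = Node("book reviews", parent=root, template="review")
fiction = Node("fiction", parent=reviews)
non_fiction = Node("non-fiction", parent=reviews, template="review-nonfiction")

novel = Node("a-novel-reviewed", parent=fiction)
memoir = Node("a-memoir-reviewed", parent=non_fiction)
news = Node("site-launch", parent=articles)

print(novel.effective_template())   # review
print(memoir.effective_template())  # review-nonfiction
print(news.effective_template())    # default
```

Setting a template on a deeper folder overrides the one inherited from above, which is what subdividing a branch amounts to.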

I also found that the content tree eliminated the need for a lot of properties. The “genre” property I had wanted for book reviews, for example, was no longer needed: it was covered by where the review sat in the tree.

This went on for some time – the content tree had opened up so many new possibilities that it kept me busy for quite a while. And with that, I realized why most good content management systems have some sort of tree structure as the backbone of their content organization. This is obviously a simplistic overview, but it’s an accurate description of the epiphany that led me to this understanding.

If you’re contemplating a CMS project, consider this story carefully. I’m quite convinced that if you hack away on and refine your system long enough, all roads will lead you to this core concept – this functional pattern.

(Note: if you’re going to implement a content tree, implement the envelope pattern too. Don’t assign letters to the content tree, assign envelopes. Generally speaking, the content tree shouldn’t care about the letters.)
