The Necessity of Subcontent
The ability to organize content into trees consisting of parent-child relationships is a core feature of content modeling, and it handles so many modeling patterns.
The document discusses the necessity of subcontent in content management systems, where a content object contains other content objects as children. It argues that most systems fall into the “folder trap” and use folders instead of content objects, which is a mistake as they are not content objects and cannot be managed like content. The document also highlights the importance of defining what type of content can be placed under what type of other content, considering recursive references, and allowing arbitrary ordering of subcontent.
Generated by Azure AI on June 24, 2024
Here’s something I don’t see nearly enough in content management systems: subcontent. This is when a content object contains other content objects as children. I don’t think I’ve ever built a content-managed site where I haven’t (1) used this when it was available, or (2) wished for it when it wasn’t.
It’s so common for a conceptual object in a content model to be composed of other content. This mirrors a relational database – the most common relationship between tables in a database is a one-to-many relationship where one thing in one table is related to many things in another table.
The same concept is at work here. Having a set of content objects existing within another is like an implicit one-to-many relationship. The parent is related to many children, and all of the children are related to one parent.
Most content management systems have some form of content tree, but often they fall into the “folder trap,” where they use the folder-and-file metaphor from operating systems. You can create folders, which then contain content.
I maintain that this is a mistake for a very important reason: the folder is not a content object itself. It’s a folder – some arbitrary structure thought up by the architects of the CMS, and it can’t carry content-based data or be managed like content (subject to workflow, versioning, etc.).
In practice, a content tree should effectively be made up of nothing but content objects stacked on top of one another. There are no “folders,” unless you want to create a “folder-like” content object. Each node in the tree is a content object, which has a content object as a parent and can have multiple content objects as children.
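To make that concrete, here is a minimal sketch of a tree built from nothing but content objects. The `ContentObject` class, its fields, and the type names are all hypothetical, made up for illustration rather than taken from any particular CMS.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare by identity so parent/child links never recurse
class ContentObject:
    """A node in the tree. Every node is content; there is no separate folder type."""

    content_type: str
    title: str
    fields: dict = field(default_factory=dict)  # content-based data
    parent: Optional[ContentObject] = field(default=None, repr=False)
    children: list = field(default_factory=list, repr=False)

    def add_child(self, child: ContentObject) -> ContentObject:
        """Attach child beneath this object and return it."""
        child.parent = self
        self.children.append(child)
        return child


home = ContentObject("page", "Home")
# A "folder-like" object is still content, so it can carry its own data.
downloads = home.add_child(
    ContentObject("folder", "Downloads", {"summary": "Files for customers"})
)
price_list = downloads.add_child(ContentObject("file", "Price list"))
```

Because the "folder" here is just another content type, anything the system does to content (workflow, versioning, and so on) would apply to it as well.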
I was recently working with a magazine publisher. We discussed their content model, and it worked out like this:
A publication has multiple issues, which each have multiple sections, which each have multiple articles, which each may have multiple subarticles. It was both highly relational and purely hierarchical – a perfect example of the need for subcontent.
With subcontent, this is simple. You have a publication content object which contains multiple issue content objects, etc. Since the tree is made up of content itself, representing these relationships is trivial. An article belongs to the section which is its parent, and through that section to the issue above it. Relationships are derived from their position in the tree.
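As a rough sketch of "relationships derived from position," the magazine model might look like this. The `Content` class and the `ancestor_of_type` helper are hypothetical names, and the titles are invented.

```python
from __future__ import annotations

from typing import Optional


class Content:
    """Hypothetical content object that knows only its parent and children."""

    def __init__(self, content_type: str, title: str, parent: Optional[Content] = None):
        self.content_type = content_type
        self.title = title
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def ancestor_of_type(self, content_type: str) -> Optional[Content]:
        """Walk up the tree and return the nearest ancestor of the given type."""
        node = self.parent
        while node is not None:
            if node.content_type == content_type:
                return node
            node = node.parent
        return None


publication = Content("publication", "Example Monthly")
issue = Content("issue", "June", parent=publication)
section = Content("section", "Features", parent=issue)
article = Content("article", "Cover story", parent=section)
sidebar = Content("subarticle", "Sidebar", parent=article)

# No explicit "belongs to issue" field anywhere; the answer comes from the tree.
print(sidebar.ancestor_of_type("issue").title)
```

Nothing on the subarticle says which issue it belongs to. That is worked out by walking up through its parents.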
Without subcontent, I have to maintain an explicit reference in each content object to define what other object it belongs to. Far from the simplicity of just putting content somewhere, now I need a content selector widget to go find the parent piece of content. The user experience also suffers because a piece of content could be the “child” of something way across the tree. The concept of “parentage” suffers because the parent and the child are not “geographically” related in any way in the administration interface.
Additionally, referential integrity is tightened with subcontent. Since you can’t delete a parent without deleting all of its children, you ensure there are no orphan records floating around.
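One way to read that rule is as a cascading delete: removing a parent removes everything beneath it. Here is a small sketch under that assumption. The `ContentStore` class and the content ids are hypothetical, and a real system might refuse the delete instead of cascading.

```python
class ContentStore:
    """Hypothetical in-memory store that tracks each object's parent."""

    def __init__(self):
        self.parent_of = {}  # content id -> parent id (None for the root)

    def add(self, content_id, parent_id=None):
        if parent_id is not None and parent_id not in self.parent_of:
            raise KeyError(f"unknown parent: {parent_id}")
        self.parent_of[content_id] = parent_id

    def children_of(self, content_id):
        return [cid for cid, pid in self.parent_of.items() if pid == content_id]

    def delete(self, content_id):
        """Delete an object and its whole subtree; return the removed ids."""
        removed = []
        for child_id in self.children_of(content_id):
            removed.extend(self.delete(child_id))  # children go first
        del self.parent_of[content_id]
        removed.append(content_id)
        return removed

    def orphans(self):
        """Ids whose parent no longer exists. Should always be empty."""
        return [
            cid
            for cid, pid in self.parent_of.items()
            if pid is not None and pid not in self.parent_of
        ]


store = ContentStore()
store.add("publication")
store.add("issue-1", "publication")
store.add("section-a", "issue-1")
store.add("article-x", "section-a")
store.add("issue-2", "publication")

removed = store.delete("issue-1")
```

After the delete, `store.orphans()` comes back empty, because the section and article went away along with their issue.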
Why does this all make so much sense? Because it’s correct from an object-oriented perspective, with which we’re all familiar. The Composite Pattern says that an object can contain other objects of the same kind, so a single object and a whole tree of them can be treated the same way. Using this, you put together complex programming objects to represent complex concepts. The same is true of content modeling – you start gluing together content to create more complicated content.
To really do this right, there are some best practices:
- You need to be able to specify what type of content can go “under” what type of other content. Referring back to my magazine example, it wouldn’t do us much good to have a publication appear under an issue.
  You can either make these limitations global, or build them into the permissions model. Maybe a normal editor can’t put a downloadable file under an article, but the webmaster can.
- Watch out for recursive references. If Object A is the parent of Object B, ensure there’s no way Object A can also be “downhill” from Object B.
- You need to consider the ordering of subcontent. Specifically, you need to provide arbitrary ordering. There needs to be a way for a content manager to specify the order in which subcontent appears under a parent.
  (For the record, you need to allow arbitrary ordering even if you’re using file-and-folder organization. Content often needs to be ordered in non-derivable ways. I’m convinced content management systems try to sweep this fact under the rug simply because the interface issues are a pain.)
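The three practices above can be sketched in a few lines. Everything here is an assumption for illustration: the `ALLOWED_CHILDREN` table, the `Content` class, and the method names are hypothetical, the rules are global rather than permission-based, and I have allowed sections to nest purely so that a recursive reference is possible.

```python
from __future__ import annotations

# Hypothetical global rules: parent type -> child types allowed beneath it.
# Nested sections are an assumption, added so a cycle could occur at all.
ALLOWED_CHILDREN = {
    "publication": {"issue"},
    "issue": {"section"},
    "section": {"section", "article"},
    "article": {"subarticle"},
    "subarticle": set(),
}


class Content:
    def __init__(self, content_type: str, title: str):
        self.content_type = content_type
        self.title = title
        self.parent = None
        self.children = []  # list order is the order the editor chose

    def is_descendant_of(self, other: Content) -> bool:
        node = self.parent
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False

    def place_under(self, new_parent: Content, position: int | None = None) -> None:
        """Attach or move this object, enforcing type rules and refusing cycles."""
        allowed = ALLOWED_CHILDREN.get(new_parent.content_type, set())
        if self.content_type not in allowed:
            raise ValueError(
                f"{self.content_type} cannot go under {new_parent.content_type}"
            )
        # Recursive reference check: the new parent must not be "downhill" from us.
        if new_parent is self or new_parent.is_descendant_of(self):
            raise ValueError("move would make an object its own ancestor")
        if self.parent is not None:
            self.parent.children.remove(self)
        self.parent = new_parent
        if position is None:
            new_parent.children.append(self)
        else:
            new_parent.children.insert(position, self)

    def move_child(self, child: Content, position: int) -> None:
        """Arbitrary ordering: put child wherever the editor wants it."""
        self.children.remove(child)
        self.children.insert(position, child)


issue = Content("issue", "June")
features = Content("section", "Features")
features.place_under(issue)
profiles = Content("section", "Profiles")
profiles.place_under(features)

first, second, third = (Content("article", t) for t in ("First", "Second", "Third"))
for article in (first, second, third):
    article.place_under(profiles)

profiles.move_child(third, 0)  # the editor drags "Third" to the top
```

Putting a publication under `issue`, or moving `features` beneath its own child `profiles`, raises a `ValueError` in this sketch. To build the rules into the permissions model instead, `place_under` could also take the acting user's role and look up a per-role table.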
If your system supports a content tree made up of content nodes, then you’re doing really well. However, there’s one more level of abstraction you can make if you like:
Ideally, your content tree is not made up of actual content. Instead it’s made up of “nodes” which are then linked to content objects. What’s a node? Nothing, really – they’re just things you can stack on top of each other to make a tree. Each node has a link to a content object, and when several nodes link to the same object, one of those links is the “main” link. (Why do you need a main node? See this post.)
Why do this? Because now a content object can exist in more than one place in the tree. This is because a node is linked to a content object, and multiple nodes can link to the same object. One location is considered the “main” location, but it can appear in hundreds of places, if you like. This just means that hundreds of nodes are linked to the same content object.
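Here is a small sketch of that extra layer, with the tree built from nodes and the content hanging off them. The `ContentObject` and `Node` classes are hypothetical, and treating the first placement as the "main" one is my assumption, not a rule from any particular system.

```python
from __future__ import annotations

from typing import Optional


class ContentObject:
    """The content itself. It has no position of its own."""

    def __init__(self, title: str):
        self.title = title
        self.nodes = []  # every place this object appears in the tree
        self.main_node = None


class Node:
    """A position in the tree, linked to exactly one content object."""

    def __init__(self, content: ContentObject, parent: Optional[Node] = None):
        self.content = content
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)
        content.nodes.append(self)
        if content.main_node is None:
            content.main_node = self  # assumption: first placement is the main one

    def path(self) -> str:
        parts = []
        node = self
        while node is not None:
            parts.append(node.content.title)
            node = node.parent
        return "/" + "/".join(reversed(parts))


root = Node(ContentObject("Home"))
news = Node(ContentObject("News"), parent=root)
products = Node(ContentObject("Products"), parent=root)

release = ContentObject("Widget launch")
Node(release, parent=news)      # main location
Node(release, parent=products)  # same object, second location

print([node.path() for node in release.nodes])
print(release.main_node.path())
```

There is still only one "Widget launch" object to edit and version. It just shows up in two places, and the main node gives you a single canonical location for it.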
This is a very powerful content modeling feature, and it lets you model content as accurately as possible.
So, what system does the content tree right? Well, eZ publish, of course. I’ve also seen hints of this in Zope. Are there others? I’d like to know – comments are open.