The Page-Based CMS is a Natural Byproduct of the Web

By Deane Barker • August 15, 2016 • 8 min read

The concept of how a “page” relates to content is a critical aspect of how a CMS works. The web has influenced this relationship.

The “page-based CMS” is not a bad thing. We try to pretend we don’t need to think about pages anymore, but most of the time, attempts to liberate ourselves from the notion of a page are just impractical idealism.

Furthermore, the page-based model doesn’t really matter that much, in the big picture. The acknowledgment of (and catering to) pages in a CMS is simply a practical concession that inflicts no harm.

This post (rant?) started with a tweet by my friend Carrie Hane:

Am I going too far to say to CMS companies: If you’re still thinking about pages, you’re doing it wrong.

Michael Boyink then chimed in, and what ensued was a rousing three-way discussion around the classic notion of “page-based CMS” versus more pure notions of content, separate from a “page.”

The “page-based CMS” debate rolls around with regularity, it seems. Almost exactly four years ago, I discussed this same thing: What is a Page-Based CMS? In part, I said:

I think I’ve boiled it down to a litmus test: upon creation, does the core content object automatically get an addressable URL? If it does, then that would be a page – pretty much by definition – no matter what someone calls it.

Let’s back up a little this time and approach it from a distance.

No matter what your content looks like, how it’s modeled, or how it’s structured, a URL in a browser has to resolve to something. We call that thing a page. Sure, it may represent some other logical concept – a person, a news release, a location – but HTML in a browser is a page no matter what abstract thing we’re talking about. Content has to be delivered and the page is almost always how we do that.

Different CMSs have different relationships to a page. Roughly speaking, there are three models: Explicit, Implicit, and Ignorant.

Explicit: Content and pages are separate things. Pages are the only things that respond to a URL and they are containers for content. Much like you have to put a letter in an envelope to deliver it, you have to embed one or more content objects inside a page to make them accessible to content consumers. Sometimes, to make things more user-friendly to editors, creating content automatically creates a page and wraps the content in it, just to remove that rote step for content that will always need a page. Concrete5 uses the explicit model, so does DNN/Evoq, and so did the last version of TERMINALFOUR that I used. (Eight years ago, I was calling this the “composite page model.”)
Implicit: Content always implies a page. Any content object is automatically URL addressable, so there’s no need to create an explicit page – indeed, everything is a page by default. A URL will deliver a page representation of a single content object. The template of that object might drag other content into it, but it’s clearly centered around an object which is the target of the URL. In some cases, you can make content that does not respond to a URL, but this is less common than the default of everything getting a URL. This is, by far, the most common model: Episerver, Sitecore, WordPress, etc.
Ignorant: You have “pure” content and there is no page and no public URL. This is a content API, or even an enterprise content management system. These systems care nothing for your presentation/delivery; they just manage the content and assume some other infrastructure exists that will make pages out of it. These systems are essentially ignorant of the web itself. This is something like Contentful or Alfresco, or some of the pure decoupled systems. (This can also be the old school template-driven CMSs we were all writing back in the day with URLs like “/article.php?id=123”. In that case, the URL resolves to an executable file which drags the content out of the database. The content itself has no concept of the page or the URL. The logic that makes it a page is in the PHP file.)
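
The three models are easier to compare side by side in code. What follows is only a rough sketch: the class names, the `implicit_url` helper, and the URL paths are all hypothetical and are not taken from any of the products named above.

```python
from dataclasses import dataclass, field


@dataclass
class Content:
    """A "pure" content object: structured data with no URL of its own."""
    id: int
    title: str
    body: str


@dataclass
class Page:
    """Explicit model: a separate, URL-addressable container for content."""
    url: str
    items: list[Content] = field(default_factory=list)


def implicit_url(content: Content) -> str:
    """Implicit model: a URL is derived from the content object itself."""
    slug = content.title.lower().replace(" ", "-")
    return f"/{slug}"


release = Content(id=123, title="Annual Report", body="...")

# Explicit: the content is only reachable once it is placed inside a page.
explicit_routes = {"/news/annual-report": Page("/news/annual-report", [release])}

# Implicit: the content object is itself the target of the URL.
implicit_routes = {implicit_url(release): release}

# Ignorant: the system stores content by ID and knows nothing about URLs.
ignorant_store = {release.id: release}
```

In all three cases the `Content` object is identical. What differs is whether a URL exists, and whether it belongs to the content or to a wrapper around it.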

The Ignorant Model is simple because there is no concept of a page to contemplate, so we can set it aside. But the Explicit or Implicit Models boil down to this question: do you embed content inside a page, or is the content a page itself?

For the Explicit Model, what exactly is a “page” anyway? Most importantly, and the thing most often just assumed, a page provides a URL. Remember, a URL has to resolve to something. When an inbound request is made to your web CMS, it’s going to be in the form of a URL and what conceptual thing responds to that? It’s going to be a page which is either (1) a container for one or more content objects (Explicit Model), or (2) an HTML representation of a singular content object (Implicit Model).

(I label this “the operative content object” in my first book, meaning it’s the content object on which the page is operating. In many templating systems, this is given specific credit in code. Episerver, for instance, has a “CurrentPage” property in their template models. Sitecore has a “Context Item.” WordPress has functions that don’t even require you to name the operative content object: “wp_title” just assumes that’s what you’re talking about. Lots of CMSs provide ways to access the operative content object on the assumption that when a URL is requested, there will always be an operative object.)
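
To illustrate the idea of an operative content object, here is a minimal sketch of an Implicit Model request cycle. It is not the API of Episerver, Sitecore, or WordPress; `ContentObject`, `REPOSITORY`, `resolve`, and `render` are invented names, and the `current` variable simply plays the role those systems give to their "current page" or "context item."

```python
from dataclasses import dataclass
from html import escape
from typing import Optional


@dataclass
class ContentObject:
    url: str
    title: str
    body: str


# Hypothetical repository: in an Implicit Model system every object has a URL.
REPOSITORY = [
    ContentObject("/about", "About Us", "We make widgets."),
    ContentObject("/news/launch", "Product Launch", "It ships today."),
]


def resolve(url: str) -> Optional[ContentObject]:
    """Find the operative content object for an inbound URL, if there is one."""
    for obj in REPOSITORY:
        if obj.url == url:
            return obj
    return None


def render(url: str) -> str:
    """Render a page; the template never has to be told which object to use."""
    current = resolve(url)  # the operative content object for this request
    if current is None:
        return "<h1>404 Not Found</h1>"
    title = escape(current.title)
    return f"<title>{title}</title><h1>{title}</h1><p>{escape(current.body)}</p>"
```

The template code refers only to `current`. Which object that is gets decided entirely by the URL, which is the assumption the paragraph above describes.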

The page might also have other page-specific information, like a TITLE tag, META tag data, etc. These things are concessions to the reality that it is, in fact, a web page that has to render in a browser. These things don’t exist in the pure, abstract notion of content, but only in the real, concrete representation of a web page. Other concessions might be things like menus, that also only exist for a content object as it appears on a page.

Finally, an explicit page is also often a presentation container with regions or placeholders or dropzones where you place content. Sometimes this is visual drag-and-drop; other times it’s through configuration or just raw assignment (“Display content #123 in region ‘main’.”). This tendency toward dynamic page composition (which I’ve also complained about before) leads some people to call these “page management systems” rather than “content management systems.”
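
The "raw assignment" style of page composition can be sketched in a few lines. This is an illustration under assumptions, not any vendor's implementation: `ExplicitPage`, `assign`, the `CONTENT` lookup, and the region names are all made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical content store: content IDs mapped to already-rendered fragments.
CONTENT = {
    123: "<article>Annual report text</article>",
    456: "<aside>Related links</aside>",
}


@dataclass
class ExplicitPage:
    """A page that owns the URL and the TITLE, and holds content in regions."""
    url: str
    title: str
    regions: dict[str, list[int]] = field(default_factory=dict)

    def assign(self, content_id: int, region: str) -> None:
        """Equivalent of: Display content #<id> in region '<region>'."""
        self.regions.setdefault(region, []).append(content_id)

    def render(self) -> str:
        parts = [f"<title>{self.title}</title>"]
        for name, ids in self.regions.items():
            inner = "".join(CONTENT[i] for i in ids)
            parts.append(f'<div id="{name}">{inner}</div>')
        return "".join(parts)


page = ExplicitPage("/report", "Annual Report")
page.assign(123, "main")
page.assign(456, "sidebar")
```

Note that the page carries the page-specific data (URL, title, layout), while the content fragments know nothing about where they are displayed.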

And to fall further down this rabbit hole…are these “pages” themselves content objects? Do they version? Do they have permissions? If they’re managed under the same architecture as other content, then aren’t we just saying that it’s all content? We’re just embedding content inside content, and the “outer” content responds to a URL, which magically turns it into a page.

Why don’t we embrace the page? Most of us are working with content that will be delivered on the web in some form, so why do we still talk about “pages” pejoratively as if they’re somehow pedestrian? I think for two reasons:

We want to compose larger content structures, and we think that if everything is a page, we won’t be able to do that. We want, for example, a managed object called “Image Carousel” which is not a page, but rather something which we can embed in multiple other content objects. We think that if a CMS is “page-based,” we’re not going to have this opportunity.
We want to deliver content in multiple channels, both the web and everywhere else. We think “page-based” is constraining because what if we don’t want to deliver everything in a page? What if we want to manage our Facebook updates, and our tweets? These aren’t pages, and won’t that be a problem?

I don’t think either of these excuses is valid any longer.

My responses to the above claims, respectively:

Most systems – even those with a very solid, entrenched notion of a page – will allow content composition, and most allow some type of content that isn’t URL addressable (for example, Episerver and Drupal both have discrete, embeddable elements they call “blocks”). Some systems also allow you to simply cancel URL assignment for a specific object. If a content object can’t be accessed by a URL, then it’s not a page.
Even in an Implicit Model system that makes everything a page, there’s really nothing stopping you from grabbing that content, reconfiguring it, and shoving it out the door as something else. Sure that news release got a URL and some page-specific META, but no one says you have to use any of that in other channels.
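
The second response can be shown with a small sketch. Assume a hypothetical `NewsRelease` object from an Implicit Model system that carries a URL and page-specific META; the field names, both functions, and the 140-character default are assumptions for the example, not any system's real schema.

```python
from dataclasses import dataclass


@dataclass
class NewsRelease:
    title: str
    summary: str
    body: str
    url: str               # page-specific: only matters on the web
    meta_description: str  # page-specific: only matters on the web


def as_web_page(item: NewsRelease) -> str:
    """Primary channel: uses the page-specific fields."""
    return (
        f"<title>{item.title}</title>"
        f'<meta name="description" content="{item.meta_description}">'
        f"<h1>{item.title}</h1>{item.body}"
    )


def as_social_update(item: NewsRelease, limit: int = 140) -> str:
    """Secondary channel: ignores the URL and META entirely."""
    text = f"{item.title}: {item.summary}"
    if len(text) > limit:
        # Leave room for the ellipsis so the result never exceeds the limit.
        text = text[: limit - 1].rstrip() + "…"
    return text


release = NewsRelease(
    title="Widget 2.0 Ships",
    summary="Faster and lighter than before.",
    body="<p>Full text of the release.</p>",
    url="/news/widget-2-ships",
    meta_description="Widget 2.0 is now available.",
)
```

The same object feeds both outputs. Being a page in one channel does not prevent it from being something else in another.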

We vilify pages, and we shouldn’t. Pages are a natural evolution of how this technology and industry have gotten to where we are now. Indeed, given the last two decades, what else would we have done?

I read a book a couple years back called The Org: The Underlying Logic of the Office. This book was a response to all the claims of the last few decades that the average old guard business was totally dysfunctional. Rather than feign astonishment about how we got to this point, the book said something like this:

Organizations have evolved this way because the businesses and the people in them have required them to evolve this way. They have simply responded to incentives. We are how we are because that’s how we grew up.

And this is why we have pages.

The main driver of internet usage since the mid-90s has been the web. If you consider the entire world of content management, web content management comprises the vast majority of that. This may be a strong statement, but content management has evolved around the web for the last two decades. The web has pushed more activity and innovation in the discipline of content management than any other environment or factor.

And the web is about pages.

(Yes, apps are cool. But despite Wired’s claim that The Web is Dead, it hasn’t died yet. Five years ago, everyone wanted an app for situations that clearly just needed a good responsive website. This frenzy has since died down, and – surprise! – we all rallied around the web again.)

Ask yourself this: where is most of your content consumption taking place? Almost all content management scenarios have a primary channel, and most of the time, this is the web. We often have secondary channels (Facebook, Twitter, email, etc.), but the web is at the head of the line for most organizations.

Could you get by with a CMS that had no notion of a page? Most organizations couldn’t. Even if they did use something sophisticatedly abstract like Contentful (which I love, for the record), they would have to create some secondary infrastructure to convert their “pure” content into things that are URL addressable.

Management systems that are ignorant of the page necessarily have to depend on delivery systems that are not ignorant of the page. Why? Because a URL in a browser has to resolve to something.
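
Here is a rough sketch of what that secondary infrastructure amounts to. The store stands in for a page-ignorant system; the entry shape, the `/{type}s/{slug}` URL scheme, and the function names are hypothetical and do not reflect Contentful's or any other product's actual API.

```python
# A page-ignorant store: entries are keyed by ID and have no URL.
STORE = {
    "a1": {
        "type": "article",
        "slug": "hello-world",
        "title": "Hello World",
        "body": "First post.",
    },
}


def build_routes(store: dict) -> dict:
    """The delivery layer, not the CMS, decides what URL each entry gets."""
    return {f"/{entry['type']}s/{entry['slug']}": entry_id
            for entry_id, entry in store.items()}


def deliver(url: str, store: dict, routes: dict) -> tuple:
    """Resolve a URL to an entry and wrap it in HTML: this is the 'page'."""
    entry_id = routes.get(url)
    if entry_id is None:
        return 404, "Not Found"
    entry = store[entry_id]
    return 200, f"<h1>{entry['title']}</h1><p>{entry['body']}</p>"


routes = build_routes(STORE)
```

The notion of a page has not gone away here. It has moved out of the management system and into `build_routes` and `deliver`.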

All the content management in the world doesn’t help you if you can’t deliver it. Even a hedge fund manager has to call in a plumber occasionally.

In the end, do I think a “page-based CMS” is somehow of lesser purity than anything else? Absolutely not, so long as it has solid content modeling, embeddable content elements, and a good API that lets me access my content however I want. The page will almost always be primary, but there’s nothing stopping you from doing other things.

We need to reconcile ourselves to the idea of the page. If a CMS vendor is thinking only about pages, then that’s clearly limiting (or not, depending on how they position and market their system). But if a vendor provides ways to reformulate content into other outputs and manipulate content effectively at the API level, then assuming a primary channel of pages/URLs is a perfectly acceptable concession to reality.

Selected Reader Comments

Like many blogs of its era, Gadgetopia allowed reader comments. Below are selected comments that were left on the original post.

The simplified conceptual model means that the natural structure of content is lost when it’s stored in a page based CMS. By simplifying the content to fit the hierarchical page model, the reuse value of the content decreases significantly.

By: Lapteuh
When: about a week after the original post