Composite Pages and Embeddable Content

By Deane Barker

In Web content management, you can identify two types of pages fairly easily:

  1. Free-form: A title and a big WYSIWYG area.

  2. Structured: More database-like – data is structured into components and assembled into a page.

(We discussed the difference in detail in a post called “To Structure or Not to Structure.”)

However, I’ve often found that there’s something in the middle. You need a big WYSIWYG area, but you need structured areas within it. This can get tricky.

This is easily explained with an example, which will probably ring true for a lot of content management developers –

For your company intranet, you have a “Page” template. All is good.

But then your users suddenly want to embed calendars from the Exchange server into their pages. Okay, you say, and you build a “Calendar Page” data type and template. The data type includes a new “Calendar URL” field, and the template includes the code to render the calendar.

You now have two templates.

Later, the users come back, and they want a photo gallery. So, you create a “Photo Gallery” datatype and template, which allows them to upload photos, and the template renders the gallery.

You now have three templates.

Later still, the users come back and say, “This is great, but we want this page to have a calendar AND a photo gallery in it. So, we want a bunch of text, then a calendar, then a bunch more text, then a photo gallery. And maybe another calendar after that…”

What now, Einstein?

This is where you start seeing some really bizarre things done to try and jump through these hoops, and jump through the next one, and the next one, etc. Before long, everything becomes totally unmanageable. The users are irritated because they think the system is inflexible, and the developer is irritated because he thinks the users are being unreasonable, etc.

However, all is not lost. Over the years, I’ve seen and implemented two good ways to handle situations like this.

1. Composite Pages

A composite page is a single page, made out of pieces. Your users can add a “Page” to the system, then add “Sections” to the page. They can pick from different types of sections, like “text with header,” “text with image,” “centered image,” etc.

The idea is that you stack these sections on top of each other, order them, and when the page is rendered, its sections are rendered in sequence from top to bottom.

The benefit here is that the page is broken down into sub-objects, and each of these can contain its own data structures and functionality. You could have page sections for “Calendar” and “Photo Gallery.” Intersperse those with a couple of “Generic Text” blocks, and you have a much more flexible page structure.
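
To make the stacking idea concrete, here is a minimal sketch in Python. The class names (Page, TextSection, CalendarSection, PhotoGallerySection) and the HTML they emit are made up for illustration and don't come from any particular CMS; a real calendar section would do far more than emit a placeholder div.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Protocol


class Section(Protocol):
    """Anything that can render itself to an HTML fragment."""

    def render(self) -> str: ...


@dataclass
class TextSection:
    body: str  # assumed to be trusted HTML from the WYSIWYG editor

    def render(self) -> str:
        return f'<div class="text">{self.body}</div>'


@dataclass
class CalendarSection:
    url: str  # the "Calendar URL" field from the example

    def render(self) -> str:
        # Placeholder only: a real template would fetch and draw the calendar.
        return f'<div class="calendar" data-url="{escape(self.url, quote=True)}"></div>'


@dataclass
class PhotoGallerySection:
    photos: list[str]

    def render(self) -> str:
        imgs = "".join(f'<img src="{escape(p, quote=True)}">' for p in self.photos)
        return f'<div class="gallery">{imgs}</div>'


@dataclass
class Page:
    title: str
    sections: list[Section] = field(default_factory=list)

    def render(self) -> str:
        # Sections are rendered in stored order, top to bottom.
        parts = [f"<h1>{escape(self.title)}</h1>"]
        parts.extend(section.render() for section in self.sections)
        return "\n".join(parts)
```

The point of the design is that the page template knows nothing about calendars or galleries. It just walks the list, so "text, calendar, text, gallery, calendar" is only a matter of what is in the list and in what order.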

There are a few disadvantages to this model –

First and foremost, editing can be a pain. Often users just want to edit their pages without having to juggle more than one content object. On top of that, adding even one structured element into the middle of the text breaks the page up into at least three elements.

Consider the example of a single-section page of text, into which a user wants to inject a calendar. How do they do this?

Users will have to create the calendar section, then create another text section to go under it. Then they have to transfer some text from the top section into the bottom section, so the calendar appears to be in the middle of an otherwise unbroken flow of text. To move the calendar “up” in the page means transferring text from the bottom text section to the top text section.
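
The bookkeeping behind that manual shuffle looks roughly like this. It's a sketch in Python that assumes sections are plain dictionaries with a "type" key, and that the split happens at a character offset; both are simplifications of my own, and a real system would have to split at a paragraph boundary rather than an arbitrary character.

```python
def insert_section(sections, index, offset, new_section):
    """Split the text section at `index` and put `new_section` between the halves.

    Returns a new list; the original list is left untouched.
    """
    target = sections[index]
    if target["type"] != "text":
        raise ValueError("can only split a text section")

    # One text section becomes two, with the new section in the middle.
    top = {"type": "text", "body": target["body"][:offset]}
    bottom = {"type": "text", "body": target["body"][offset:]}
    return sections[:index] + [top, new_section, bottom] + sections[index + 1:]
```

One section goes in, three come out. Moving the calendar “up” later means redoing the split with a smaller offset, which is exactly the text-shuffling users end up doing by hand.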

So, the editing model can be complicated. I’ll admit that I haven’t seen a composite page interface that has leveraged all the Ajax-y goodness available these days. Joe and I have often theorized that it could be done much better today than in years past, which could probably mitigate the editing pain that I’ve experienced.

Second, breaking what was a single object into multiple objects presents some other issues. What do you do about some of the higher level functions: versioning, workflow, and permissions? When you save a composite page, are you versioning the sections independently or as a group (if as a group, do you save new versions of sections that haven’t changed)? Can you send individual sections through workflow? Can you set specific permissions on sections? Etc.

Finally, the composite page model is optimized for pages that have “horizontal bands” of content. What happens when people start to float images to the right and left, and elements start to “hang over” other elements, either inadvertently, or on purpose? The formatting issues can get complicated here, and you might end up with users irritated that they can’t do something which seems simple, but ends up being complicated or impossible.

(I’ve seen this done once where the composition model was extended to columns as well, so a “content row” could be split into multiple “content columns” which each contained different objects. However, this gets even more complex since, on the Web, horizontal space is usually much more rigid than vertical space. So what happens if someone divides a row into 20 slivers and then tries to cram a photo gallery into one of them? The result is that all the possible objects would have to be designed as width-neutral, which is far easier said than done.)

I haven’t seen a commercial system that uses a composite page model. I’ve seen it done several times in one-off systems built by others, and I tried it once years ago. I’d love to see a really high-end implementation of it.

2. Embedded Content

This model is more flexible than a composite page model. It considers each page as a big WYSIWYG area, into which you can insert and format text. However, in addition, you can position references (“markers”) to more complicated content, which is then “expanded” and inserted into its parent content when the page is rendered.

So, to continue our embedded calendar example from above, your users would have a big field of text, and somewhere in it would be a marker of some kind. When rendered on the public side of the site, this marker would be “caught” and expanded into a full-blown calendar.

What is the marker? It needs to be a text string that (1) allows the passing of variables, and (2) is easily parsed during rendering.

Systems using a good XML editor could allow users to insert a custom XML tag. This is handy because you can validate it, and it’s quite easy to pass variables in with it, XML being a data serialization format and all. So, your calendar marker could look like this:

<calendar url="http://somewhere"/>

Templating engines like PHP’s Smarty do this well too. In Smarty:

{Calendar Url="http://somewhere"}
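
As a rough sketch of the catch-and-expand step, assuming the XML-style marker and Python on the rendering side: the handler registry and the render_calendar function below are hypothetical, and the regex is a shortcut. A system with a real XML editor would more likely walk the parsed document instead.

```python
import re
from html import escape, unescape

# A self-closing tag with zero or more double-quoted attributes.
MARKER = re.compile(r'<(?P<name>[a-z][\w-]*)(?P<attrs>(?:\s+[\w-]+="[^"]*")*)\s*/>')
ATTR = re.compile(r'([\w-]+)="([^"]*)"')


def render_calendar(attrs):
    # Placeholder output; a real handler would go and build the calendar.
    url = attrs.get("url", "")
    return f'<div class="calendar" data-url="{escape(url, quote=True)}"></div>'


# Hypothetical registry mapping marker names to the code that expands them.
HANDLERS = {"calendar": render_calendar}


def expand_markers(body, handlers=HANDLERS):
    """Replace every known marker in `body` with its handler's output."""

    def replace(match):
        handler = handlers.get(match.group("name"))
        if handler is None:
            # Ordinary self-closing HTML such as <br/> is left alone.
            return match.group(0)
        attrs = {k: unescape(v) for k, v in ATTR.findall(match.group("attrs"))}
        return handler(attrs)

    return MARKER.sub(replace, body)
```

Calling expand_markers on a body containing the calendar marker swaps the marker for the handler's output and leaves the surrounding text as it was. The attributes are how variables get passed in, which covers requirement (1), and the fixed tag shape is what makes requirement (2) cheap.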

There are a number of benefits to the embedded content model:

First, the user model is solid, because users can manipulate the page as a single block of content, which usually makes more sense to them than stacking independent objects on top of each other.

Second, the position of the calendar in the page is simply a matter of where the marker is. Move it up or down within the same block of text.

Finally, the position of the embedded content can be affected by what’s around it. If users want to align the calendar to the left, simply enclose the marker in a floated DIV (provided there are no width issues).

And, there are drawbacks, which are subtle, but can be significant:

First, there are challenges in inserting source code-ish markers in WYSIWYG environments. If you use XML, will it show in WYSIWYG mode? Will editors remember where it is? How will the editor handle and display it? Some editors handle this better than others. For example, if you drop a custom XML tag in the source, eWebEditPro from Ektron can replace the custom tag with an icon in WYSIWYG mode.

Second, users have to know the structure of the markers and how to embed them. Do you trust them to form an XML tag? In a perfect world, you can pop a separate window with a form that creates the tag and embeds it, then replaces it with an icon. Sadly, not all editors are going to support this.

Third, what happens when data required by the embedded content becomes invalid? What happens when the calendar in the Exchange server is deleted? Do you somehow identify content that embeds the calendar and…do what? Alert someone? Remove the marker? Do nothing and just throw an error on rendering?

In general, embedded content suffers from being less identifiable to your system. With a composite page model, each page section is likely stored as a separate database record. But embedded content is nothing but a small text fragment inside another record, which is not easy to query (unless you’re using XML and your database supports XML indexing, then you’re in great shape).

You’ll quickly get to a point where you need to identify and index all the embedded content when you save a record. Keep an index in a separate database table which records what embedded content with what variables is contained inside what content. This makes it easy to find all the content that references Calendar X.
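
Here is one way that index could look, sketched in Python with an in-memory SQLite database. The table and column names (content, embed_index, params) are my own invention, the marker pattern only knows about two hypothetical tag names, and the variables are stored as a JSON string so that a lookup is a simple equality match.

```python
import json
import re
import sqlite3

# Only the marker names this sketch knows about; extend as needed.
MARKER = re.compile(r'<(calendar|gallery)((?:\s+[\w-]+="[^"]*")*)\s*/>')
ATTR = re.compile(r'([\w-]+)="([^"]*)"')


def open_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE content (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE embed_index ("
        "content_id INTEGER NOT NULL, marker TEXT NOT NULL, params TEXT NOT NULL)"
    )
    return db


def save_content(db, content_id, body):
    """Save the body and rebuild its index rows in one transaction."""
    with db:  # commits on success, rolls back if anything raises
        db.execute(
            "INSERT OR REPLACE INTO content (id, body) VALUES (?, ?)",
            (content_id, body),
        )
        db.execute("DELETE FROM embed_index WHERE content_id = ?", (content_id,))
        for name, attrs in MARKER.findall(body):
            params = json.dumps(dict(ATTR.findall(attrs)), sort_keys=True)
            db.execute(
                "INSERT INTO embed_index VALUES (?, ?, ?)",
                (content_id, name, params),
            )


def content_embedding(db, marker, params):
    """Return the ids of all content that embeds `marker` with exactly `params`."""
    wanted = json.dumps(params, sort_keys=True)
    rows = db.execute(
        "SELECT DISTINCT content_id FROM embed_index "
        "WHERE marker = ? AND params = ? ORDER BY content_id",
        (marker, wanted),
    )
    return [row[0] for row in rows]
```

With this in place, "which pages embed Calendar X?" becomes a call like content_embedding(db, "calendar", {"url": "http://somewhere"}) instead of a text search across every record. Rebuilding the rows on every save keeps the index from drifting away from the content.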

Fourth, what about permissions? How do you ensure users are only embedding content that they’re allowed to embed? It’s unlikely you can catch this real-time in the editor, so you’d have to try and catch it on save, which is less-than-ideal from a usability perspective.
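
A catch-it-on-save check could be as small as the sketch below, again in Python. The marker names and the idea of an "allowed" set per user are assumptions of mine; how you work out what a given user may embed is up to your permission system.

```python
import re

# Hypothetical marker names; only the tag name matters for this check.
MARKER_NAME = re.compile(r'<(calendar|gallery)\b[^>]*/>')


def disallowed_embeds(body, allowed):
    """Return the marker names used in `body` that are not in `allowed`."""
    found = set(MARKER_NAME.findall(body))
    return sorted(found - set(allowed))
```

If the returned list is non-empty, the save gets rejected with a message naming the offending markers. That's the less-than-ideal part: the user only finds out after they've finished writing.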

Finally, you’re giving a lot of power to the users here. If they load their page up with 159 calendars, what happens to the server when it renders? You’re essentially allowing the users to force your server to execute extra code based on what they put in their content. Whether or not this is a problem is up to you.
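
If you decide it is a problem, one blunt option is a per-page budget at render time. This is a sketch in Python; the limit of 10, the HTML comment left behind, and the render callback are all arbitrary choices of mine.

```python
import re

MARKER = re.compile(r'<calendar\b[^>]*/>')


def expand_with_budget(body, render, max_embeds=10):
    """Expand at most `max_embeds` markers; replace the rest with a comment."""
    used = 0

    def replace(match):
        nonlocal used
        if used >= max_embeds:
            # Over budget: skip the expensive work entirely.
            return "<!-- embed skipped: page limit reached -->"
        used += 1
        return render(match.group(0))

    return MARKER.sub(replace, body)
```

Whether to skip silently, show a warning to editors, or refuse the save in the first place is a policy decision. The sketch only shows where the counter would live.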

I’ve seen this model in a lot of systems:

  1. eZ publish has an embeddable content model, which is built-in and fairly sophisticated.

  2. I’ve built this with Ektron since eWebEditPro will let users embed custom tags and has fairly good support for editing them. When rendering, my system parses the XML for tag patterns, then replaces them with the output of corresponding .Net user controls.

  3. Years ago, we wrote about a plugin for Movable Type called Builderoo which enabled the definition of macros to be inserted in posts.

  4. A little-known open-source system called Etomite (which I really enjoyed playing with) allows you to define “snippets” which map to executable PHP code.

  5. I haven’t looked at this in ages, but Joomla used to have things called “mambots” (a carryover from when it was named “Mambo Server”) which allow you to embed functionality in pages.

In the end, the embedded content model is tougher to implement, but, done correctly, is considerably more flexible and powerful.
