Eval Criteria # 17

How can rich text fields be structured?

As we discussed in the chapter on type inheritance, many content types fit the same pattern: some structured fields around a Body of rich text.

An Article has a Title, Author, Published Date, etc., and also a rich text Body field that’s the main content of the article. An Employee Bio would have the same pattern: First Name, Last Name, Job Title, and…a big rich text field for the Bio.

This is such a common pattern that a lot of systems are built around it. It’s not coincidental that the built-in model for many systems starts with a Title and Body field. The Title has to be there so the system can show the content in lists and menus, and the Body is there because, well, that’s the content.

We often don’t structure the body field. It’s just rich text, composed in some WYSIWYG tool, and we let editors do whatever they want in there, using whatever formatting tools are available in their editor. (Which is, in itself, often a source of great debate.)

However, the industry is trending toward more structure in rich text. What used to be just a big field of HTML is becoming an aggregation of special, embedded content objects. This has considerable impact on how you might model your content.

Using Structured Rich Text to Limit Types

Structuring rich text is helpful because it often removes the need for many special types that differ only in displayed elements.

For example, if your editors want a Photo Gallery Page, you might conclude this differs from a Text Page only in that an image carousel appears somewhere in the text. It might be at the top, or under the introductory paragraph, or wherever, but if you remove that, you’re right back to a Text Page.

Creating an entirely new type to add one attribute (the image carousel) has a detrimental effect on manageability, so it might be easier to just create an image carousel as some type of embeddable element that can be inserted into the rich text attribute already present in Text Page. In doing this, we’ve essentially stated that any Text Page can become a “Photo Gallery Page.” That isn’t actually a real type; it’s just a Text Page object in which an editor decides to embed an image carousel.

Yes, you could use type inheritance here as well – you could create Photo Gallery Page and inherit it from Text Page then add the extra attribute. However, you will run into the problem we discussed in the chapter on inheritance and composition in that your new type can only inherit from one parent, and this means that you’ve implicitly stated your image carousel only extends one type of content.

By allowing the embedding of an image carousel in rich text, you’ve created a condition whereby any rich text field on any type can contain an image carousel. If Mary wants to display an image carousel of the company picnic she organized on her Employee Bio page, she can do that now because it also contains a rich text field. (This may or may not be what you want to allow, of course, but we’ll get to that.)

By the inclusion of active elements, rich text on any type becomes a generic container for both formatted HTML and other presentational elements. Previously, rich text was just a series of paragraphs stacked on each other, but editorial tools have advanced to the point where we can alter some items in that “stack” to be more than just paragraphs.

Editing a section of rich text in the new “Gutenberg” editing interface in WordPress. In this example, we are selecting and inserting a new element between two headings.

Stacking vs. Embedding

Rich text gets structured in two common ways.

Stackable Elements: Some systems allow editors to “stack” elements to be delivered as an unbroken body of content. In stackable models, editors can add a “Header,” for example, then a “Paragraph,” then an “Image,” etc. Each element is its own content type, and they stack from top to bottom. When rendered, they appear to be a complicated, unbroken stretch of rich text.
Embeddable Elements: Other systems allow “embeds” inside a body of rich text. Editors can either (1) place small text strings (shortcode is a common name for this), which are replaced at render time, or (2) drag and drop elements into place in WYSIWYG tools that support it. The latter systems usually create a placeholder HTML construct that’s replaced during rendering.
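
To make the shortcode idea concrete, here is a minimal sketch of replacement at render time. The bracket syntax, the carousel shortcode, and the names expand_shortcodes and RENDERERS are all assumptions made for illustration; real systems define their own syntax and registration mechanisms.

```python
from __future__ import annotations

import html
import re


def render_carousel(attrs: dict[str, str]) -> str:
    """Hypothetical renderer: turns shortcode attributes into final HTML."""
    album = html.escape(attrs.get("album", ""), quote=True)
    return f'<div class="carousel" data-album="{album}"></div>'


# Hypothetical registry of shortcode names and the functions that render them.
RENDERERS = {"carousel": render_carousel}

# Matches [name key="value" ...]. This syntax is an assumption for the sketch.
SHORTCODE = re.compile(r'\[(\w+)((?:\s+\w+="[^"]*")*)\s*\]')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def expand_shortcodes(body: str) -> str:
    """Replace each known shortcode in the rich text with its rendered HTML."""

    def replace(match: re.Match) -> str:
        name, raw_attrs = match.group(1), match.group(2)
        renderer = RENDERERS.get(name)
        if renderer is None:
            return match.group(0)  # leave unknown shortcodes untouched
        return renderer(dict(ATTRIBUTE.findall(raw_attrs)))

    return SHORTCODE.sub(replace, body)


body = '<p>Intro.</p>\n[carousel album="picnic"]\n<p>More text.</p>'
print(expand_shortcodes(body))
```

Note that the stored rich text keeps the shortcode itself, not the carousel markup. The HTML only comes into existence when the body is rendered.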

The two models are subtly different. With stacked elements, the “body” is just a container for elements – any text present is in one of those elements, and there is no single rich text field. The rich text is formed solely by stacking elements on top of each other, so they’re not optional. If there were no elements, there would be nothing to show.

Using embedded elements, the rich text can exist without the use of any elements. If desired, it can also contain embedded elements between paragraphs (embedded elements are almost always block-level for some reason, though there’s nothing preventing them from being inline). Think of the rich text as water flowing around “islands” of embedded elements.

The difference between the two models of structuring rich text. The key is that with stacked elements, there is no “outer” rich text. The only text is contained inside the elements. With embedded elements, the rich text exists, but might be enhanced with elements.

Structured Elements vs. HTML Structures

Another key point is that these elements – whether they be stacked or embedded – all imply some separation of content and presentation. Many rich text editors will offer buttons and dialogs to form HTML elements, but those elements simply get embedded as their actual, final HTML manifestation. Once created, those HTML tags “forget” how they were ever created in the first place. They’re the same as if you wrote them manually.

With stacked or embedded elements, the rich text field is structured from data elements rendered into HTML in the delivery context at the time of request. They are stored as a data construct, and are templated as a small, embedded content object.

An embedded element, for example, inserts some type of “placeholder” into rich text. Often, this is a block-level element that can serialize data, such as a DIV with multiple data- attributes, or containing serialized JSON. In the delivery context, the rich text is parsed, this element is detected, and it is “expanded” into whatever it was intended to be.
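
The parse, detect, and expand sequence might look something like the sketch below, which uses Python’s standard html.parser module. The data-embed attribute name, the carousel template, and the EmbedExpander class are assumptions for illustration only. For brevity, this sketch drops HTML comments and passes self-closing tags through unchanged.

```python
from __future__ import annotations

import html
from html.parser import HTMLParser


def carousel_template(data: dict[str, str]) -> str:
    """Hypothetical template: renders the placeholder's data into final HTML."""
    album = html.escape(data.get("album", ""), quote=True)
    return f'<figure class="carousel" data-album="{album}"></figure>'


# Hypothetical templates keyed by the value of the data-embed attribute.
TEMPLATES = {"carousel": carousel_template}


class EmbedExpander(HTMLParser):
    """Re-emits rich text as-is, swapping placeholder DIVs for rendered output."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=False)
        self.out: list[str] = []
        self.skip_depth = 0  # greater than 0 while inside a placeholder DIV

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag == "div":
                self.skip_depth += 1  # track nested DIVs inside the placeholder
            return
        attr_map = {key: value or "" for key, value in attrs}
        kind = attr_map.get("data-embed")
        if tag == "div" and kind in TEMPLATES:
            # Collect the remaining data- attributes as the element's data.
            data = {key[len("data-"):]: value for key, value in attr_map.items()
                    if key.startswith("data-") and key != "data-embed"}
            self.out.append(TEMPLATES[kind](data))
            self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag == "div":
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def expand_embeds(rich_text: str) -> str:
    parser = EmbedExpander()
    parser.feed(rich_text)
    parser.close()
    return "".join(parser.out)


stored = ('<p>Intro &amp; more.</p>'
          '<div data-embed="carousel" data-album="picnic"></div>'
          '<p>End.</p>')
print(expand_embeds(stored))
```

The surrounding paragraphs pass through unchanged. Only the “island” is replaced, which is what distinguishes this model from one where the body is nothing but elements.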

Stacked elements are essentially content objects in their own right. There is likely a set of special types designed to stack as rich text – things like Text with Header, Image with Caption, Video Player – and the entire body of rich text is just a series of objects of these types stacked on top of one another. In the delivery context, these are rendered consecutively, resulting in a body of rich text.
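
As a rough sketch of the stacked model, the body can be treated as nothing more than an ordered list of typed objects, each of which knows how to render itself. The class names mirror the example types above, but the fields, the render method, and render_body are assumptions for illustration, not any system’s actual API.

```python
from __future__ import annotations

import html
from dataclasses import dataclass


@dataclass
class TextWithHeader:
    header: str
    text: str

    def render(self) -> str:
        return f"<h2>{html.escape(self.header)}</h2>\n<p>{html.escape(self.text)}</p>"


@dataclass
class ImageWithCaption:
    src: str
    caption: str

    def render(self) -> str:
        return (f'<figure><img src="{html.escape(self.src, quote=True)}" alt="">'
                f"<figcaption>{html.escape(self.caption)}</figcaption></figure>")


def render_body(stack: list) -> str:
    """Render each stacked element in order; the joined result is the body."""
    return "\n".join(element.render() for element in stack)


stack = [
    TextWithHeader("Company Picnic", "We ate & played."),
    ImageWithCaption("picnic.jpg", "The team at the park"),
]
print(render_body(stack))
```

There is no outer rich text field here at all. An empty stack renders to an empty string, because the elements are the only place text can live.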

Structuring rich text fields is becoming more common, and it can drastically increase the flexibility of your content model. More complicated constructs can be included in rich text, and presentational elements that previously would have been modeled as types fixed in place at specific template locations can now be placed much more flexibly.

The downside is that it adds some friction to the editorial process at best. At worst, it can get downright complicated and tedious. Additionally, it gives some editors vastly increased powers of composition, and that can cut both ways – increased levels of training and support are usually necessary.

Evaluation Questions

Is it possible to structure rich text through stacked or embedded elements? Is this the only option, or does this exist alongside traditional rich text (HTML-based) editing?
Are stacked or embedded elements managed content objects, or are they specific to rich text structuring?
Can stacked or embedded elements be used in multiple rich text attributes across multiple objects, or do they exist only in the editorial space of the content for which they were created?
If these elements can be used across multiple content objects, are they subject to dependency management or referential integrity checking?