Database / XML Hybrid Content Management

By Deane Barker 2 min read
AI Summary

This post explores the integration of databases and XML in content management systems, highlighting the advantages of a hybrid approach. The author discusses how this combination enhances flexibility, data organization, and publishing efficiency while addressing common challenges faced in traditional content management methods.

Thoughts on Content Management: This guy and I think alike. In the beginning of the article he touches on the same things I talked about when I compared open and closed content management systems.

Then, he runs into the same problem: there are too many types of content, each storing their own stuff. So…he comes up with the same solution I’ve tried.

Treat every piece of content as the same as every other, and store it all in a single table. Preposterous? Probably. But bear in mind that there will be a common set of metadata attributes that every piece of content will have (at least in this context): a unique name or identifier (the guid), a date it was created, a title, a description. And of course, there would have to be a “body” field for the content itself. Roll those into the table structure.

The theory is to have separate fields for the common metadata, then a field for the actual content, stored in XML. I’ve done a quick-and-dirty variation of this with a table that just had a bunch of columns: field1, field 2, field3, etc. with a table that tracked what each field stored for each type of content. Yes, it’s kludge, but it worked really well.

The XML solution is much cleaner. However, the problem is that the XML field is a black box that – on most database platforms – you can’t look inside. What if you want a list of articles written by a particular author? Well, you need to use SQL to get all the XML back, spin that collection, XPath into every single one to find the value author of the author node, then keep that record it matches.

In my mind, this won’t be a totally viable solution until databases that allow XPath queries on fields are commonplace. The ability to query like this would break this whole puzzle wide open.

SELECT * FROM articles WHERE XPath(XML, /article/author) = 'bob'

Right now, I think that Oracle and SQL Server (2000 and up) are the only database platforms that allow this. It’ll be a happy, happy day when it makes it in to MySQL.

In thinking about this, I’m reminded of my post from several months ago about managing XML files in aggregate. This is just another attempt to solve that problem: use a database and the ability to do SQL/XPath queries to manage all the individual XML documents. I guess I’m not sure if we’ve solved the problem yet.

And, now that I think back, Documentum Web Publisher does this same thing: stores all the non-meta content in an XML file. Even with that, it was tough to really slice and dice the content due to the limitations described above.

Links to this – Data Globbing with MySQL Regex July 13, 2004
As I become a more experienced developer, I’m learning when you should and shouldn’t break the rules. While following every rule of programming and data modeling is wonderful, sometimes you need to bend the rules for the sake of simplicity and expediency. Always remember, an app in the hand is...
Links from this – Open and Closed Content Management June 20, 2003
Different CMS allow you to define your content in different ways.
Links from this – Managing Multiple XML Documents May 26, 2003
One of the continuing quandaries I’ve had with XML is the management of multiple XML documents. If I have one, big XML document, then it’s easy to work with – to parse with an API, to transform with XSLT, to query with XPath. But what if I have many documents? For instance, what if I have all my...