Managing Service and Content Quality Over Time

Managing a large digital estate over time requires a lot of platform, processes, and forethought.

By Deane Barker

I’ve spent time working with young people, specifically teenage boys. Being in that stage of life, they regularly regaled me with all the cool things they were going to buy when they got old and rich – as if those two things were inextricably bound together – cars and houses and electronics and gadgets, etc.

At about this same time, I had a very successful (adult) friend who actually bought a Lamborghini. Soon, he was complaining about how he had to have it shipped to another city to get oil changes, and had to have the wrap re-done regularly (they don’t actually paint cars anymore, I guess?). I remember asking him how much it cost to insure – I don’t remember the number, but it was absurd.

This gave me a handy object lesson for my young charges –

Expensive things aren’t expensive just once. They continue to be expensive over time. And while lots of people dream about getting these things, not enough people think about more boring topics like maintaining them.

I feel like there’s a corollary with large websites. They’re not just difficult to build and accumulate – they often continue to be difficult over time. If you don’t stay on top of them, they slowly (or quickly) fall apart.

I have about 6,000 URLs of content on the website you’re reading right now. That represents so many things that could go wrong. In addition to all those URLs, I also possess a weird anal-retentivity against errors that has driven me to do some hilarious things to try and ensure quality (…for a website that makes no money and that I tinker with incessantly; I have great business sense…🙄).

Let’s think for a second about all the things that need to happen for a website to provide the value you’re hoping for –

  1. The website has to be up – it has to provide some response to a request
  2. That response has to be valid – a 200 OK, not just an error page
  3. That response has to have valid HTML constructs that make sense when interpreted by a browser
  4. Functionality you have programmed – like search or taxonomy – has to work correctly
  5. The content has to be minimally valid – located in the right place visually, spelled right, organized correctly, etc.

And this all comes before any whiff of content strategy. All of this has to happen before someone can even decide if your content is any good.

Here are some things that I do. Keep in mind, this is a personal website, maintained by one person. What you might do for an enterprise property with more than one team member is likely different, but a lot of the themes are the same.

The first step down this road is simple Site Monitoring. There are services that do this – they make requests against your site every so often just to make sure it’s still up. At the most basic level, they check for a response. Any service worth its cost will also check that the status code is a 200 (because a 404 response is still a response…)

Going a bit further, you can configure a lot of them to detect specific HTML patterns, to make sure what’s coming back is what you expect and not just some “we’re down for maintenance” page.

Again, these exist as paid services, but they’re also wildly easy to script yourself.

Back in the early days of the internet, I used a simple VBScript for this. HTML parsing in that language was basically non-existent, so I just embedded an HTML comment as the last thing on the page. The script would make sure the HTML comment was in the response, to ensure that the page rendering didn’t error out before that point.
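That same idea translates easily to Python. Here’s a rough sketch – the URL and the sentinel comment are just placeholders you’d swap for your own site and template:

```python
import requests

URL = "https://example.com/"          # the page you want to watch (placeholder)
SENTINEL = "<!-- page-complete -->"   # hypothetical comment emitted as the last thing in the template

def check_site(url: str) -> list[str]:
    """Return a list of problems; an empty list means the page looks healthy."""
    try:
        response = requests.get(url, timeout=10)
    except requests.RequestException as ex:
        return [f"no response at all: {ex}"]

    problems = []

    # A 404 or 500 is still "a response" - make sure it's actually a 200
    if response.status_code != 200:
        problems.append(f"unexpected status code: {response.status_code}")

    # If the sentinel comment is missing, rendering probably died partway down the page
    if SENTINEL not in response.text:
        problems.append("sentinel comment not found; the page may not have rendered completely")

    return problems

if __name__ == "__main__":
    issues = check_site(URL)
    if issues:
        print("ALERT:", "; ".join(issues))   # wire this up to email/Slack/whatever you like
    else:
        print("OK")
```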

A long time ago, I used Nagios for this, but that system has long since been eclipsed by others. For a particular project, I pointed it at a hidden URL that did a lot of things in the background – connected to the database, tried to access specific files – any one of which would throw a nasty error if something went wrong. So one URL call effectively “ran” 20 or so tests.
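The hidden-URL trick is easy to sketch, too. Here’s a hypothetical health-check endpoint in Python/Flask – the database path and file list are stand-ins for whatever your site actually depends on:

```python
import sqlite3
from pathlib import Path

from flask import Flask  # any web framework works; Flask keeps the sketch short

app = Flask(__name__)

# Hypothetical dependencies for this particular site
DB_PATH = "site.db"
REQUIRED_FILES = ["templates/base.html", "static/site.css"]

@app.route("/__health")          # the "hidden" URL; obscure or protect it in real life
def health():
    # Each check raises on failure, which Flask turns into a 500 response -
    # exactly the kind of nasty error a monitoring service will notice
    conn = sqlite3.connect(f"file:{DB_PATH}?mode=ro", uri=True)  # fails if the database file is missing
    conn.execute("SELECT 1")     # database answers a trivial query
    conn.close()

    for path in REQUIRED_FILES:  # critical files exist on disk
        if not Path(path).is_file():
            raise FileNotFoundError(path)

    return "OK", 200             # a single 200 means every check passed
```

Point your monitor at that one URL, and you get a whole stack of checks for the price of a single request.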

I was given a free account at Visualping last year. I’ve been using that for a while and quite like it. It’s meant to detect and summarize site changes – which it does very well; its AI summaries of changes are eerily well-done – but it also doubles as a site monitoring tool.

So, that’s pretty simple. But, let’s go further. Say that we know the website is up and delivering a response. How do we know that response is what we want?

This is where you move into Automated Browser Testing. These are tools and services that will instantiate a “virtual browser” in memory, load your website, then interact with it as if they’re a person until they reach some state that constitutes a valid test.

For example, you could have your test bring up your search page, “type” something in the search box, “press” the submit button, and ensure some results come back. It could even comb through those results to make sure an item that you expect appears in the list.

By setting up that test, we’re “forcing” a lot of preceding functionality to be available. To get a specific list of results, our search engine has to have indexed the content properly…and our search UI has to be working…and our website has to be up…etc. We’re now effectively testing the end result of a long line of dominoes to make sure something is working.

(Do you need “standard” Site Monitoring if you’re using Automated Browser Testing? Probably not, but I haven’t thought about it too hard. There are probably some cases where both would make sense.)

A lot of these services are based on an open-source tool called Selenium. The one I’ve worked with is called Ghost Inspector.
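To make the search scenario above concrete, here’s roughly what it might look like in Selenium’s Python bindings – the URL, the “q” input name, the “.search-result” selector, and the expected title are all placeholders you’d swap for your own markup:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

SEARCH_URL = "https://example.com/search"   # placeholder
QUERY = "content management"
EXPECTED_TITLE = "Managing Service and Content Quality Over Time"

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")      # run without opening a visible window
driver = webdriver.Chrome(options=options)

try:
    driver.get(SEARCH_URL)

    # "Type" the query and "press" enter, just like a person would
    box = driver.find_element(By.NAME, "q")
    box.send_keys(QUERY)
    box.send_keys(Keys.RETURN)

    # Wait for results to appear, then comb through them
    results = WebDriverWait(driver, 10).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".search-result"))
    )
    assert results, "search returned no results"
    assert any(EXPECTED_TITLE in r.text for r in results), \
        "expected item missing from search results"
    print("search test passed")
finally:
    driver.quit()
```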

Years ago, I used Ghost Inspector while working on a large project over the course of about a year. It effectively ran about 100 spot checks on different parts of the developing website once an hour. If a change in Thing A caused an error in Thing Z, many levels removed, we got notified. It was basically holistic unit testing. The tests ran on a schedule, but you can run them on-demand as well, via an API call, so you could run the tests as part of a build process.
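The build-process angle is really just a POST request and an exit code. This is not Ghost Inspector’s actual API – the endpoint and response shape below are invented for illustration – but it shows the general idea of “run the suite, then pass or fail the build”:

```python
import sys

import requests

# Placeholder endpoint and key for a hypothetical testing service
SUITE_URL = "https://testing-service.example/api/suites/main/run"
API_KEY = "YOUR-API-KEY"

response = requests.post(SUITE_URL, params={"apiKey": API_KEY}, timeout=300)
response.raise_for_status()

result = response.json()
if not result.get("passed", False):
    print("browser tests failed; stopping the build")
    sys.exit(1)   # a non-zero exit code fails the CI step
print("browser tests passed")
```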

(We were open with this customer about the testing suite, and we offered to transfer our Ghost Inspector account to them so they could continue using the tests after launch. They didn’t take us up on it, and I thought they were abandoning a huge amount of hard-won value.)

So, now we know the website is up, and we know that all sorts of functionality is working. But, this is just technical – it’s service quality, not really content quality. How can we be sure the website is returning what we want it to?

This is where we move into Digital Quality Management (sometimes called “Digital Governance Management” – the term is currently a little slippery).

There’s an entire category of software products here. They consume your website (remember, we’re just assuming that everything is working – if it were actually broken, we would have caught it by now), and they verify and measure multiple quality factors – performance, accessibility, search-friendliness, and whatever else you care about.

These platforms will crawl your website regularly (or on-demand), check for compliance, then send you notifications and reports about issues. Some of them will plug into your CMS to provide pre-publish checking, but they tend to be CMS-agnostic, as content for large enterprise properties often comes from many different sources. They consume the finished product, regardless of what was used to generate it.

These platforms come pre-configured with lots of rules for the above, but what gets interesting is when you build out custom rules.
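To give a flavor of what a custom rule might look like, here’s a toy crawler check in Python. The specific rules (title length, a single h1, alt text) are just illustrations – yours would encode whatever your own standards are:

```python
import requests
from bs4 import BeautifulSoup

def check_page(url: str) -> list[str]:
    """Run a few illustrative custom rules against a single page."""
    issues = []
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")

    # Rule: every page needs a reasonably-sized <title>
    title = (soup.title.string or "").strip() if soup.title else ""
    if not title:
        issues.append("missing <title>")
    elif len(title) > 70:
        issues.append(f"<title> is {len(title)} characters; keep it under 70")

    # Rule: exactly one <h1> per page
    if len(soup.find_all("h1")) != 1:
        issues.append("page should have exactly one <h1>")

    # Rule: every image needs alt text
    for img in soup.find_all("img"):
        if not img.get("alt"):
            issues.append(f"image missing alt text: {img.get('src')}")

    return issues

for url in ["https://example.com/", "https://example.com/about/"]:
    for issue in check_page(url):
        print(f"{url}: {issue}")
```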

I have a homegrown DQM system running against my site (note: I do not recommend this – it’s an unholy mess of LINQpad scripts – but it just kind of grew over time). You can see the latest report here.

Here are a few things I check for –

I can allow exceptions to these rules. Some pages break a rule or two, and I can “pre-acknowledge” this to avoid it throwing errors. I’ve even drilled this down to the HTML element level – I can exempt single elements from particular rules if I need to.
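A minimal sketch of how exemptions like that could be modeled – just an illustration of the idea, with hypothetical URLs, rule names, and selectors – is a list of page/rule/selector combinations that the checker consults before raising an error:

```python
# Hypothetical exemption list: a page (and optionally a specific element within it)
# can be excused from a particular rule
EXEMPTIONS = [
    {"url": "/podcast/", "rule": "max-title-length", "selector": None},
    {"url": "/about/",   "rule": "alt-text-required", "selector": "img.decorative-flourish"},
]

def is_exempt(url: str, rule: str, selector: str | None = None) -> bool:
    """True if this rule has been pre-acknowledged for this page (or this element)."""
    for ex in EXEMPTIONS:
        if ex["url"] == url and ex["rule"] == rule:
            # A None selector exempts the whole page; otherwise only the matching element
            if ex["selector"] is None or ex["selector"] == selector:
                return True
    return False
```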

There are 30+ tests I run – you can see them at the bottom of the report. Some are specific to my situation, some are related to weird one-off problems that happened at some point (I should probably prune these…), and some are just good tests in general. Almost all DQM tools will let you write your own tests to enforce your own concepts of quality and governance.

So, now we’ve ensured the site is responding, that it’s sending back a valid response, and that the content it’s sending back is performant, accessible, search-friendly, and compliant with whatever arbitrary rules we want to put in place.

Congratulations. We’ve achieved a minimal level of quality.

This is getting long, but there are some other practices I’ll talk about in future installments:

So, in the absence of covering those topics today, let me leave you with these three “meta issues” you’re going to have to figure out when implementing a digital quality program in any form. None of these issues are technical, but they point to the larger issue with these programs –

The biggest threat to digital quality is that no one cares, or interest wanes over time, so all of your wonderful plans amount to nothing.

So, to prevent that –

So, there’s a lot here. I’ll just conclude by saying that this matters, because content matters. If you publish something on the Internet, please take care of it. Make sure it doesn’t suck, or start to suck over time.

The Internet can be a mess. Let’s all keep our ends of it clean.
