QA Process Notes

I think we should consider the QA process from a “threat” perspective, from most coarse to most finely grained. What are the biggest, most obvious threats? Then we work down to the more granular threats.

In writing this, I realized there are some terms not easily defined –

  • Launch. I discuss this below, but it’s not always clear-cut.
  • Production. Anything exposed to humans outside the client organization’s work group (in the case of a public website), or editors (in the case of editor-facing CMS functionality).
  • Error, Public or Private. An error is any abnormal or unexpected condition. A private error is only displayed to people in the client organization’s work group – meaning the people we interact with at the client. A public error is displayed to people outside the client organization’s work group.
  • Error, Absolute or Partial. An absolute error is one in which the entire purpose of accessing the work product is obviously and completely non-functional – an uncaught exception on a public URL, for example. A partial error is one in which something is non-functional, but the surrounding aspects of the experience might otherwise be intact – an image carousel that does not rotate, for example, but doesn’t completely break the page and wouldn’t be immediately noticeable.
  • Error, Contained or Uncontained. An uncontained error is raw – the “yellow screen of death” on .NET, for example, or a raw 404 page from the web server. A contained error is a “Whoops, something went wrong” page, or a planned 404 page.
  • Work Product. This comprises any code or functionality we have provided to the client for incorporation in their production environment. In many cases, this will be a complete website. In other cases, this will be code meant to change the functionality or appearance of a running website.
  • Expected Load. Many of the threats below are affected by load. A site that does well in testing may break down under load. All threats should be evaluated under “expected load” for that particular situation, which means we need to (1) define some concept of “general load” based on our existing clients, and (2) perform at least a cursory load-test of the work product under this load.
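As a rough illustration of what a “cursory load-test” could look like, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name cursory_load_test, the fetch callable (anything that returns a truthy value on success), and the default numbers, which are placeholders and not a definition of “expected load.”

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cursory_load_test(fetch, urls, concurrent_users=10, rounds=5):
    """Call fetch(url) for every URL, `rounds` times, using a pool of
    `concurrent_users` threads. Returns (failures, slowest_seconds).

    `fetch` is a hypothetical callable supplied by the tester; it should
    return a truthy value when the request succeeded.
    """

    def timed(url):
        started = time.perf_counter()
        try:
            ok = bool(fetch(url))
        except Exception:
            # Any exception raised by fetch counts as a failed request.
            ok = False
        return url, ok, time.perf_counter() - started

    jobs = list(urls) * rounds
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        results = list(pool.map(timed, jobs))

    failures = [url for url, ok, _ in results if not ok]
    slowest = max((elapsed for _, _, elapsed in results), default=0.0)
    return failures, slowest
```

A tester would pass in a real fetch function and the project’s key URLs, then compare the failure list and slowest response time against whatever the team agrees “expected load” means for that client.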

So, the threats to our clients are…

(Note: some threats are “meta,” in that they would prevent a QA process from executing to catch errors in the first place. So they are threats, just not a threat that a QA process could catch, because a meta-threat prevents the QA process from existing or executing.)

Meta: No QA Process Exists

This is the biggest threat: that we don’t have a QA process at all. Clearly, we’re addressing this.

Meta: The QA Process Exists, But Is Not Executed

The next biggest problem is that we have a QA process, but it’s ignored for some reason. So, a launch happens without going through QA, or QA is poorly trained and doesn’t do what they’re supposed to do.

The trick here is setting up an effective “gateway” for launches – how do we make sure a process happens for every launch? Furthermore, how do we define “launch”? There’s a temptation to say, “Before something becomes public,” but that doesn’t always work, because we might release code to the client, and they launch it.

So, I think a more effective definition of “launch” is, “when code or functionality leaves Blend’s control.” A launch is (1) when work product becomes public, or (2) when we lose control of the ability to modify work product.

I think we need to establish for each project what “launch” constitutes and where QA fits in. Perhaps at the kick-off meeting? So, QA is present in that meeting and the team agrees (1) what constitutes a “launch” for this specific project, and (2) by extension, where and when QA will be executed for this project.

We’ll call this agreement, “The QA Plan.”

Meta: The Work Product Does Not Launch

This threat means that there was a breakdown in process which prevented work product which was otherwise acceptable from launching when expected.

This is unlike other QA failures in that something doesn’t break, it just never gets out the door at all, due to a breakdown in process. So, is this “QA” at all? I’m inclined to say no, because problems such as these would not be under QA’s control. A tester cannot control whether or not we have DNS information, for example.

I feel like this is the domain of the PMs and the developer responsible for launching the work product.

The Work Product Exposes a Security Hole

This means the work product exposes something like SQL Injection or Cross Site Scripting. I’m calling this the worst of the problems, because I’d rather a site throw an uncontained exception than open a security hole. The existence of an exploitable attack surface opens us up to liability, and can lead to much worse than a simple uncontained error.

The Work Product is Exhibiting an (1) Absolute, (2) Uncontained, (3) Public Error

Clearly, this is bad.

In the example of a new website, this is a situation where there is a raw, naked error being displayed to the public. By “error,” I mean there is a situation that is obviously abnormal to a reasonable human being. A 404, for example. Or a 500.

This is the most basic of QA issues. If we can’t catch these, we’re nowhere.

I feel like this could be handled with a simple web crawl. There are myriad services that will crawl all the URLs on a website and produce a list of status codes. We’d look for anything that’s a 404 or 500 or, really, anything outside of 2xx.
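The status-code check itself is simple enough to sketch. The Python below uses only the standard library and assumes we already have a list of URLs (discovering them by crawling is left out). The function names fetch_status and failing_urls are made up for illustration.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_status(url, timeout=10):
    """Return the HTTP status code for `url`, or None if there was no response.

    Note: urlopen follows redirects by default, so a 3xx will usually show
    up here as the status of the final destination.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; the code is on the error.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return None


def failing_urls(statuses):
    """Given {url: status}, keep only the entries outside the 2xx range."""
    return {
        url: status
        for url, status in statuses.items()
        if status is None or not 200 <= status < 300
    }
```

In use, a tester would build the mapping with something like {url: fetch_status(url) for url in urls} and treat any non-empty result from failing_urls as a list of things to investigate before launch.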

The Work Product is Exhibiting a Contained Public Error

This is as above, but the error has been contained with appropriate messaging. It’s not “raw.”

The Work Product is Exhibiting an (1) Absolute, (2) Private Error (Contained or Uncontained)

In this situation, there’s a problem, but it’s not apparent to the public. However, editors or other members of the client work group are aware of the error and it is impeding their work process.

The Work Product is Exhibiting a Partial Public or Partial Private Error, or a Contained Public or Contained Private Error

A lesser condition of the above.

This speaks to the fact that not all errors are created equal. A contained, partial, private error, for example, might be something small on the editorial interface that is technically an error, but isn’t a big deal and isn’t bothering anything. Errors like this might not ever be fixed, by choice.
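If testers are going to log errors using these three distinctions, it may help to record them as explicit fields. This is only a sketch in Python; the class and field names (ErrorReport, Audience, Extent, Containment) are invented here, and the only severity rule encoded is the one stated above, that an absolute, uncontained, public error is the case we most need to catch.

```python
from dataclasses import dataclass
from enum import Enum


class Audience(Enum):
    PUBLIC = "public"
    PRIVATE = "private"


class Extent(Enum):
    ABSOLUTE = "absolute"
    PARTIAL = "partial"


class Containment(Enum):
    UNCONTAINED = "uncontained"
    CONTAINED = "contained"


@dataclass(frozen=True)
class ErrorReport:
    """One observed error, classified along the three axes defined earlier."""

    summary: str
    audience: Audience
    extent: Extent
    containment: Containment

    def label(self):
        # Produces wording like "contained, partial, private error".
        return (
            f"{self.containment.value}, {self.extent.value}, "
            f"{self.audience.value} error"
        )

    def is_worst_case(self):
        # Absolute + uncontained + public: the most basic failure to catch.
        return (
            self.audience is Audience.PUBLIC
            and self.extent is Extent.ABSOLUTE
            and self.containment is Containment.UNCONTAINED
        )
```

How the remaining combinations rank against each other is a judgment call for the team, so the sketch deliberately does not assign them an order.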

The Work Product is Poorly Optimized for Human Consumption

This means that some work product is performing poorly for basic consumption. It has poor SEO characteristics – no TITLE tag for instance, or crappy markup in general. The most basic point of content is to be consumed. If we fail at providing something easily consumable, then we’ve failed at the most basic thing.

The Work Product is Poorly Optimized for Machine Consumption

This means that the work product is not well-suited to be crawled by search engine spiders, or by other processes that rely on markup to consume the content. This includes basic META tags, and use of structured content microformats.
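Checks for a missing TITLE tag or missing basic META tags are easy to automate. Here is a minimal sketch using Python’s standard html.parser module. The names HeadAudit and audit_markup are made up, and treating “description” as the one required META tag is an assumption; the real list would come from the QA Plan.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collects the TITLE text and the names of any META tags in a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta_names = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            name = dict(attrs).get("name")
            if name:
                self.meta_names.add(name.lower())

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def audit_markup(html, required_meta=("description",)):
    """Return a list of problems found in the page markup (empty if none)."""
    parser = HeadAudit()
    parser.feed(html)
    problems = []
    if not parser.title.strip():
        problems.append("missing or empty TITLE")
    for name in required_meta:
        if name not in parser.meta_names:
            problems.append(f"missing META {name}")
    return problems
```

This does not attempt to validate microformats or judge markup quality in general; it only flags the most basic omissions.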

The Work Product is Poorly Optimized for Usability Under Expected Usage Patterns

This means that some work product is poorly optimized for anything outside of basic consumption, using expected usage patterns. For instance, a form has poor TABINDEX values, or there are accessibility issues that prevent people from interacting with the page easily.

For editors, the editing interface might be poorly optimized – field labels or menu options are wrong, or there’s no help text. These are things that would confuse an editor and make it hard for them to complete their work process.

The Work Product Exhibits Errors Under Outlying Usage Patterns

This means that the work product is easily broken or confused by outlying usages – when someone is “trying to break stuff.”

The Work Product is Poorly Designed, Visually

This means that the work product just doesn’t reach a level of visual polish expected by the client, or by the public in general. I wouldn’t expect anything like this out of our design group, but if a developer “wings it” on something, I could see this happening.

This is pretty subjective. A tester might think, “this works fine, but it looks like crap.” That’s a valid observation. Since this would likely only happen when a designer wasn’t involved, perhaps we have a senior designer (Sam?) available to pass judgment on visual work product when requested. So, if a tester was suspicious of something, they could enlist Sam’s opinion on whether or not something does, in fact, look like crap.