Webhook Notes

In this document, I have endeavored to “dump” everything I know about webhooks.

This document has been compiled from my experience:

  1. working with webhook systems in content management
  2. proactively researching webhook systems for product evaluation and development
  3. building multiple webhook systems and related technologies. Examples:

What’s described in this document is a broad survey of webhook styles, architectures, and options. This is not presented as a list of requirements for any particular webhook system, but rather as a catalog of all possible options. To my knowledge, no observed system implements every option presented here.

Also, the usage of webhooks in this document is biased toward content management scenarios, simply because that is where the majority of my experience lies.

Finally, in some cases, architectural concepts have been defined. These definitions are for illustration, or represent commonly accepted norms, features, or conventions.

Introduction

A “webhook” is an informal term for an HTTP request made from an originating system in response to an internal event. The goal is to notify another system of the event and transmit data related to it.

A webhook effectively allows an originating system to notify a remote system of an event and provide it with data describing the event.

Remote procedure calls are not new (consider DCOM or gRPC as examples); however, webhooks differ in that:

  1. They use HTTP, over the “public” Internet (hence the “web” moniker)
  2. They are configured, not coded. Webhooks are specified by users of the originating system as ad hoc methods of connecting it to other systems, unknown to the developers of the originating system.

There has never been a formal specification or even agreement on specifics, simply an informal understanding of the architecture, purpose, and usage.

There was traditionally a website at webhooks.org which served as a wiki for informal webhook information; however, it appears to be offline. Here’s the last capture of the site from the Internet Archive:

Web Hooks / FrontPage

I was personally using webhooks as far back as 2003, though I called them “pinged scripts” at the time. I built a plugin for Movable Type, a popular Perl-based blogging platform. In the related blog posts, I discussed the logic and reasoning behind what I was doing.

Extending Movable Type Using a Pinged Script

From that blog post (emphasis added):

There are a few cases where I want to do interesting things with entries, but I don’t want to hack into Ben’s Perl code. I solved this problem by inserting just enough code to ping a specified URL whenever an entry is saved.

The accepted term “webhook” dates to 2007 (originally “web hook”). As near as I can tell, this is the first reference to them:

Web hooks to revolutionize the web :: Jeff Lindsay

Sample Use Cases

Common Webhook Events

In content scenarios, events are usually raised in response to actions involving a stored entity, meaning a content item, a file, a folder, or a user.

Possible Alternatives to Webhooks

Let’s first specify the simplest requirements as –

  1. We want to notify a remote system of events which occur in an originating system, and pass related data
  2. We want these notifications to be specified by the end users of the system, not the developers, while that system is running in a production environment.

These requirements necessarily preclude anything that involves the source code of the system – therefore nothing like traditional gRPC or SOAP or REST can be used. The system is required to be “ignorant” of any specific webhook, so they can be added by users in the production environment.

Given that limitation, the only real alternative becomes some type of API polling. This could take two forms:

  1. A scheduled API call which catalogs all entities and diffs that catalog against the last call. This would need to detect new entities (creations) and missing entities (deletions, which would not be in the results). Edits would need to be detected using a last edit time, or perhaps a hash of the actual content.
  2. A “sync” API call. This would need to be provided by the originating system, but it’s a change or audit log of everything that has happened in that system since a specified point in time.
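The first form of polling (cataloging and diffing) can be sketched roughly as follows. This is only an illustration: the catalog is assumed to be a simple mapping of entity ID to content hash, and the names `content_hash` and `diff_catalogs` are hypothetical, not taken from any particular system.

```python
import hashlib
from typing import Dict, List


def content_hash(body: str) -> str:
    """Hash the content so edits can be detected without a last-edit time."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def diff_catalogs(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two catalogs of {entity_id: content_hash} from consecutive polls."""
    created = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    edited = sorted(k for k in current if k in previous and current[k] != previous[k])
    return {"created": created, "deleted": deleted, "edited": edited}


# Catalog from the last poll, and the catalog from this poll (made-up data).
previous = {"a": content_hash("one"), "b": content_hash("two")}
current = {"a": content_hash("one (revised)"), "c": content_hash("three")}

changes = diff_catalogs(previous, current)
print(changes)
```

Note that an entity created and deleted between two polls never appears in either catalog, so a diff like this cannot see it.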

Unfortunately, polling presents several drawbacks.

  1. It’s inefficient. To ensure accuracy, you would effectively have to produce a catalog of every entity in the system, every time. If you could order the API results by last edit time, you might be able to cut the catalog down to the “last X items edited,” but you then run the risk that a surge of more than X edits within a single polling interval causes some changes to be missed.
  2. There is latency, given that this method operates on a schedule
  3. Unless there is a sync API specifically designed for this purpose (#2, above), polling can be inaccurate. Even assuming the logic to diff the derived catalogs is valid, entities could be created and deleted between polling intervals.
  4. It requires action from the receiving system to poll the API, or it requires a third system if setting up a general polling system, independent of the target system.

Architectural Definitions

Note that some of the definitions below can be construed to describe a specific architecture. However, the intention is merely to illustrate concepts, not dictate development.

(Also, these definitions are naturally interlinked, so there’s no easy way to order them to avoid the need for look-ahead. They are, very roughly, ordered by the timing of their appearance in, and impact on, the overall process.)

In the text following this list, defined terms will be capitalized.

Example Flow

[Diagram: example flow]

Webhook Factory Configuration

A Factory is a set of configuration parameters or “rules” that execute for every Event. As a result of these rules, the Factory may or may not generate a Webhook.
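As a rough sketch of this idea, a Factory can be modeled as a small rule object that is handed every Event and returns either a Webhook or nothing. The class and field names below (`Event`, `Webhook`, `Factory`, `event_names`, `entity_types`) are assumptions made for illustration, not a description of any specific product.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Event:
    name: str                      # e.g. "content.published" (hypothetical naming)
    entity_type: str               # e.g. "article"
    data: dict = field(default_factory=dict)


@dataclass
class Webhook:
    target_url: str
    method: str
    event: Event


@dataclass
class Factory:
    event_names: Set[str]          # which Events this Factory reacts to
    entity_types: Set[str]         # empty set means "any entity type"
    target_url: str
    method: str = "POST"
    enabled: bool = True

    def evaluate(self, event: Event) -> Optional[Webhook]:
        """Run the rules for one Event; may or may not generate a Webhook."""
        if not self.enabled:
            return None
        if event.name not in self.event_names:
            return None
        if self.entity_types and event.entity_type not in self.entity_types:
            return None
        return Webhook(self.target_url, self.method, event)


factory = Factory({"content.published"}, {"article"}, "https://example.com/hook")
print(factory.evaluate(Event("content.published", "article")))  # a Webhook
print(factory.evaluate(Event("content.deleted", "article")))    # None
```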

Trigger Configuration Options

The most common configuration options to trigger Webhook generation:

Target Configuration Options

If a Factory generates a Webhook, the following options are commonly offered to specify the Target:

Payload Configuration Options

In the event of a POST request, a Payload can be specified which forms the body of the request.
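Where a system allows the Payload to be templated from Event data, the mechanism might look something like the sketch below. It assumes a Payload template expressed as a flat dictionary of strings with `$name` placeholders; the function name `render_payload` and the field names are hypothetical.

```python
import json
from string import Template


def render_payload(template: dict, event_data: dict) -> str:
    """Fill each template value from the Event data, then serialize to JSON.

    Substituting before serializing lets json.dumps handle the escaping.
    All substituted values become strings.
    """
    rendered = {
        key: Template(value).safe_substitute(event_data)
        for key, value in template.items()
    }
    return json.dumps(rendered)


payload_template = {
    "summary": "$title was published by $author",
    "id": "$id",
}
event_data = {"title": 'My "First" Post', "author": "alice", "id": 42}

body = render_payload(payload_template, event_data)
print(body)
```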

Default / Automatic Headers

Most systems will add custom HTTP headers by default to transmit meta-information about the Factory or Webhook that generated the Attempt (ex: “X-Webhook-ID”):
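A minimal sketch of how such headers might be attached when an Attempt is built, using Python's standard `urllib.request`. Only “X-Webhook-ID” comes from the text above; the other header names, the function name, and the URL are assumptions for illustration. The request is constructed here but not sent.

```python
import json
import urllib.request
import uuid


def build_attempt_request(target_url: str, payload: dict, factory_id: str, event_name: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for one Attempt."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Webhook-ID": str(uuid.uuid4()),     # unique per Webhook
        "X-Webhook-Factory": factory_id,       # hypothetical header name
        "X-Webhook-Event": event_name,         # hypothetical header name
    }
    return urllib.request.Request(target_url, data=body, headers=headers, method="POST")


request = build_attempt_request(
    "https://example.com/hooks/content",
    {"id": "123", "title": "Hello"},
    factory_id="factory-1",
    event_name="content.published",
)
print(request.get_method(), request.full_url)
```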

Attempt/Resolution Configuration Options

If an Attempt receives a 2XX response, the Webhook resolves. In the event of any other response, retry options can usually be specified.
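The resolution rule above can be sketched as a small loop. Exponential backoff is used here purely as an assumed retry option; real systems vary, and the function and parameter names (`resolve_webhook`, `max_attempts`, `base_delay`) are hypothetical. The `send` callable stands in for the actual HTTP request and returns a status code.

```python
import time
from typing import Callable, List


def resolve_webhook(
    send: Callable[[], int],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Make Attempts until one gets a 2XX response or max_attempts is reached.

    Returns the status code of every Attempt, in order.
    """
    statuses: List[int] = []
    for attempt_number in range(1, max_attempts + 1):
        status = send()
        statuses.append(status)
        if 200 <= status < 300:
            break  # the Webhook resolves
        if attempt_number < max_attempts:
            # Assumed policy: wait 1x, 2x, 4x... the base delay between Attempts.
            sleep(base_delay * 2 ** (attempt_number - 1))
    return statuses


# Simulated Target that fails twice, then succeeds. Delays are recorded, not slept.
responses = iter([503, 500, 204])
delays: List[float] = []
statuses = resolve_webhook(lambda: next(responses), sleep=delays.append)
print(statuses, delays)
```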

Some systems will have notification/alert systems for Webhook failures (though, in a high-volume system which experiences extended downtime from a Target, individual notifications would quickly become unmanageable).

Some systems will also disable the Factory itself if too many Attempts fail from Webhooks generated by that Factory. This is to prevent wasted work against a crippled or unreachable Target System. (Clearly, this would necessitate some type of notification system to alert that the Factory had been disabled.)
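One way to picture this behavior is a simple failure counter per Factory. The sketch below assumes “too many” means a run of consecutive failed Attempts, which is only one possible interpretation; the class name and threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FactoryHealth:
    failure_threshold: int = 5       # assumed limit on consecutive failures
    consecutive_failures: int = 0
    enabled: bool = True

    def record_attempt(self, succeeded: bool) -> bool:
        """Record one Attempt result.

        Returns True only when this call caused the Factory to be disabled,
        which is the moment an administrator would need to be alerted.
        """
        if succeeded:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.enabled and self.consecutive_failures >= self.failure_threshold:
            self.enabled = False
            return True
        return False


health = FactoryHealth(failure_threshold=3)
results = [health.record_attempt(False) for _ in range(3)]
print(results, health.enabled)
```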

Webhook UI

The only “intentional user” of a webhook system is an administrator. For editorial users, there is no UI – the webhooks will generate and resolve in the background silently.

The webhook administrator needs a UI to accomplish the following:

The administration UI generally consists of these displays:

In some systems, creating or updating a Factory will generate a “ping” request to the Target (often using the OPTIONS method) to ensure it exists and can receive connections.

Directionality, Synchronicity, and Reactivity

Most Webhooks are intended to be asynchronous and one-way, meaning the Originating System sends an Attempt to a Target System, and does not (1) block any user-detectable process while waiting for the response, nor (2) use the response in any logical processing, other than simply logging it.

(Colloquially, webhooks are “fire and forget.”)
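A toy illustration of the “fire and forget” pattern, using an in-process queue and a background thread as stand-ins for the Queue and Processor. The function names and the event shape are assumptions, and `deliver` only records the event rather than making a real HTTP request.

```python
import queue
import threading

event_queue = queue.Queue()
delivered = []


def deliver(event: dict) -> None:
    """Stand-in for the HTTP Attempt; a real Processor would call the Target."""
    delivered.append(event["name"])


def processor() -> None:
    """Background worker: drains the queue independently of any user request."""
    while True:
        event = event_queue.get()
        try:
            if event is None:      # sentinel used to stop the worker
                break
            deliver(event)
        finally:
            event_queue.task_done()


def save_content(title: str) -> str:
    """User-facing operation: enqueue the Event and return immediately."""
    event_queue.put({"name": "content.saved", "title": title})
    return "saved"                 # does not wait for, or use, any response


worker = threading.Thread(target=processor, daemon=True)
worker.start()

result = save_content("Hello")
event_queue.put(None)
event_queue.join()                 # only so this demo can inspect the outcome
print(result, delivered)
```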

Two-way Webhooks, in which the Originating System operates on the response data received from the Target System, are theoretically possible, but rare, for several reasons:

A potential two-way use case might be data validation. Upon form submission, a Webhook might send the submitted data to a Target and use the response to prevent saving the data and instead display validation errors to the user. However, this stretches the definition of the term “webhook” and might be framed and documented as a “validation API” instead.

Webhooks are normally not guaranteed to be in any specific order, relative to the order of Events occurring in the Originating System. This would necessarily depend on how the Processor is designed to manage the Queue, but given the general vagaries of multi-threading and network transport, order consistency cannot be depended on.

Some systems allow “chaining” webhooks, meaning when Webhook A resolves, it can trigger Webhook B, etc. This is rare, and of limited utility in most cases. The chained Webhooks would usually be configured this way so that Webhook B can make use of data returned by Webhook A, thus making the webhook two-way.

However, as noted above, configuring this level of coupling (and the required edge and error handling) is beyond the scope and capabilities of most webhook users and would normally be accomplished by more programmatic methods. (ex: the Target for Webhook A is an API endpoint that then makes several HTTP requests from code.)

Webhook Security and Load Management

Any webhook necessarily involves two systems: Originating and Target. There are security considerations on both sides.

The Minimum Viable System

As noted earlier, this document is a broad survey of all features and architectures in observed webhook frameworks. As also noted, there is no formal definition of what constitutes a “webhook framework,” leaving the features subject to interpretation by an implementing system.

Based on my experience, here is the minimal feature set required to achieve a generally accepted definition of a “webhook framework.”

Though not required, a final recommendation would be that the Webhook Framework run in its own environment, retrieving Events from some type of communication bus shared with the Originating System (usually a formal queuing system, though a database table or even the file system might be used for low-volume scenarios). This architecture would insulate the Originating System from any performance issues, as its only concern would be generating the Event and sending it to the bus.
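For the low-volume case, a database table used as the shared bus might look roughly like this. The sketch uses an in-memory SQLite database so it can run on its own; in practice the Originating System and the Webhook Framework would connect to the same database from separate processes. The table and function names are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_bus ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT NOT NULL,"
    " data TEXT NOT NULL,"
    " processed INTEGER NOT NULL DEFAULT 0)"
)


def publish_event(name: str, data: dict) -> None:
    """Originating System side: its only job is to write the Event to the bus."""
    with conn:
        conn.execute(
            "INSERT INTO event_bus (name, data) VALUES (?, ?)",
            (name, json.dumps(data)),
        )


def claim_pending_events(limit: int = 10) -> list:
    """Webhook Framework side: fetch unprocessed Events and mark them claimed."""
    with conn:
        rows = conn.execute(
            "SELECT id, name, data FROM event_bus"
            " WHERE processed = 0 ORDER BY id LIMIT ?",
            (limit,),
        ).fetchall()
        conn.executemany(
            "UPDATE event_bus SET processed = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
    return [{"id": r[0], "name": r[1], "data": json.loads(r[2])} for r in rows]


publish_event("content.published", {"id": "123"})
publish_event("content.deleted", {"id": "456"})
first_batch = claim_pending_events()
second_batch = claim_pending_events()
print(len(first_batch), len(second_batch))
```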

Demarcation and Support

One universal principle of webhooks is simply that the Originating System and Webhook Framework are not responsible for what happens on the Target System. The only requirement is to make a valid Attempt and record the result, whatever it might be.

This, then, becomes a clear demarcation point for support issues: if the Webhook Framework can prove that the Attempt was made with the correct/expected data, and the response was recorded, then it bears no liability for how the Target System functions.

That said, here are four points where webhook users might run into issues, and where the Framework would need to prove correct functioning:

  1. The Originating System raised the event correctly.
  2. The Factories correctly interpreted the event and correctly generated a Webhook (or didn’t).
  3. An Attempt was made with the correct/expected data.
  4. The response was logged.
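One way to support this kind of proof is to log a single structured record per Attempt that covers each of the points above. The record shape below is an assumption for illustration only; the field names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AttemptRecord:
    event_name: str                   # the Event that was raised
    factory_id: str                   # which Factory evaluated it
    webhook_generated: bool           # the Factory's decision
    request_body: Optional[str]       # exactly what was sent, if anything
    response_status: Optional[int]    # what the Target answered, if anything
    recorded_at: str


def to_log_line(record: AttemptRecord) -> str:
    """Serialize the record as one JSON line for an audit log."""
    return json.dumps(asdict(record), sort_keys=True)


record = AttemptRecord(
    event_name="content.published",
    factory_id="factory-1",
    webhook_generated=True,
    request_body='{"id": "123"}',
    response_status=500,
    recorded_at=datetime.now(timezone.utc).isoformat(),
)
line = to_log_line(record)
print(line)
```

Even a failed response (a 500 here) is a valid result to record; the point is that the Attempt and its outcome can be shown after the fact.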

Conclusion: Managing Developer Tension

Webhooks are an extension feature. As such, they permit the users of the Originating System to extend it to do things not understood or foreseen by the originating developers.

This tends to create tension among the developers and architects of the Originating System, and it tends to move the system toward more of a “platform” model, rather than a simple “product.”

One of the key sources of tension with webhooks is to what extent they should contain programming logic themselves, or to what extent they should simply be treated as notifications to other systems. Put another way, are we expecting our users to become developers?

Within most webhook frameworks, there are two points of logic:

  1. Event evaluation by Factories, which sometimes includes the ability to provide code-based logic
  2. Payload specification, which sometimes includes the ability to template data received from the Event, rather than simply passing it through

With those two exceptions, webhook development should be a configuration task, not a code task. If Factories provide some type of domain-specific language (DSL) for trigger logic or payload specification, it should never be required, only offered as an option for more advanced requirements.

Another point of tension is whether webhooks should be expected to make “blind” calls to APIs, or whether they should just notify other systems, which then make the more complicated API calls.

For example, if you wanted to use a web API to create a Slack notification that new content has been published, you can approach it two ways:

  1. Directly: Configure a Factory to create a Webhook that generates the API call directly, with the correct headers, body, authentication, and querystring arguments for the Slack API
  2. Indirectly: Configure a Factory to simply notify some other logic unit (an AWS Lambda function, for example), which contains specific code to communicate with the Slack API

The former option is often not realistic, given the idiosyncrasies of web APIs. Trying to create a system whose configuration options could adapt to any web API would invite complication and frustration – it would open the system up to endless edge cases and random requirements.

In most cases, webhooks should be treated as simple notifications, and direct communication with other systems should be written in more fully-realized code environments. Webhooks should be developed with the expectation that connections to the “terminal system” are indirect, and proxied through another environment.
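To make the indirect approach concrete, here is a sketch of the kind of small translation function that might live in the proxy environment. It takes the generic body sent by the Webhook and reshapes it into the JSON body that a Slack incoming webhook accepts (an object with a `text` field). The incoming field names (`title`, `url`) and the function name are assumptions, and the actual HTTP call to Slack is left out.

```python
import json


def handle_notification(raw_body: str) -> dict:
    """Translate a generic webhook Payload into a Slack message body.

    The Originating System knows nothing about Slack; all Slack-specific
    knowledge lives here, in code, outside the webhook configuration.
    """
    event = json.loads(raw_body)
    title = event.get("title", "(untitled)")
    url = event.get("url", "")
    text = f"New content published: {title}"
    if url:
        text += f" {url}"
    return {"text": text}


incoming = '{"title": "Hello", "url": "https://example.com/hello"}'
slack_body = handle_notification(incoming)
print(json.dumps(slack_body))
```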

A last point of tension is the introduction of the Target System. For developers of the Originating System, there’s a tendency to view the Target System as a dependency that can’t be controlled. However, as noted earlier, this can be mitigated by several principles:

  1. Completely divorcing the webhook system from any synchronous processing in the Originating System. The only synchronous operation of the Originating System should be to deposit an Event in some communication bus shared with the Webhook Framework. Beyond that, the Originating System has no further connection with the process.
  2. Clearly demarcating and limiting the responsibility of the Webhook Framework to simply making a correctly formatted Attempt and recording the results. What happens in the Target System is of no consequence to the Framework, and this should be clearly communicated to users.
  3. Setting sensible, conservative maximums and defaults on Attempt logic. Timeouts should be limited, throttles should be specified, and maximum retries should be capped.
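The third principle could be enforced by clamping whatever an administrator configures to system-wide ceilings. The specific limits and names below are invented for illustration; appropriate values would depend on the system.

```python
from dataclasses import dataclass

# Hypothetical system-wide ceilings.
MAX_TIMEOUT_SECONDS = 10.0
MAX_RETRIES = 5
MAX_ATTEMPTS_PER_MINUTE = 60


@dataclass(frozen=True)
class AttemptPolicy:
    timeout_seconds: float = 5.0     # conservative defaults
    max_retries: int = 3
    attempts_per_minute: int = 30


def clamp(value, low, high):
    """Keep value within [low, high]."""
    return max(low, min(value, high))


def clamp_policy(requested: AttemptPolicy) -> AttemptPolicy:
    """Return a policy no more aggressive than the system allows."""
    return AttemptPolicy(
        timeout_seconds=clamp(requested.timeout_seconds, 0.1, MAX_TIMEOUT_SECONDS),
        max_retries=clamp(requested.max_retries, 0, MAX_RETRIES),
        attempts_per_minute=clamp(requested.attempts_per_minute, 1, MAX_ATTEMPTS_PER_MINUTE),
    )


effective = clamp_policy(AttemptPolicy(timeout_seconds=120.0, max_retries=50, attempts_per_minute=10000))
print(effective)
```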

To be clear, the existence of webhooks will change the nature and perception of the Originating System. Even if it has an existing API, it will now become a proactive generator of activity. Additionally, it introduces a pseudo-programming environment, which must be managed.

As with anything, clear communication of the intention and limitations of the system is key to ensuring it fulfills the expectations of the user.