How to Choose a Solution for Queuing Your Webhooks

The previous article detailed how webhooks are events wrapped in an HTTP request, and helped us understand why webhooks should be approached like regular events and handled using the asynchronous processing approach of the event-driven architecture.

In this article, we will discuss the features to look out for in an ideal webhook infrastructure solution that helps implement the event-driven approach to handling webhooks.

What to consider in a queuing solution for webhooks

Webhook infrastructure core features

Feature | Description
Scalability | The solution must be able to handle and adjust to varying webhook request loads. Performance should remain consistent regardless of the number of webhooks it needs to process.
Performance | The solution must function at an optimal level. For example, the response time for each webhook must not exceed the provider's timeout limit, and the solution should meet its response-time SLA at the 99th percentile. It must also make efficient use of available resources.
Fault tolerance | The solution must be able to continue receiving and processing webhooks despite the failure of one of its components. Redundancy and fail-over mechanisms should be built in.
Availability | The system should always be available to ingest webhooks. A 99.99% availability (0.36 seconds of allowable downtime per hour) is ideal.
Recoverability | While faults and failures remain inevitable, the solution should be able to revert to a stable state in the event of a malfunction.
Monitoring and Observability | The solution should be instrumented to report practical health, availability, and performance metrics. It should give a snapshot of the current state of the application and provide visibility into the operational patterns of the solution, and its components, over time.
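
To make the availability and performance targets above concrete, here is a minimal Python sketch. It shows how a 99.99% availability target translates into a downtime budget, and how you might check 99th-percentile response times against a timeout. The function names, the nearest-rank percentile method, and the sample numbers are all assumptions chosen for illustration; they do not come from any particular tool.

```python
def allowed_downtime_seconds(availability_pct: float, window_seconds: float) -> float:
    """Downtime budget for a given availability target over a time window."""
    return window_seconds * (100.0 - availability_pct) / 100.0


def percentile_nearest_rank(samples, pct: int):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # ceil(pct / 100 * n), computed with integers to avoid floating-point rounding
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def p99_within_timeout(response_times_ms, timeout_ms: float) -> bool:
    """True when the 99th-percentile response time stays within the timeout."""
    return percentile_nearest_rank(response_times_ms, 99) <= timeout_ms


if __name__ == "__main__":
    # 99.99% availability over one hour (3600 s) and one day (86400 s)
    print(round(allowed_downtime_seconds(99.99, 3600), 2))   # 0.36
    print(round(allowed_downtime_seconds(99.99, 86400), 2))  # 8.64

    # Hypothetical response times of 1..100 ms checked against a 150 ms timeout
    print(p99_within_timeout(list(range(1, 101)), timeout_ms=150))  # True
```

The same budget calculation works for any window, so you can compare a vendor's stated SLA against the downtime your integration can actually tolerate.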

Features that improve your experience

These are additional features you would be super happy to have. Each one helps improve your experience, business, cost considerations, onboarding, and more.

Feature | Description
Ease of use | The simplicity of the solution in relation to developer experience. This includes the ease of setting up and managing the solution while in operation.
Well-documented | Having enough information about how the solution works in an easily consumable format. Up-to-date with practical examples and guides on configuring it to achieve set goals.
Security | A solution that checks all the boxes in our security checklist to protect you against webhook vulnerabilities. It should also allow you to set up authentication for your webhooks.
Configurability | The end-users can easily change aspects of the software’s configuration (through usable interfaces).

All the features described above that improve your experience lead to one thing: quick deployments and minimal time spent on webhook management, which ultimately gives you the quickest time to value on your integrations.

Rebuilding the wheel and how it affects time to value

The features above might look like too much to ask for from a queuing solution, so let’s look at how teams have been solving this problem with existing solutions. It begins with picking an existing queuing solution, where there are two options: open-source or a managed service.

The open-source approach

Going in the open-source direction involves selecting an open-source queuing technology and running it yourself.

One significant advantage of open-source queuing technologies is the flexibility you get to configure them to suit your use case. However, deploying and scaling the solution rests solely on you.

Using managed queuing services

The second option is to use a managed queuing service.

The most significant advantage of managed queuing services is the abstraction of setup and deployment. They also have utilities for scaling your queues and provide customization options.

For a detailed breakdown of webhook infrastructure components and the choice between open-source and managed services, check out our article “Comparing Open Source and Cloud Services When Building a Webhook Management Infrastructure”.

Features required to build an existing queuing solution into a standard webhook infrastructure

While existing queuing solutions come with a suite of features out of the box, you need to ensure that the following components are available before they can function as a standard infrastructure for processing your webhooks. This standard is based on the core features we discussed earlier.

Component | Function
Alerting and Logging | For capturing webhook event data and alerting on actionable events.
Ingestion runtime | For converting webhook HTTP requests to queue messages, authenticating, and performing any necessary pre-processing. Custom applications or technologies like Cloudflare Workers, Lambda functions, etc., are used to achieve this.
Consumer runtime | For consuming and processing messages from the queue. Custom applications or technologies like Cloudflare Workers, Lambda functions, etc., are used to achieve this.
Monitoring | Metrics collection and log processing into structured data that is analyzed for practical insights. Includes a visualization dashboard for studying the webhook operational information. Tools like Grafana, ELK stack, Sentry, etc., have been used to achieve this.
Custom scripts | These are necessary for functions such as webhook retries, dead-letter recovery, rate limiting, and troubleshooting.
Developer tooling | Running tests in production environments is not a recommended practice. Developers need tools that they can use to interact with the queue and test potential deployments and fixes from a safe (development) environment. While teams have used tools like ngrok and Postman to achieve this, it is more effective to have a tool built specifically for working with webhooks in development environments.
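
To give a feel for how much glue code these components imply, here is a minimal Python sketch of an ingestion runtime, a consumer runtime, and the retry and dead-letter logic from the table above. Everything in it is an assumption made for illustration: the in-memory `queue.Queue` stands in for a real queue, the shared secret and HMAC-SHA256 signature scheme are hypothetical, and the limit of three attempts is arbitrary.

```python
import hashlib
import hmac
import json
import queue

SECRET = b"example-secret"  # hypothetical shared secret agreed with the provider
MAX_ATTEMPTS = 3            # arbitrary retry limit for this sketch

events = queue.Queue()      # in-memory stand-in for a real queue
dead_letters = []           # messages that exhausted their retries


def sign(body: bytes) -> str:
    """Hex HMAC-SHA256 signature of the raw request body."""
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()


def ingest(body: bytes, signature: str) -> int:
    """Ingestion runtime: authenticate, enqueue, and return an HTTP status quickly."""
    if not hmac.compare_digest(sign(body).encode(), signature.encode()):
        return 401  # signature mismatch: reject without queuing
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400  # malformed body
    events.put({"payload": payload, "attempts": 0})
    return 202      # accepted; processing happens asynchronously


def consume(handler) -> None:
    """Consumer runtime: drain the queue, retry failures, dead-letter after MAX_ATTEMPTS."""
    while not events.empty():
        message = events.get()
        try:
            handler(message["payload"])
        except Exception as exc:
            message["attempts"] += 1
            if message["attempts"] >= MAX_ATTEMPTS:
                dead_letters.append({**message, "error": str(exc)})
            else:
                events.put(message)  # requeue for another attempt


if __name__ == "__main__":
    body = json.dumps({"id": 1, "type": "order.created"}).encode()
    print(ingest(body, sign(body)))      # 202
    print(ingest(body, "bad-signature")) # 401
    consume(lambda payload: print("processed", payload))
```

Even this toy version leaves out rate limiting, backoff between retries, persistence, alerting, and monitoring, each of which you would still need to build or wire up around a generic queue.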

How existing solutions compare with the ideal solution

Rebuilding the wheel is challenging whether you use open-source technologies or managed services.

Feature | Open-source | Managed service
Scalability | You are responsible for deploying and scaling your application. | Scalability is based on offerings. Some offer manual scaling through the creation of replicas or increasing resources, while others provide auto-scaling services.
Performance | You are responsible for designing the architecture to give the best possible performance. Your architectural proficiency is proportional to the performance you get. | Performance is based on SLA for each service offering.
Fault tolerance | You’re responsible for designing fault-tolerant features like webhook retries, self-healing, failovers, etc. | It may or may not come with fault-tolerance features. Features may also be limited.
Availability | Your hosting capabilities and performance architecture determine this. | Based on SLA for service offering.
Recoverability | You’re responsible for keeping your application state consistent after failures. | You’re responsible for keeping your application state consistent after failures.
Monitoring and Observability | Limited monitoring and observability features. Generic metrics, not webhook-specific. | Limited monitoring and observability features. Generic metrics, not webhook-specific.
Time to value | Has the longest time to value. | Depends on the setup steps and requirements of the vendor.
Ease of use | Steep learning curve. | Easier than open-source.
Documentation | Depends on maintainers (mostly well-documented). | Well-documented for generic use cases (not webhook-specific).
Security | You’re responsible for designing the security features. | Aside from vendor authentication, you’re responsible for any webhook security-related features.
Configurability | Highly configurable. | Limited configuration.

This is primarily because these queues are not built with webhooks in mind. They are made for generic queuing use cases, which leaves the responsibility of aligning them to your use case to the developer.

However, if all these solutions can still be aligned to webhook processing by building out the required features discussed in the previous sections, imagine how amazing it would be to have a queuing system built just for webhooks.

Well, you don't have to imagine for long, as we introduce Hookdeck in the next section.

Introducing Hookdeck: A queuing solution with resiliency and observability built in

Hookdeck brings plug-and-play asynchronous processing to webhooks by abstracting all the work involved in creating a standard queuing solution for webhooks.

Hookdeck achieves this by looking at the problem from the developer's perspective. It doesn't just provide the core features for resiliency and observability; it also gives you all the "would be nice" features that yield a great developer experience and a quicker time to value.

When it comes to processing webhooks “the right way”, scaling, and maintaining excellent performance, Hookdeck is all you need. With just one solution, you get all the features of the ideal solution listed above.

Feature | Hookdeck
Scalability | Hookdeck takes care of scaling your integrations. You don’t need to worry about spikes or growing webhook volume.
Performance | Hookdeck ensures that none of your webhooks are dropped by making sure that the provider’s response time limit is not exceeded and performance is not degraded regardless of the volume of webhooks you’re receiving.
Fault tolerance | Hookdeck comes with manual and automatic retries built-in. You can also retry failed webhooks in bulk.
Availability | Our queues are adequately scaled horizontally and across multiple availability zones.
Recoverability | As long as we have received the webhook, you will always have the information captured and be able to return your system to a consistent state in the event of failure.
Monitoring and Observability | Hookdeck provides webhook-tailored observability that helps you see event traces, search webhooks, inspect webhooks and find any required information easily.
Time to value | Hookdeck’s webhook-inspired features enable you to have the shortest time-to-value possible.
Ease of use | Our setup is very easy to use, even for non-developers.
Documentation | Hookdeck is clearly and thoroughly documented, and the documentation is structured to help you find what you’re looking for quickly and easily.
Security | Hookdeck is built with security best practices in mind and also comes with authentication features that enable you to set up authentication between your webhook provider(s) and consumer(s).
Configurability | Hookdeck is highly configurable to suit the workflow of your webhooks.

Missing a webhook is now a thing of the past. With Hookdeck you get:

  • Unified workflow (even in development environments)
  • A centralized webhook hub to control all your webhooks in one place
  • Complete visibility over your webhooks
  • The ability to gracefully recover from all errors
  • …and lots more

All these advantages result in the fast deployment of webhook integrations.

You can stop sinking time into building a generic webhook infrastructure and focus instead on solving the most important problems your business deals with. The next article in the series goes deeper into how Hookdeck abstracts all the burdens of developing, deploying, and maintaining a webhook infrastructure.

If you want a detailed breakdown of how Hookdeck compares to other queuing solutions (open-source or managed), check out some of our comparison articles.

Conclusion

In this article, we went over the features to look out for when choosing a queuing solution for the resilience and observability of your webhooks, including some that would make your setup especially ideal. We also detailed the options available for building out this solution and the corresponding challenges.

Finally, we introduced Hookdeck, the queuing solution built for webhooks with developers in mind. In the following article, we dive deeper into the features that Hookdeck has put in place to ensure that webhooks are no longer a burden to ingest, scale, and manage.

Happy coding!