How to Choose a Solution for Queuing Your Webhooks

The previous article detailed how webhooks are events wrapped in HTTP requests, and explained why you should stop processing webhooks synchronously and instead adopt the asynchronous processing approach of event-driven architecture.

In this article, we will discuss the features to look out for in an ideal webhook infrastructure solution that helps implement the event-driven approach to handling webhooks.

What to consider in a queuing solution for webhooks

Webhook infrastructure core features

  • Scalability: The solution must handle and adjust to varying webhook request loads. Performance should remain consistent regardless of the number of webhooks it needs to process.
  • Performance: The solution must function at an optimal level. For example, the response time for each webhook must not exceed the provider's timeout limit, and response-time SLAs should be met at the 99th percentile. It must also make efficient use of available resources.
  • Fault tolerance: The solution must continue receiving and processing webhooks despite the failure of one of its components. Redundancy and failover mechanisms should be built in.
  • Availability: The system should always be available to ingest webhooks. 99.999% availability (roughly 5.26 minutes of allowable downtime per year) is ideal.
  • Recoverability: While faults and failures remain inevitable, the solution should be able to revert to a stable state after a malfunction.
  • Monitoring and observability: The solution should be instrumented to report practical health, availability, and performance metrics. It should give a snapshot of the current state of the application and provide visibility into the operational patterns of the solution and its components over time.

Features that improve your experience

These are additional features you would be super happy to have. Each one helps improve your experience, business, cost considerations, onboarding, and more.

  • Ease of use: The simplicity of the solution in relation to developer experience, including how easy it is to set up and to manage while in operation.
  • Well-documented: Enough information about how the solution works, in an easily consumable format, kept up to date with practical examples and guides on configuring it to achieve your goals.
  • Security: A solution that checks all the boxes in our security checklist to protect you against webhook vulnerabilities. It should also allow you to set up authentication for your webhooks.
  • Configurability: End users can easily change aspects of the software's configuration through usable interfaces.
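
To make the authentication point concrete: many providers sign each webhook payload with an HMAC digest that the receiver recomputes and compares. The sketch below is a generic illustration, not any specific provider's scheme; the secret value, payload, and hex encoding are assumptions, since each provider defines its own signing format.

```python
import hashlib
import hmac

def verify_signature(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw payload and compare it,
    in constant time, to the signature sent by the provider."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information to an attacker
    return hmac.compare_digest(expected, signature_hex)

# Illustrative example: the provider signs the body with a shared secret
secret = b"whsec_example"
body = b'{"event":"invoice.paid","id":"evt_123"}'
good_sig = hmac.new(secret, body, hashlib.sha256).hexdigest()

assert verify_signature(secret, body, good_sig)
assert not verify_signature(secret, b'{"tampered":true}', good_sig)
```

Note that verification must run against the raw request body, not a re-serialized copy, since any whitespace or key-ordering change alters the digest.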

All the features described above that improve your experience lead to one thing: quick deployments and minimal time spent on webhook management, ultimately resulting in the quickest time to value on your integrations.

Reinventing the wheel and how it affects time to value

The features above might look like a lot to ask of a queuing solution, so let's look at how teams have been solving this problem with existing tools. It begins with picking an existing queuing solution, and there are two options: open-source or managed service.

The open-source approach

Going in the open-source direction involves selecting a queuing technology such as Apache Kafka or RabbitMQ.

One significant advantage of open-source queuing technologies is the flexibility they give you to configure the setup to suit your use case. However, deploying and scaling the solution rests solely on you.

Using managed queuing services

The second option is to use a managed queuing service, such as Amazon SQS, Google Cloud Pub/Sub, or Amazon EventBridge.

The most significant advantage of managed queuing services is the abstraction of setup and deployment. They also have utilities for scaling your queues and provide customization options.

For a detailed breakdown of webhook infrastructure components and the choice between open-source and managed services, check out our article “Comparing Open Source and Cloud Services When Building a Webhook Management Infrastructure”.

Features required to build an existing queuing solution into a standard webhook infrastructure

While existing queuing solutions come with a suite of features out of the box, you need to ensure that the following components are available before they can function as a standard infrastructure for processing your webhooks. This standard is based on the core features we discussed earlier.

  • Alerting and logging: Captures webhook event data and alerts on actionable events.
  • Ingestion runtime: Converts webhook HTTP requests into queue messages, authenticates them, and performs any necessary pre-processing. Custom applications or technologies like Cloudflare Workers, Lambda functions, etc., are used to achieve this.
  • Consumer runtime: Consumes and processes messages from the queue. Custom applications or technologies like Cloudflare Workers, Lambda functions, etc., are used to achieve this.
  • Monitoring: Collects metrics and processes logs into structured data that can be analyzed for practical insights, including a visualization dashboard for studying webhook operational information. Tools like Grafana, the ELK stack, Sentry, etc., are used to achieve this.
  • Custom scripts: Necessary for functions such as webhook retries, dead-letter recovery, rate limiting, and troubleshooting.
  • Developer tooling: Running tests in production environments is not a recommended practice. Developers need tools they can use to interact with the queue and to test potential deployments and fixes from a safe (development) environment. Teams have used tools like ngrok and Postman for this, but it is more effective to have a tool built specifically for working with webhooks in development environments.
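
To make the ingestion-runtime component concrete, here is a minimal sketch of the conversion step: an incoming webhook (headers plus raw body) is normalized into a queue message with an id, timestamp, and source before being enqueued, so the HTTP handler can acknowledge immediately. The message shape and helper names are illustrative assumptions, not a prescribed format.

```python
import json
import queue
import time
import uuid

# In-process stand-in for a durable queue (Kafka, SQS, etc.)
webhook_queue: "queue.Queue[dict]" = queue.Queue()

def ingest(source: str, headers: dict, raw_body: bytes) -> str:
    """Convert an incoming webhook HTTP request into a queue message.
    Returns the message id so the HTTP handler can respond right away."""
    message = {
        "id": str(uuid.uuid4()),      # illustrative message shape
        "source": source,
        "received_at": time.time(),
        "headers": {k.lower(): v for k, v in headers.items()},
        "payload": json.loads(raw_body),
    }
    webhook_queue.put(message)
    return message["id"]

msg_id = ingest("stripe", {"Content-Type": "application/json"},
                b'{"event": "invoice.paid"}')
queued = webhook_queue.get()
assert queued["id"] == msg_id
assert queued["payload"]["event"] == "invoice.paid"
```

A consumer runtime is the mirror image: it pulls messages off the queue and runs the business logic, retrying or dead-lettering on failure.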

How existing solutions compare with the ideal solution

Reinventing the wheel is challenging whether you use open-source technologies or managed services.

  • Scalability. Open-source: you are responsible for deploying and scaling your application. Managed service: depends on the offering; some provide manual scaling through replicas or increased resources, while others offer auto-scaling.
  • Performance. Open-source: you are responsible for designing the architecture for the best possible performance; the performance you get is proportional to your architectural proficiency. Managed service: based on the SLA for each service offering.
  • Fault tolerance. Open-source: you are responsible for designing fault-tolerance features like webhook retries, self-healing, failovers, etc. Managed service: may or may not include fault-tolerance features, and those included may be limited.
  • Availability. Open-source: determined by your hosting capabilities and performance architecture. Managed service: based on the SLA for the service offering.
  • Recoverability. In both cases, you are responsible for keeping your application state consistent after failures.
  • Monitoring and observability. Both offer limited monitoring and observability features: generic metrics, not webhook-specific ones.
  • Time to value. Open-source: the longest time to value. Managed service: depends on the vendor's setup steps and requirements.
  • Ease of use. Open-source: steep learning curve. Managed service: easier than open-source.
  • Documentation. Open-source: depends on the maintainers (mostly well documented). Managed service: well documented for generic use cases, not webhook-specific ones.
  • Security. Open-source: you are responsible for designing the security features. Managed service: aside from vendor authentication, you are responsible for any webhook security-related features.
  • Configurability. Open-source: highly configurable. Managed service: limited configuration.

This is primarily because these queues are not built with webhooks in mind. They are made for generic queuing use cases, which leaves the responsibility of aligning them to your use case to the developer.

However, if all these solutions can still be aligned to webhook processing by building out the required features discussed in the previous sections, imagine how amazing it would be to have a queuing system built just for webhooks.

Well, you don't have to imagine for long, as we introduce Hookdeck in the next section.

Introducing Hookdeck: A queuing solution with resiliency and observability built in

Hookdeck brings plug-and-play asynchronous processing to webhooks by abstracting all the work involved in creating a standard queuing solution for webhooks.

Hookdeck achieves this by looking at the problem from the developer's perspective. It doesn't just provide the core features for resiliency and observability; you also get all the "would be nice" features that yield a great developer experience and a quicker time to value.

When it comes to processing webhooks "the right way", scaling, and maintaining excellent performance, Hookdeck is all you need. With just one solution, you get all the features of the ideal solution listed above.

  • Scalability: Hookdeck takes care of scaling your integrations, so you don't need to worry about spikes or growing webhook volume.
  • Performance: Hookdeck ensures that none of your webhooks are dropped by keeping responses within the provider's time limit, with no degradation regardless of the volume of webhooks you're receiving.
  • Fault tolerance: Hookdeck comes with manual and automatic retries built in. You can also retry failed webhooks in bulk.
  • Availability: Our queues are scaled horizontally and across multiple availability zones.
  • Recoverability: As long as we have received the webhook, you will always have the information captured and be able to return your system to a consistent state after a failure.
  • Monitoring and observability: Hookdeck provides webhook-tailored observability that lets you see event traces, search webhooks, inspect payloads, and find any required information easily.
  • Time to value: Hookdeck's webhook-focused features give you the shortest possible time to value.
  • Ease of use: Setup is straightforward, even for non-developers.
  • Documentation: Hookdeck is clearly and thoroughly documented, and the documentation is structured to help you find what you're looking for quickly and easily.
  • Security: Hookdeck is built with security best practices in mind and includes authentication features that let you set up authentication between your webhook provider(s) and consumer(s).
  • Configurability: Hookdeck is highly configurable to suit the workflow of your webhooks.

Missing a webhook is now a thing of the past. With Hookdeck you get:

  • Unified workflow (even in development environments)
  • A centralized webhook hub to control all your webhooks in one place
  • Complete visibility over your webhooks
  • The ability to gracefully recover from all errors
  • …and lots more

All these advantages result in the fast deployment of webhook integrations.

You can stop sinking time into building a generic webhook infrastructure and focus instead on solving the most important problems your business deals with. The next article in the series goes deeper into how Hookdeck abstracts all the burdens of developing, deploying, and maintaining a webhook infrastructure.

If you want a detailed breakdown of how Hookdeck compares to other queuing solutions (open-source or managed), check out some of our comparison articles.

Build vs. buy: DIY queues vs. managed webhook infrastructure

The comparison tables above show that both open-source queues and managed queue services require significant additional work to become webhook-ready. Here's a more direct comparison of the three main approaches:

DIY with open-source queue (Kafka, RabbitMQ, etc.)

  • Full control over every aspect of the system
  • Requires building: ingestion layer, retry logic, DLQ handling, observability, alerting, rate limiting, developer tooling
  • Ongoing maintenance: upgrades, scaling, security patches, on-call
  • Best for: teams with dedicated platform engineers and genuinely unique requirements

DIY with managed queue service (SQS, Pub/Sub, EventBridge)

  • Infrastructure management offloaded to cloud provider
  • Still requires building: webhook-specific retry logic, signature verification, observability dashboards, DLQ workflows, developer tooling
  • Easier to scale than open-source, but still not webhook-aware
  • Best for: teams already invested in a cloud ecosystem who want some abstraction

Purpose-built webhook infrastructure (Hookdeck)

  • All webhook-specific features built in: retries, DLQ, observability, rate limiting, signature verification, event replay, developer CLI
  • Minutes to deploy vs. weeks/months for DIY
  • Trade-off: less control over internals, dependency on a third-party service
  • Best for: teams where webhook infrastructure is critical but not a core competency

For most teams, the honest assessment is this: building webhook infrastructure is a quarter's worth of engineering work, and maintaining it is a multi-year commitment. If your team's competitive advantage comes from your product rather than your webhook plumbing, the managed path usually makes more sense. For a deeper analysis, see our guide to building or buying your webhook infrastructure.

What to evaluate: a decision checklist

When evaluating any queuing solution for webhooks, score each option against these criteria:

Throughput requirements

  • [ ] Can it handle your current webhook volume?
  • [ ] Can it scale to 10x or 100x without re-architecting?
  • [ ] Does it handle traffic bursts gracefully (e.g., Black Friday, billing runs)?

Retry sophistication

  • [ ] Does it support automatic retries with configurable backoff (linear and exponential)?
  • [ ] Can you set different retry policies per webhook source or event type?
  • [ ] What happens when retries are exhausted — is there a dead-letter queue?
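
The backoff and dead-lettering items above can be sketched in a few lines. The policy below (exponential backoff with a cap, then a dead-letter list once attempts are exhausted) is one common approach rather than the only reasonable one, and the parameter values are assumptions.

```python
def backoff_schedule(base: float = 1.0, factor: float = 2.0,
                     cap: float = 60.0, max_attempts: int = 6) -> list:
    """Delays (in seconds) before each retry: exponential growth, capped."""
    return [min(base * factor ** n, cap) for n in range(max_attempts)]

def deliver_with_retries(deliver, event, dead_letter: list) -> bool:
    """Try each retry slot; on exhaustion, move the event to a dead-letter list."""
    for delay in backoff_schedule():
        if deliver(event):
            return True
        # a real system would sleep(delay) or schedule a timer here
    dead_letter.append(event)
    return False

assert backoff_schedule() == [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]

dlq: list = []
always_fails = lambda e: False
assert deliver_with_retries(always_fails, {"id": "evt_1"}, dlq) is False
assert dlq == [{"id": "evt_1"}]
```

Production systems usually add random jitter to each delay so that many failing deliveries don't retry in lockstep.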

Observability

  • [ ] Can you see delivery success rates, retry rates, and latency in real time?
  • [ ] Can you search and filter events by payload, headers, or status?
  • [ ] Does it integrate with your existing monitoring stack (Datadog, Prometheus, New Relic)?

Team size and expertise

  • [ ] Does your team have the bandwidth to build and maintain this long-term?
  • [ ] Is there someone on-call for webhook infrastructure issues?
  • [ ] Can new team members onboard to the system quickly?

Maintenance burden

  • [ ] Who handles upgrades, security patches, and scaling decisions?
  • [ ] What's the operational cost beyond the subscription/hosting fee?
  • [ ] How much engineering time is diverted from product work?

Developer experience

  • [ ] Is there a local development tool for testing webhooks before production?
  • [ ] Can developers inspect, replay, and debug events easily?
  • [ ] Is the setup process measured in minutes or weeks?

For most teams, the option that scores highest on observability, maintenance burden, and developer experience, while still meeting throughput and retry requirements, is the right choice.

Conclusion

In this article, we went over the features to look for when choosing a queuing solution for webhook resilience and observability, including some that would make our setup especially ideal. We also detailed the options available for building out this solution and the challenges each one brings.

Finally, we introduced Hookdeck, the queuing solution built for webhooks with developers in mind. In the following article, we dive deeper into the features that Hookdeck has put in place to ensure that webhooks are no longer a burden to ingest, scale, and manage.

Happy coding!

FAQs

Why do webhooks need a queue?

Without a queue, webhooks hit your application server directly. During traffic spikes, your server can be overwhelmed: requests time out, the provider retries, and you're in a cascade of failures. A queue decouples ingestion from processing, absorbing bursts and delivering events at a rate your application can handle.
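
The decoupling described above can be sketched with a standard-library queue: the HTTP-facing side enqueues instantly and acknowledges, while a worker drains at its own pace. This is a single-process toy under the obvious assumption that a production setup would use a durable broker instead of an in-memory queue.

```python
import queue

events: "queue.Queue[dict]" = queue.Queue()

def handle_webhook(payload: dict) -> int:
    """HTTP-facing side: enqueue and acknowledge immediately (202 Accepted)."""
    events.put(payload)
    return 202

def drain(batch_size: int) -> list:
    """Worker side: pull events in batches at a rate the application can handle."""
    processed = []
    while not events.empty() and len(processed) < batch_size:
        processed.append(events.get())
    return processed

# A burst of 100 webhooks is absorbed instantly...
statuses = [handle_webhook({"n": i}) for i in range(100)]
assert all(s == 202 for s in statuses)

# ...and the worker processes them in controlled batches.
first_batch = drain(batch_size=25)
assert len(first_batch) == 25
assert events.qsize() == 75
```

The provider only ever sees the fast enqueue path, so a slow or briefly failing processor never causes provider-side timeouts.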

Should I build my own webhook queue or use a managed service?

Build your own if you have unique requirements, a dedicated platform team, and the bandwidth to maintain it long-term. Use a managed service like Hookdeck if webhook infrastructure isn't your core competency and you'd rather spend engineering time on product. Most teams underestimate the ongoing maintenance cost of DIY webhook infrastructure.

What is the difference between a message queue and a webhook gateway?

A message queue (like RabbitMQ or SQS) provides generic queueing — you still need to build ingestion, retry logic, observability, and dead-letter handling yourself. A webhook gateway like Hookdeck provides all of these as a single integrated solution purpose-built for webhook workloads, with features like signature verification, event filtering, and delivery rate limiting.

How does queuing improve webhook reliability?

Queuing improves reliability by decoupling webhook ingestion from processing. Events are durably stored in the queue, so they survive server restarts and deployments. The queue enables automatic retries, rate limiting, and ordered processing — transforming brittle HTTP requests into reliable, recoverable event delivery.

What happens to webhooks when my queue is full?

Behavior depends on the queue. Generic queues may start rejecting messages, causing the provider to see errors and retry — potentially creating a backlog. Purpose-built webhook infrastructure like Hookdeck scales automatically and uses backpressure mechanisms to manage delivery rate without rejecting events.
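
As an illustration of the backpressure idea, a delivery-side rate limiter can smooth how fast queued events reach your endpoint. The token bucket below is a generic sketch with an injectable clock for testing; it is not a description of how any particular service implements this internally.

```python
class TokenBucket:
    """Allow at most `rate` deliveries per second, with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: float, now=lambda: 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.now = now          # injectable clock, so tests are deterministic
        self.last = now()

    def allow(self) -> bool:
        t = self.now()
        # refill tokens in proportion to elapsed time, up to capacity
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False            # caller should hold the event in the queue

clock = {"t": 0.0}
bucket = TokenBucket(rate=2.0, capacity=2.0, now=lambda: clock["t"])

# A burst of 2 is allowed immediately; the third event must wait in the queue
assert bucket.allow() and bucket.allow()
assert not bucket.allow()

clock["t"] = 0.5   # half a second later, one token has refilled
assert bucket.allow()
```

Events that are not allowed through stay safely in the queue rather than being rejected back to the provider, which is the essence of backpressure.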