Fikayo Adepoju Oreoluwa

Hookdeck vs Kafka: Which Way to Go with Your Webhooks?

Published Mar 27, 2023 · Updated Mar 30, 2023

One of the queueing options for implementing asynchronous processing and event-driven architectures is Kafka. Kafka is a distributed, horizontally scalable, fault-tolerant queueing technology. Because of its performance, Kafka has become very popular in distributed, often complex architectures.

These strengths make Kafka a great choice when you need complete control of your distributed messaging, and you can manage its complexity; however, it may not be the best choice if you need a quick time to value on webhook integrations and management.

This article will compare Kafka with Hookdeck when it comes to adding asynchronous processing to webhooks.

First I will provide an overview of both solutions, and then guide you through how to choose the right solution for your use case. If you’d like a primer on processing webhooks asynchronously or just need to brush up on your knowledge, check out this article.

What is Kafka?

I’ve mentioned that Kafka is a highly distributed and scalable queueing technology with reliable fault-tolerant features. Kafka gives you a message bus with a throughput of up to millions of messages per second, and helps you redundantly handle large amounts of data.

In this section, we will look at the benefits of using Kafka for your webhooks and also point out some of Kafka’s attributes that you need to be aware of before considering it.

Why use Kafka

High throughput and low latency: Kafka is built to be highly performant. One major thing that Kafka does so well is maintain performance despite the growing volume of data. This ability keeps the response time low enough for webhooks not to exceed the timeout limit placed by providers.
Durability: Durability is at the core of Kafka’s operations. Kafka can partition topics and replicate them across multiple brokers. This redundancy helps Kafka recover your data when a broker goes down.
Scalability: Kafka’s design is distributed in nature, making it ideal for distributed environments. You can scale it horizontally: brokers are grouped into clusters, more broker nodes can be added to a cluster, and different clusters can run across multiple machines.
Integration with other systems: Kafka is a very lean queueing framework, made to solve one problem and solve it very well: queueing. This makes Kafka pluggable into any architecture, and it works with other tools such as storage systems, monitoring and visualization systems, processing frameworks, and other queueing systems.
Flexibility: Kafka can be used for a variety of use cases, including data ingestion, messaging, streaming data processing, and event-driven architectures. It also supports a wide range of programming languages.
Realtime data processing: Kafka enables real-time data processing by providing a distributed, fault-tolerant platform for collecting, processing, and storing streaming data.
Deterministic message ordering: Because Kafka at its core is an append-only commit log, you get message ordering within each partition out of the box. This is very useful for systems that are strict about the order in which messages are processed, which is a non-trivial problem in distributed systems.
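
Kafka guarantees order within a partition rather than across a whole topic, so the message key you choose decides which webhooks stay in order relative to each other. Here is a minimal sketch of that idea in plain Python, with no Kafka client involved. The keys (`order-1`, `order-2`), the partition count, and the helper names are all made up for illustration, and `partition_for` uses CRC32 as a stand-in for the hash Kafka's default partitioner actually uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count, for illustration only


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition. CRC32 is a stand-in, not Kafka's real hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append_all(events):
    """Append (key, payload) events to per-partition logs in arrival order."""
    logs = defaultdict(list)
    for key, payload in events:
        logs[partition_for(key)].append((key, payload))
    return logs


def history(logs, key):
    """Read back the payloads for one key from the partition that owns it."""
    return [payload for k, payload in logs[partition_for(key)] if k == key]


# Hypothetical webhook events, keyed by the resource they belong to.
events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
logs = append_all(events)
```

Because every event with the same key lands in the same partition, `history(logs, "order-1")` comes back in the order the events arrived, whichever partition the key hashes to. Events with different keys carry no such guarantee relative to each other.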

What you should be aware of

Steep learning curve: Kafka is non-trivial to grasp and set up. You need an expert in the technology to take full advantage of its features and performance benefits.
Dumb broker, smart consumer: Kafka pushes all the heavy lifting to its consumers and producers. Tasks like knowing which webhooks have been consumed, replaying a webhook (single or batch), etc., are handled by the consumer.
Its fault-tolerance is a trade-off with performance: Kafka achieves fault-tolerance through its ability to create and distribute replicas across multiple brokers. The more fault-tolerant your Kafka implementation is, the less performant it becomes due to the need for coordination and synchronisation of replicas for consistency and failure recovery.
You can’t modify or delete records: Because Kafka is an append-only commit log, you can’t modify or delete individual records in it. This is how Kafka maintains its message ordering.
You will need Kafka Streams for any form of pre-processing: Kafka stores messages in a standardized binary format unmodified throughout the whole flow (producer > broker > consumer). To perform any type of transformations, like modifying the payload of your webhooks, you will need to use Kafka Streams.
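
To make the "dumb broker, smart consumer" point concrete, here is a toy model in plain Python. It is not Kafka code: `AppendOnlyLog` and `Consumer` are hypothetical classes that only imitate the idea that the log never changes while each consumer tracks its own position (offset) and decides when to replay.

```python
class AppendOnlyLog:
    """A toy single-partition log: records are appended and read, never changed."""

    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        """Store a record and return the offset it was written at."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int):
        """Return (offset, record) pairs starting at the given offset."""
        return list(enumerate(self._records))[offset:]


class Consumer:
    """Tracks its own position in the log; the log knows nothing about it."""

    def __init__(self, log: AppendOnlyLog):
        self.log = log
        self.committed = 0  # next offset this consumer will read

    def poll(self, handler) -> None:
        """Process every unread record, advancing the offset after each one."""
        for offset, record in self.log.read_from(self.committed):
            handler(record)
            self.committed = offset + 1

    def replay(self, from_offset: int, handler) -> None:
        """Rewind to an earlier offset and process everything again from there."""
        self.committed = from_offset
        self.poll(handler)
```

Replaying a webhook here is nothing more than the consumer moving its own offset back. That bookkeeping, along with anything smarter such as skipping events that were already handled, is code you write and maintain on the consumer side.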

Kafka is a very powerful queueing system and, as we have described, it is capable of processing trillions of webhooks at optimal performance. However, it may be overkill for working with webhooks. To learn more about why we believe Kafka might be overkill for your use case with webhooks, check out this article.

What is Hookdeck?

Hookdeck is an infrastructure-as-a-service (IaaS) platform for processing webhooks. Hookdeck provides a message queue that asynchronously processes webhooks by ingesting webhook requests from your SaaS applications and distributing them to your callback endpoint based on the load your API can handle.

In this section, we will look at the benefits of using Hookdeck for your webhooks and also point out the attributes of Hookdeck that you need to be aware of before considering it.

Why use Hookdeck

Quick setup: Hookdeck can be set up to start handling webhooks reliably in a matter of minutes. The time to value on integrations is one of the fastest.
Uniform workflow for all your webhook operations: Hookdeck helps define a uniform workflow for webhooks from different SaaS applications. This removes the overhead of learning how each webhook provider operates.
Streamlined webhook management: All your webhook management functions are housed in a single dashboard. No need to jump across multiple dashboards in your stack to manage webhooks.
Webhook-tailored features: Hookdeck is built for webhooks; thus, it contains features like retry (manual and automatic), webhook delivery throttling, webhook payload transformations, and webhook trace monitoring. These features help you manage your webhooks and provide visibility into their lifecycles.
Reliable webhook infrastructure: Hookdeck replaces your entire webhook infrastructure. Its simplicity is not at the expense of the reliability standards required for processing your webhooks.
Developer experience: All the webhook management tasks in Hookdeck have been designed to require the least developer effort and time. This translates to doing more with less, ultimately saving time and energy spent on common tasks.
Ability to work with multiple sources: Hookdeck easily integrates with multiple SaaS webhook providers like Shopify, Stripe, and GitHub.

What to be aware of

Customizations: While Hookdeck integrates fully with new and existing infrastructure stacks and you can extend its functionality through the Hookdeck API, you cannot build new/custom functions into the dashboard at the moment.
Advanced monitoring: Hookdeck gives you top-to-bottom visibility into the activities of your webhooks and the data pipeline. However, if your monitoring needs are more advanced than what is currently available, you might need to pull logs from Hookdeck to set up more complex dashboards in a tool like Grafana.

Kafka for processing webhooks

Kafka Architecture

Now let’s look at the experience of using Apache Kafka for handling webhook ingestion and delivering webhooks.

Requirements

A decision on which message serialization format you’ll use (JSON is a top choice)
An understanding of the Kafka Binary Protocol
Knowledge of a programming language with a Kafka client (Java and Scala, which also get the higher-level Kafka Streams library, or a language with a client library such as Go, Python, C/C++, etc.)
Hosting for the Kafka cluster
Kafka libraries for the webhook producer (or gateway) and consumer

Setup process

Set up and host a Kafka cluster
Create Kafka topics for your webhooks
Define your partitions and replicas for your Kafka topics
Set up an API gateway to receive webhooks as HTTP requests and publish them to Kafka using a Kafka producer
Create Kafka clients to consume messages from Kafka
Optional: set up Kafka Streams for any form of processing required
Optional: set up Kafka Connect for interaction with external services like databases or APIs
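
The gateway step is where most of the custom code lives, so here is a minimal sketch of what it has to do: turn an incoming HTTP request into a topic, key, and value for a Kafka producer. Everything named here is an assumption made for illustration, including the `webhooks.<source>` topic naming, the envelope fields, and the `to_kafka_record` and `publish` helpers. The `send` argument stands in for your client's publish call (for example, `KafkaProducer.send` in the kafka-python library), so the sketch runs without a broker.

```python
import json
import time
from typing import Callable, Mapping, Optional


def to_kafka_record(
    source: str,
    headers: Mapping[str, str],
    body: bytes,
    received_at: Optional[float] = None,
) -> dict:
    """Wrap a raw webhook request in a JSON envelope ready to publish."""
    envelope = {
        "source": source,
        "received_at": time.time() if received_at is None else received_at,
        "headers": dict(headers),
        # Keep the body exactly as received so the consumer can still
        # verify the provider's signature against it later.
        "body": body.decode("utf-8"),
    }
    return {
        "topic": f"webhooks.{source}",  # hypothetical naming scheme
        "key": source.encode("utf-8"),
        "value": json.dumps(envelope).encode("utf-8"),
    }


def publish(record: dict, send: Callable[..., object]) -> None:
    """Hand the record to whatever producer client you use."""
    send(record["topic"], key=record["key"], value=record["value"])
```

In a real gateway, the HTTP handler would call `to_kafka_record`, publish the result, and return a 2xx response to the provider as soon as the producer accepts the message, leaving all processing to the consumers.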

Management and reporting

Kafka exposes metrics (over JMX) that can be visualized through a Kafka management UI and can also be collected by metric collection agents. These metrics include information on the number of messages produced and consumed, the number of bytes sent and received, the latency of the requests, and more.

Kafka also generates logs that can be used for troubleshooting issues and monitoring health. Most production monitoring setups involve collecting and visualizing Kafka metrics and logs with third-party tools like Prometheus, Grafana or Datadog.

Security

Webhooks require authentication to be securely accessed. Basic auth and signature verification are two very popular authentication strategies for webhooks.

Kafka does not help you implement these. Remember the dumb broker, smart consumer principle of Kafka? Yeah, webhook authentication responsibilities are deferred to the producers and consumers when working with Kafka.
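
Since that responsibility falls on your own code, here is a minimal sketch of signature verification using only Python's standard library. It assumes the provider signs the raw request body with HMAC-SHA256 and sends the result base64-encoded. Providers differ on the hash, the encoding, and the header that carries the signature, so check your provider's documentation. The function names are mine, not part of any library.

```python
import base64
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a base64-encoded HMAC-SHA256 signature of the raw body."""
    digest = hmac.new(secret, body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Check the signature sent by the provider against our own computation."""
    expected = sign(secret, body)
    # compare_digest takes constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(
        expected.encode("utf-8"), received_signature.encode("utf-8")
    )
```

The check has to run against the exact bytes the provider sent. If your gateway parses and re-serializes the JSON before publishing to Kafka, the signature will no longer match in the consumer.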

Maintenance

Configuring and running regular backups
Cluster health, performance and availability monitoring
Capacity planning iterations based on webhook volume
Cluster upgrades to stay up-to-date with releases and bug fixes (this may require downtime)
Kafka core security maintenance which includes managing certificates, configuring access control lists, and rotating keys and passwords, etc.
Performance tuning based on changing requirements and best practices
Troubleshooting and issue resolution

Hookdeck for processing webhooks

Hookdeck architecture

Requirements

An HTTPS endpoint to your backend

Setup process

Create a new connection
Name your connection (for me this was Shopify Store Hooks)
Enter destination label (for me this was My production API)
Enter destination URL (your backend HTTPS endpoint)
Deploy connection (click Create Connection)
Replace the endpoint in Shopify with the one generated by Hookdeck after the connection has been created

Unlike the steps listed for Kafka, I have included the sub-steps here and this is all there is to it. The entire process takes about 5 minutes tops, including testing out the setup.

Management and reporting

Hookdeck has a dashboard built for managing, tracking and analyzing webhook requests. Every single bit of information regarding your webhook request is captured and accessible to you. Hookdeck also adds metadata like request timestamps, the status of your requests, and how many times the request has been attempted. The dashboard helps you make sense of all captured information by visualizing your data in a comprehensible way.

You can also set up alerts to be notified when something important happens so that you can take action promptly.

Functions such as webhook retries (single or bulk), delivery throttling, transformations, and webhook authentication are also done through the dashboard.
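
For a sense of what you would otherwise build yourself, here is a minimal sketch of one common way to schedule automatic retries: exponential backoff with a cap. This is a generic technique, not a description of how Hookdeck schedules retries internally, and the function name and default values are assumptions for illustration.

```python
def backoff_delays(attempts: int, base: float = 1.0, factor: float = 2.0, cap: float = 60.0):
    """Return the wait (in seconds) before each retry, growing until it hits the cap."""
    delays = []
    delay = base
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= factor
    return delays
```

For example, `backoff_delays(5, base=1, factor=2, cap=10)` gives waits of 1, 2, 4, 8, and 10 seconds. A production version would also add random jitter and record how many attempts each webhook has used.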

Hookdeck Monitor

Security

Hookdeck helps you set up authentication with your webhook providers quickly and easily.

Out of the box, Hookdeck supports signature verification and other platform-specific functionality for Twitter, GitHub, Shopify, Stripe, and more. A full list of providers, along with configuration options, lives on the Source Integrations page.

You can also implement your own authentication for any platform that supports HMAC, basic auth, or API key authentication.

Maintenance

Being an IaaS, Hookdeck is fully managed by the company behind it. You don't need to worry about scaling servers, security patches, software updates, and so on. You also don't need expertise in message queues to run and maintain a fully functional one.

Verdict: Kafka or Hookdeck for your webhooks?

Now let’s zoom out and compare the two options we have been discussing so far based on the factors I have covered and more.

	Kafka	Hookdeck
Setup	Requires high proficiency in event-driven architectures and Kafka itself to set it up efficiently	Easy to set up (takes minutes)
Ease of use	Non-trivial	Abstracts all the complexities that come with managing and scaling the webhook infrastructure
Flexibility	Highly flexible, built to exist at the core of distributed architectures and integrate with other systems	Integrates seamlessly with webhook providers and server APIs for webhook consumption
Scalability	Built to be distributed and scalable horizontally	Highly scalable, fair usage limits exist
Performance	Scales up to millions of messages/second without degrading performance	Maintains performance levels with increasing load based on SLA
Customization and configurability	Highly customizable and configurable	Highly configurable, limited customization
Monitoring and logging	Generates logs and metrics, requires external tools to set up adequate monitoring	Generates logs and provides intuitive tools for tracing your webhooks from source to destination
Ingestion	Requires an intermediary component like an API gateway to function as a Kafka producer in order to ingest webhooks	Ingests webhooks seamlessly
Alerting	Requires you to set up alerting using third-party tools	Comes bundled with alerting and other notification tools
Recoverability	When consumers fail to consume a webhook, recoverability is deferred to the consumer	Can configure automatic retries and also manually retry webhooks one by one or in bulk
Time to value	The complexity of the technology and proficiency required slows down its time to value	Has one of the quickest times to value for webhook integrations and management
Documentation	Very well documented; however, it is easy to get overwhelmed as it can sometimes feel like a huge reference manual	Well documented with exhaustive guides to cover many use cases

The main takeaway is that Kafka is super robust, highly performant, and can handle large amounts (trillions) of data without taking a performance hit. It is also very flexible and integrable, built to exist at the core of distributed architectures.

However, all this power comes at the price of complexity and huge setup and maintenance costs. Hookdeck abstracts all these complexities and gives you a simple interface and features tailored to the webhook use case. This approach provides a quick time to value for integrations and webhook management. This design may limit extreme customizations, but the benefits far outweigh the costs.

💡 Hookdeck also uses Kafka for its performance and ability to handle large amounts of data, but we abstract the complexities so you don't have to worry about it.

Conclusion

In this article, we have compared the experience of implementing asynchronous processing for our webhooks using Kafka and Hookdeck.

One thing is clear: if your infrastructure demands require complete control of the setup, hosting platform, software installations, and heavy customizations, then you should invest in rolling your own message broker setup using Apache Kafka.

However, if you need to set up message queues for your webhooks quickly and efficiently, have full-fledged monitoring and alerting tools, search through webhook events and configure automatic retries for failed requests, and have built-in security tools, then Hookdeck is the right approach.

And best of all, you can start with a free Hookdeck account today.