Guide to GitHub Webhooks Features and Best Practices
Now that we understand GitHub webhooks and have worked with them in our development environment, it's time to look at what happens in production environments.
In production, we do not have the same control over the conditions our GitHub webhooks face as we do in development. Traffic spikes, security threats, and resilience problems all arise in production environments. As engineers, we have to design our webhook infrastructure to withstand these issues and avoid a breakdown of workflows in production.
In this piece, we will take a look at best practices you should be following when deploying your GitHub webhooks to production.
GitHub webhooks features
Features | Notes |
---|---|
Webhook Configuration | Admin Interface or REST API |
Request Method | POST |
Hashing Algorithm | SHA-256 |
Webhook Timeout | 10 seconds |
Retry Logic | No automatic retry |
Manual Retry | Admin Interface and REST Deliveries API |
Browsable Log | Admin Interface |
Alerting | No alerts are sent on failure |
GitHub webhooks best practices checklist
Webhook Security using HMAC signatures
Webhook URL endpoints are publicly accessible, so any client can send HTTP requests to them. You want to acknowledge and process only webhooks that come from GitHub, so that attackers cannot use your open endpoint to launch attacks such as SQL injection.
GitHub allows you to set a secret when creating your webhook.
GitHub uses this secret to compute an HMAC-SHA256 signature of the webhook payload and sends it in the X-Hub-Signature-256 header along with your webhook request.
You can then cryptographically verify this signature on your API using the secret you provided. The verification will fail if the payload has been tampered with before hitting your webhook URL, or if it did not originate from GitHub.
For more information about implementing this check, you can check the official GitHub documentation or read our GitHub tutorial where we implement the HMAC signature for GitHub webhook security.
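As a rough illustration of the check described above, here is a minimal sketch in Python. It assumes the header value has the form `sha256=<hex digest>` and that you have the raw request body as bytes; the function name `verify_github_signature` is our own, not part of any GitHub library.

```python
import hashlib
import hmac


def verify_github_signature(secret: str, payload: bytes, signature_header: str) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw payload."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest takes constant time, so it does not leak how many
    # characters of the signature matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))
```

Compute the signature over the raw bytes you received, before any JSON parsing, since re-serializing the payload can change its byte representation and make a valid signature fail.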
Asynchronous processing
GitHub has a 10-second timeout for each webhook sent, which means you want to make sure you have returned a response before this time limit elapses. If traffic spikes and you don't respond quickly, you begin to accumulate a backlog of webhooks and may start dropping some.
It is advisable to avoid blocking operations when processing your webhooks, and to make your webhook processing operation asynchronous. One way to achieve this is by using a message queue.
With a message queue, you can quickly ingest your GitHub webhooks and return a response almost immediately. This also helps you scale easily with an increasing amount of subscribed-to webhooks.
You can build a message queue component using open source message brokers like RabbitMQ or Apache Kafka. Building this from scratch requires technical know-how that may not come cheap. If you want to quickly set one up, it is advisable to use a specialized webhook queue service like Hookdeck.
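To make the pattern concrete, here is a minimal sketch that uses Python's in-process `queue.Queue` as a stand-in for a real broker. The names `ingest`, `process`, and `worker` are hypothetical, and returning 202 is our choice of a 2xx acknowledgement, not something GitHub requires.

```python
import queue
import threading

# In-process stand-in for a real broker such as RabbitMQ or Kafka.
webhook_queue = queue.Queue()
processed = []


def ingest(delivery_id: str, payload: dict) -> int:
    """Called from the HTTP handler: enqueue the event and acknowledge at once."""
    webhook_queue.put({"id": delivery_id, "payload": payload})
    return 202  # status code to send back to GitHub


def process(event: dict) -> None:
    """Hypothetical slow business logic, run off the request path."""
    processed.append(event["id"])


def worker() -> None:
    """Drain the queue in the background, one event at a time."""
    while True:
        event = webhook_queue.get()
        try:
            process(event)
        finally:
            webhook_queue.task_done()


threading.Thread(target=worker, daemon=True).start()
```

Because the handler only enqueues, its response time stays well within the 10-second timeout regardless of how long `process` takes. An in-memory queue loses events on restart, which is one reason to prefer a durable broker in production.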
Rate limiting
One of the components you want to include in your asynchronous message queue setup is a rate limiter. This is most useful when you are not considering horizontal scaling yet to handle increasing traffic.
With a rate limiter, you can control the rate at which webhook requests are delivered to your API. This is to ensure that your API doesn't get overloaded with requests and exhaust its server's resources.
Servers shut down when they run out of resources like memory, and this halts the processing of webhooks. This is not a situation you want to run into in production, which is why a rate limiter comes in really handy to avoid server downtime.
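One common way to implement such a limiter is a token bucket, sketched below. This is one possible technique, not the only one; the class name and the `rate` and `capacity` parameters are illustrative assumptions.

```python
import time


class TokenBucket:
    """Allow up to `rate` deliveries per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; otherwise the caller should wait."""
        now = self.clock()
        # Refill in proportion to the time elapsed since the last call.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A queue worker would call `allow()` before forwarding each webhook to your API and leave the event in the queue when it returns `False`.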
Webhooks server IP "Allow" list
This is another security check to ensure that only webhooks from GitHub are honored by your API's server. By creating an "allow" list of IPs that can send requests to your API, your server blocks requests from unknown IPs, thus avoiding any spoofed requests.
GitHub publishes the list of IPs it uses for sending webhooks on its meta API at https://api.github.com/meta. Check this list regularly and update the "allow" list on your API accordingly.
This is a task you can automate with a cron script that programmatically pulls the list from GitHub's API at timed intervals.
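The sketch below shows one way such a script could look, using only the Python standard library. It assumes the webhook ranges are listed as CIDR strings under the `hooks` key of the meta API response; the function names are our own.

```python
import ipaddress
import json
import urllib.request

META_URL = "https://api.github.com/meta"


def fetch_github_hook_cidrs(url: str = META_URL) -> list:
    """Fetch the CIDR ranges listed under the "hooks" key (assumed layout)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)["hooks"]


def is_allowed(remote_ip: str, cidrs: list) -> bool:
    """Return True if remote_ip falls inside any of the given CIDR ranges."""
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # not a valid IP address at all
    return any(address in ipaddress.ip_network(cidr, strict=False) for cidr in cidrs)
```

A scheduled job could call `fetch_github_hook_cidrs()` and store the result, while the request path only calls `is_allowed()` against the stored list. If your API sits behind a proxy or load balancer, make sure you check the original client IP rather than the proxy's address.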
Webhook retry
Sometimes, webhooks fail. This happens when an error is encountered on your API; this could be a 404 "Not found" error, a 401 "Unauthorized" error, or most commonly a 5xx server error.
In such cases, you will need the webhook to be retried when the situation causing the error is resolved. These retries can be automatic, which helps your webhooks to self-heal when errors are cleared. You can also manually retry your webhooks; this is mostly inefficient but there are situations where a manual retry is required.
Retry systems respond to the status code returned by your API. Therefore, it is very important and also a best practice to return appropriate HTTP status codes for errors on your API.
Retry systems can be simple cron jobs, but the most effective ones keep a repository of failed events to be retried with their payload and also use intelligent timers to trigger the retries.
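As a sketch of the "intelligent timers" idea, the example below retries a stored failed event with exponential backoff. The `FailedEvent` record, the 30-second base delay, the one-hour cap, and the rule that any non-2xx status counts as a failure are all assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class FailedEvent:
    """A stored webhook that still needs to be delivered to your API."""
    delivery_id: str
    payload: dict
    attempts: int = 0


def next_delay(attempts: int, base: float = 30.0, cap: float = 3600.0) -> float:
    """Exponential backoff: base * 2^attempts seconds, never more than cap."""
    return min(cap, base * (2 ** attempts))


def deliver_with_retries(event, send, max_attempts=5, sleep=time.sleep) -> bool:
    """Call send(payload) until it returns a 2xx status or attempts run out."""
    while True:
        status = send(event.payload)
        if 200 <= status < 300:
            return True
        event.attempts += 1
        if event.attempts >= max_attempts:
            return False  # keep the event stored for a manual retry
        sleep(next_delay(event.attempts - 1))
```

Here `send` is a hypothetical callable that forwards the payload to your API and returns the HTTP status code, which is why returning accurate status codes matters.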
Webhook idempotency using delivery headers
Idempotency ensures that you do not process a webhook more than once. Duplicate processing can cause data integrity issues in your application, so you want to prevent it.
By logging and keeping track of your webhook request deliveries using the x-github-delivery header, you can detect if a webhook has already been processed and skip it in cases where GitHub resends it.
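A minimal sketch of this check is shown below. It keeps delivery IDs in an in-memory set, which is an assumption made for brevity; in production you would likely use a database or cache shared by all instances. The names `DeliveryLog` and `handle_webhook` are hypothetical.

```python
class DeliveryLog:
    """Remembers which delivery IDs have already been processed."""

    def __init__(self):
        self._seen = set()

    def seen(self, delivery_id: str) -> bool:
        return delivery_id in self._seen

    def record(self, delivery_id: str) -> None:
        self._seen.add(delivery_id)


def handle_webhook(headers: dict, payload: dict, log: DeliveryLog, process) -> str:
    """Process a webhook once per delivery ID and skip repeats."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    delivery_id = normalized.get("x-github-delivery")
    if delivery_id is None:
        return "rejected"
    if log.seen(delivery_id):
        return "skipped"
    process(payload)
    # Record only after processing succeeds, so a failed attempt can be retried.
    log.record(delivery_id)
    return "processed"
```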
Troubleshooting before going to Production
Before you proceed to the production environment, ensure that you have properly tested and troubleshot your GitHub webhooks. Debugging in production is no fun, so you want to make sure that all logic-based errors are fixed and you're building resilience into your webhooks before shipping them to your production servers.
The benefit of troubleshooting your webhooks before moving to production is most obvious when you have fewer issues to deal with in production.
Conclusion
The resilience of GitHub webhooks in production environments will ensure that they are fulfilling their tasks with less friction. To accomplish this resilience, following the best practices highlighted above as well as general engineering best practices should be of utmost priority to you and your team.
For more information on webhooks best practices, you can check out the article on webhooks resilience, webhooks security checklist, and troubleshooting GitHub webhooks.
A good number of these best practices can take a significant amount of time and expertise to set up properly. If you want to build fast and ensure that you're taking into consideration best practices for your webhooks, give Hookdeck a try today.
Hookdeck's suite includes a retry configuration system, a rate limiter, a queueing service for asynchronous processing, and an intuitive dashboard that provides clear visibility into your GitHub webhooks.
Happy coding!