# Delivery Groups: Solving noisy neighbors in multi-tenant webhook delivery

Webhook traffic can be bursty and uneven. In any system that processes webhooks for multiple tenants (multi-store eCommerce apps, payment integrations, SaaS or fintech connectors), one tenant's burst becomes everyone's backlog.

We saw this clearly with a recent customer. They operate webhook delivery for hundreds of Shopify stores. One store sent 30 million events in five minutes. As a result, every other store's order, fulfillment, and inventory webhooks sat in the queue for hours.

You've likely seen something like this yourself. A flash sale that spikes one store, a cron job firing off a million updates, or a customer's integration going haywire. Now you're getting paged because everyone's webhooks are late, and you don't even know which tenant started it.

Until now, there have been three options when this happens:

1. Wait it out. Every other tenant absorbs the delay.
2. Kill the noisy tenant. You still need a one-off backfill later.
3. Build per-tenant queueing yourself, on top of cloud primitives.

None of these are good answers, especially for webhooks. SQS Fair Queues, Inngest concurrency keys, and Temporal fairness keys all share work fairly, but only inside a queue or job system you design. With SQS, you need tenant IDs on messages, consumers, DLQs, and metrics to find the noisy group. With Inngest and Temporal, you need per-tenant keys on functions or workflows, plus a path to route webhooks there. When the backlog is webhooks waiting to reach your API, you need fairness on the delivery path itself.

## What we're building

[Delivery Groups](/delivery-groups) splits a destination's delivery by a key extracted from each event, so every tenant gets its own delivery rate. With [Event Gateway](/event-gateway), you can already set a rate limit for each destination. Delivery Groups extends that to per-tenant control.

### Solve the noisy neighbor problem

Delivery Groups is a managed answer to the noisy neighbor problem, built for webhook delivery.

The feature extends the destination delivery policy. You extract a key from your webhook payload (`shop_id`, `account_id`, `tenant`, whatever maps to "tenant" in your domain) and each unique value of that key becomes its own delivery group with its own delivery rate.

When events come in faster than the destination can absorb, Event Gateway rotates between groups so that a spike in events for one tenant doesn't starve the others. So when store A sends 1,000 events/sec and store B is only sending 10/sec, store B's events don't wait for store A's backlog to clear. Each gets delivered at its own rate. Active groups take turns, idle groups don't reserve capacity, and you don't need to change anything about your destinations.
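To make the "active groups take turns" idea concrete, here is a minimal sketch of plain round-robin rotation over per-tenant queues. It is not Event Gateway's implementation: the `RoundRobinGroups` class and the store names are hypothetical, and the sketch ignores per-group rates entirely. It only shows why a small tenant doesn't wait behind a large tenant's backlog.

```python
from collections import OrderedDict, deque


class RoundRobinGroups:
    """Toy scheduler: one FIFO queue per group key, served in rotation."""

    def __init__(self):
        # Only groups with pending events live here, so idle groups reserve nothing.
        self._queues = OrderedDict()

    def enqueue(self, group_key, event):
        self._queues.setdefault(group_key, deque()).append(event)

    def next_event(self):
        """Pop one event from the group at the front of the rotation."""
        if not self._queues:
            return None
        group_key, queue = next(iter(self._queues.items()))
        event = queue.popleft()
        if queue:
            # Still has a backlog: send the group to the back of the rotation.
            self._queues.move_to_end(group_key)
        else:
            # Drained: drop the group so it stops taking turns.
            del self._queues[group_key]
        return group_key, event


scheduler = RoundRobinGroups()
for i in range(1000):
    scheduler.enqueue("store-a", f"a-{i}")  # store A bursts
scheduler.enqueue("store-b", "b-0")         # store B sends a single event

first_two = [scheduler.next_event() for _ in range(2)]
print(first_two)  # [('store-a', 'a-0'), ('store-b', 'b-0')]
```

With a single shared FIFO queue, store B's event would sit behind all 1,000 of store A's. With one queue per group, it goes out on the second turn.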

Here's how it works:

```mermaid
flowchart LR
  A[Event arrives] --> B[Evaluate key path]
  B --> C{Key resolved?}
  C -->|Yes| D[Persist delivery_group_key]
  C -->|No| M[Group keyless events]
  D --> E[Enqueue into delivery group]
  M --> E
```

You can set the key from any part of the webhook request (`body`, `headers`, `query`, or `path`) using dot notation, for example `body.shop_id`. If your tenant identifier varies between webhook sources, you can normalize it with a transform rule before grouping. You can also set per-group rate overrides to boost VIP tenants or throttle known-noisy ones without affecting others.
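As a rough illustration of dot-notation key resolution, the sketch below walks a path through a nested dictionary and falls back to a shared bucket when the key can't be resolved. The request shape, the `KEYLESS_GROUP` name, and the choice to treat nested objects as unresolved are all assumptions made for the example, not documented Event Gateway behavior.

```python
KEYLESS_GROUP = "__keyless__"  # hypothetical name for the shared keyless bucket


def resolve_group_key(request, key_path):
    """Walk a dot-notation path such as "body.shop_id" through a nested dict.

    Returns the value as a string, or None when any segment is missing.
    """
    current = request
    for segment in key_path.split("."):
        if not isinstance(current, dict) or segment not in current:
            return None
        current = current[segment]
    # Assumption: only scalar values make usable group keys.
    if current is None or isinstance(current, (dict, list)):
        return None
    return str(current)


def group_for(request, key_path):
    """Return the resolved key, or the shared bucket for keyless events."""
    key = resolve_group_key(request, key_path)
    return key if key is not None else KEYLESS_GROUP


# Assumed request shape: a dict with "headers" and "body" sections.
request = {
    "headers": {"x-shopify-shop-domain": "acme.myshopify.com"},
    "body": {"shop_id": 42, "topic": "orders/create"},
}

print(group_for(request, "body.shop_id"))                   # 42
print(group_for(request, "headers.x-shopify-shop-domain"))  # acme.myshopify.com
print(group_for(request, "body.account_id"))                # __keyless__
```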

![Delivery Groups config](./images/delivery-groups-config.png)

Setup takes a few clicks in the dashboard UI, or you can set a destination delivery policy in `--config` via the CLI:

```sh
hookdeck gateway destination upsert shopify-handler \
  --type HTTP \
  --url https://api.example.com/shopify \
  --config '{
    "delivery_policy": {
      "key": "body.shop_id",
      "rate": 100,
      "rate_period": "second"
    }
  }'
```

### Stay within downstream API rate limits

If your webhook handler calls an API on every event (the [fetch-before-process pattern](/webhooks/guides/webhooks-fetch-before-process-pattern) is the common case), each webhook triggers a downstream request. Set the group rate to match that API's limit and you'll stay under it automatically. For example, set the group rate to 2 events/sec keyed on `body.shop_id` to stay within the [Shopify Admin API's per-store limit](https://shopify.dev/docs/api/admin-rest/usage/rate-limits) (2 requests/sec is the REST Admin API leak rate on standard plans; Plus gets 10x). Previously you'd create a separate destination per identifier with a [filter](/docs/filters), but that falls apart when you don't know how many unique identifiers exist, or when there are thousands of them. With Delivery Groups, a single policy applies the same rate to every unique value of the key.
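To see what a per-group rate of 2 events/sec means in practice, here is a small sketch that assigns send times so each group stays at or under its rate. The `schedule_deliveries` function, the even spacing between sends, and the shop names are assumptions for illustration; Event Gateway's actual pacing may differ.

```python
from collections import defaultdict


def schedule_deliveries(arrivals, rate_per_sec):
    """Give each event the earliest send time that keeps its group under the rate.

    arrivals: list of (arrival_time_seconds, group_key), in arrival order per group.
    Returns a list of (send_time_seconds, group_key) in the same order.
    """
    interval = 1.0 / rate_per_sec
    next_slot = defaultdict(float)  # group_key -> earliest time its next event may go out
    schedule = []
    for arrived_at, group_key in arrivals:
        send_at = max(arrived_at, next_slot[group_key])
        next_slot[group_key] = send_at + interval
        schedule.append((send_at, group_key))
    return schedule


# shop-1 bursts five events at t=0; shop-2 sends one event at t=0.1.
arrivals = [(0.0, "shop-1")] * 5 + [(0.1, "shop-2")]
for send_at, group_key in schedule_deliveries(arrivals, rate_per_sec=2):
    print(f"{send_at:.1f}s {group_key}")
# 0.0s shop-1
# 0.5s shop-1
# 1.0s shop-1
# 1.5s shop-1
# 2.0s shop-1
# 0.1s shop-2
```

shop-1's burst is spread out at two sends per second, so a handler that makes one Admin API call per event never exceeds 2 requests/sec for that store. shop-2 has its own schedule and goes out as soon as it arrives.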

Delivery group keys are stored on every event, which adds new capabilities across the API and dashboard:

* Query, filter, or retry events by group. For example, you can retry a tenant's failed events without touching others.
* Per-tenant observability. See queue depth, new and delivered events, throughput, and backlog age at a glance.
* Group-level backpressure notifications. Get alerted in Slack, Teams, PagerDuty or via email when any tenant's queue degrades.

![Delivery Groups metrics](./images/delivery-groups-metrics.png)

With Delivery Groups in place, that 30M-event spike looks very different. A ping on Slack alerts you to "Backpressure on prod-shopify tenant acme-flash-sale." You open the Delivery Groups tab to see acme-flash-sale with 28 million events pending but delivering steadily at its per-store rate cap. All other stores on the destination are at normal queue depth, delivering on time. A temporary override bumps acme-flash-sale's rate while their backlog drains, and you get on with your day.
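A quick back-of-the-envelope calculation shows why the temporary override matters. The rates below are illustrative assumptions (100 events/sec as the normal cap, 1,000 events/sec as the override), and the sketch assumes no new events arrive while the backlog drains.

```python
def drain_hours(pending_events, rate_per_sec):
    """Hours needed to deliver a backlog at a steady rate, with no new arrivals."""
    return pending_events / rate_per_sec / 3600


pending = 28_000_000

print(round(drain_hours(pending, 100), 1))   # 77.8 hours at the assumed normal cap
print(round(drain_hours(pending, 1000), 1))  # 7.8 hours with the assumed override
```

Either way, only acme-flash-sale's backlog is affected. Every other store keeps delivering at its own rate.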

## Now in early access

Multi-tenant fairness is a hard problem to get right, and we'd love your input to refine our approach. So today we're opening early access. [Sign up for early access](/delivery-groups) and get:

* Preview access to the feature
* Direct input on the configuration model, dashboard, and metrics
* A direct line to the engineers building it