Phil Leggetter

How Knock uses Orb webhooks for usage-based upgrade nudges

Webhooks by Example is a Hookdeck series where we sit down with practitioners shipping real webhook implementations and walk through one use case end to end: the architecture, development workflow, production setup, and the gotchas that matter in practice.

In episode one, Jeff Everhart (Developer Advocate at Knock) walks through how Knock uses Orb usage webhooks to drive customer-facing upgrade nudges.

Knock is a customer engagement and notification workflows platform, and in this implementation they use Knock itself to orchestrate and send the usage-threshold messages triggered by Orb webhooks.

The use case

A customer hits 75% of their plan limit. What happens next?

You can log the event and move on. You can surface it in a billing dashboard and hope someone sees it. Or you can turn it into a customer-facing message with the right urgency at the right time:

  • 75%: Heads-up
  • 90%: Stronger warning
  • 100%: Clear "you have exceeded your limit" message with plan-specific upgrade guidance

That is the pattern Knock built. They run it for their own usage thresholds and recommend it to customers building similar usage-based workflows.

The source event: Orb subscription.usage_exceeded

Orb is a usage-based billing platform. Knock sends usage events to Orb as customers send messages, and Orb tracks those events against plan limits. When a configured threshold is crossed, Orb sends a webhook from its webhooks integration.

The key event here is subscription.usage_exceeded, which Orb emits when usage exceeds a configured alert threshold.

Knock currently triggers at 75%, 90%, and 100%.

The payload tells you what crossed a threshold (subscription, metric, percentage, limit). But it does not tell you everything needed to send a useful message:

  • Who at the account should receive the notification
  • Current values for related metrics used in the email
  • The latest usage value at send time (rather than at event fire time)

That missing context is where architecture matters.

Architecture

When Orb sends subscription.usage_exceeded, Knock's control plane ingests it, enriches it, and triggers a Knock workflow.

flowchart LR
  orb[Orb webhook]
  control[Knock control plane]
  workflow[Knock workflow]
  postmark[Postmark]

  orb --> control --> workflow --> postmark

The workflow then handles branching and template selection:

  • Which plan is the customer on?
  • Which metric crossed its threshold?
  • Which threshold was crossed?

Each branch maps to a different template and copy path.

Why payload enrichment is required

The incoming Orb webhook is a trigger, not the complete source of truth.

Knock performs two enrichment steps before triggering messaging:

  1. Internal enrichment: look up account admins/owners to determine recipients.
  2. External enrichment: refetch usage details from Orb so message content reflects current values.

Jeff described the practical need directly:



“The Orb webhook doesn't have all of the information we need to alert the right people. We ingest it, look up account admins and owners, and go back out to Orb to augment the statistics we show in the email.”

Jeff Everhart

Developer Advocate @ Knock



This implementation follows the fetch-before-process pattern, but as a hybrid: the webhook payload has some required data, yet Knock still enriches it with internal context and fresh Orb reads before acting.
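The two enrichment steps can be sketched roughly as follows. This is an illustration only, not Knock's code: the payload field names (`account_id`, `subscription_id`, `metric`, `threshold_percent`) and the two injected lookup functions are hypothetical stand-ins for Knock's internal account lookup and a fresh read from Orb.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class EnrichedTrigger:
    recipients: list[str]
    data: dict


def enrich_usage_event(
    event: dict,
    lookup_admins: Callable[[str], list[str]],
    fetch_current_usage: Callable[[str, str], dict],
) -> EnrichedTrigger:
    """Combine the webhook payload with internal and freshly fetched data."""
    # All field names below are assumptions, not Orb's documented payload shape.
    account_id = event["account_id"]
    subscription_id = event["subscription_id"]
    metric = event["metric"]

    # Step 1, internal enrichment: decide who should be notified.
    recipients = lookup_admins(account_id)

    # Step 2, external enrichment: re-read usage so the message shows current values.
    current = fetch_current_usage(subscription_id, metric)

    data = {
        "event_id": event["id"],
        "metric": metric,
        "threshold_percent": event["threshold_percent"],
        "usage": current["usage"],
        "limit": current["limit"],
        "usage_percent": round(100 * current["usage"] / current["limit"], 1),
    }
    return EnrichedTrigger(recipients=recipients, data=data)


# Usage with stubbed lookups: the event fired at 75%, but usage has since grown.
event = {
    "id": "evt_1",
    "account_id": "acct_1",
    "subscription_id": "sub_1",
    "metric": "messages",
    "threshold_percent": 75,
}
trigger = enrich_usage_event(
    event,
    lookup_admins=lambda account_id: ["owner@example.com"],
    fetch_current_usage=lambda sub_id, metric: {"usage": 8200, "limit": 10000},
)
```

Passing the lookups in as functions keeps the enrichment step easy to test with stubs, which fits the mocked development workflow described later in this post.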

Slow ramp vs burstable usage

Knock sees two usage profiles in production:

  • Slow ramp: usage rises steadily, and payload values are usually close to current values.
  • Burstable: large spikes (for example broadcasts), where values become stale quickly.

In burstable scenarios, the payload can already be outdated by the time you render and send the message. Refetching before processing avoids stale numbers in customer-facing communication.

Orb also supports summary webhooks, which intentionally slim payloads and encourage a notification-style integration pattern.

That pattern maps to thin events: notifications that indicate "something changed" and push consumers to refetch current state. As noted above, Knock's implementation keeps that refetch discipline even though Orb's payload still carries useful fields.

The 12-message matrix

After enrichment, Knock resolves message selection across:

  • 2 plans (developer, starter)
  • 2 metrics (out-of-app volume, in-app guide views)
  • 3 thresholds (75%, 90%, 100%)

That is 12 distinct scenarios with different copy and urgency.

The hard part here is not receiving the webhook; it is translating each plan/metric/threshold state into the right message and tone.
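To make the size of that matrix concrete, here is a minimal sketch that enumerates every plan/metric/threshold combination and maps each one to a template key. In Knock this branching lives inside the workflow itself; the identifiers, urgency labels, and key format below are hypothetical and only illustrate the mapping.

```python
from itertools import product

# Identifiers are illustrative; the real plan and metric keys are not given in the episode.
PLANS = ("developer", "starter")
METRICS = ("out_of_app_volume", "in_app_guide_views")
THRESHOLDS = (75, 90, 100)

# Urgency level per threshold, following the heads-up / warning / exceeded tiers.
URGENCY = {75: "heads_up", 90: "warning", 100: "limit_exceeded"}


def template_key(plan: str, metric: str, threshold: int) -> str:
    """Return the (hypothetical) template key for one scenario."""
    if plan not in PLANS or metric not in METRICS or threshold not in THRESHOLDS:
        raise ValueError(f"unsupported scenario: {plan}/{metric}/{threshold}")
    return f"{plan}.{metric}.{URGENCY[threshold]}"


# Every scenario the workflow has to cover: 2 plans x 2 metrics x 3 thresholds.
ALL_SCENARIOS = [template_key(p, m, t) for p, m, t in product(PLANS, METRICS, THRESHOLDS)]
```

Rejecting unknown combinations explicitly means a new plan or metric fails loudly instead of silently falling through to the wrong copy.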

Urgency without fear

Billing communication can easily create the wrong outcome if the tone is off.



“Communicate urgency without scaring anybody. We don't want users to feel like we're shutting off service because they crossed a threshold.”

Jeff Everhart

Developer Advocate @ Knock



Knock's approach is collaborative: engineers build the workflow logic and dynamic data paths, while growth/marketing iterates on message copy and templates.

Development workflow that matches production

For local development and demos, Knock mocks Orb payloads rather than trying to trigger real thresholds in a live billing environment.

That gives them:

  • Fast iteration using fixtures for each threshold scenario
  • Deterministic testing for all plan/metric combinations
  • No accidental spend or customer-facing sends

In the episode, Jeff demonstrates this in a non-production, non-sending setup. In Knock's docs, the equivalent iteration loop is covered through the workflow test runner and template preview/testing tools.

In practice, this means teams can QA all 12 scenarios by stepping through workflow runs, branch decisions, and rendered templates before promoting changes.
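One way to get a fixture per scenario is to generate them rather than hand-write them. The sketch below assumes a simplified payload shape; the field names are placeholders and not Orb's actual `subscription.usage_exceeded` schema, and the plan and metric identifiers are illustrative.

```python
from itertools import product

PLANS = ("developer", "starter")
METRICS = ("out_of_app_volume", "in_app_guide_views")
THRESHOLDS = (75, 90, 100)


def build_fixture(plan: str, metric: str, threshold: int, limit: int = 10_000) -> dict:
    """Build a mock threshold event (simplified, hypothetical payload shape)."""
    return {
        "id": f"evt_test_{plan}_{metric}_{threshold}",
        "type": "subscription.usage_exceeded",
        "plan": plan,
        "metric": metric,
        "threshold_percent": threshold,
        "limit": limit,
        # Usage sits exactly at the threshold so the output is deterministic.
        "usage": limit * threshold // 100,
    }


# One fixture per plan/metric/threshold combination.
FIXTURES = [build_fixture(p, m, t) for p, m, t in product(PLANS, METRICS, THRESHOLDS)]
```

Each fixture can then be fed through a non-sending workflow run to check the branch taken and the rendered template.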

Shipping safely: environments, branches, and diffs

Knock's version-control model combines environments, branches, and commits/diffs so teams can isolate changes, review what changed, and promote intentionally.

That supports a practical deployment path:

  • Make and validate changes in development
  • Review before/after diffs
  • Promote to production intentionally

Environment-specific API keys then scope where triggers execute, which keeps development and production flows isolated. Knock's documentation on environment-specific API keys describes this model.

Schema validation at the trigger boundary

One implementation detail worth copying: validate trigger payload shape with JSON Schema before workflow execution, as described in Knock's validating trigger data guide.

In this flow, that matters for three reasons:

  • Templates rely on dynamic payload fields
  • Malformed payloads can cause broken or embarrassing output
  • Early validation surfaces errors before customer-facing channels are touched

If schema validation fails, it can trigger alerts in observability tooling instead of silently corrupting message quality.
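As a rough illustration of validating at the boundary, the sketch below checks a trigger payload against a schema-shaped dictionary. It is a hand-rolled stand-in that covers only required keys and primitive types, not a JSON Schema implementation; a real setup would rely on Knock's built-in validation or a full JSON Schema validator. The field names are hypothetical.

```python
# A schema-shaped description of the trigger data (field names are assumptions).
TRIGGER_SCHEMA = {
    "type": "object",
    "required": ["plan", "metric", "threshold_percent", "usage", "limit"],
    "properties": {
        "plan": {"type": "string"},
        "metric": {"type": "string"},
        "threshold_percent": {"type": "integer"},
        "usage": {"type": "number"},
        "limit": {"type": "number"},
    },
}

_PY_TYPES = {"string": str, "integer": int, "number": (int, float)}


def validation_errors(payload, schema=TRIGGER_SCHEMA) -> list:
    """Return a list of problems; an empty list means the payload passed."""
    if not isinstance(payload, dict):
        return ["payload must be an object"]
    errors = []
    for key in schema["required"]:
        if key not in payload:
            errors.append(f"missing required field: {key}")
    for key, rule in schema["properties"].items():
        if key not in payload:
            continue
        value = payload[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _PY_TYPES[rule["type"]]):
            errors.append(f"{key}: expected {rule['type']}")
    return errors
```

Returning the full list of errors, rather than stopping at the first, gives an alert enough detail to show exactly which fields were malformed.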

Idempotency as a key takeaway

One clear takeaway from the discussion is that idempotency is a critical part of this pattern.

Knock supports an Idempotency-Key header on workflow triggers, which prevents duplicate workflow runs when the same trigger request is retried.

Orb includes an event ID in webhook payloads and recommends storing it temporarily to avoid reprocessing duplicates. In this flow, the Orb event ID is the natural idempotency key to pass through to the Knock workflow trigger.
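A minimal sketch of both halves follows: a short-lived record of event IDs already seen, and request headers that carry the same ID as the `Idempotency-Key`. The in-memory store, the TTL value, and the helper names are assumptions for illustration; a production system would more likely use a shared store such as a database or cache.

```python
import time


class SeenEvents:
    """In-memory record of recently processed event IDs, each kept for a TTL."""

    def __init__(self, ttl_seconds: float = 24 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at = {}

    def first_time(self, event_id: str) -> bool:
        """Return True the first time an ID is seen within the TTL window."""
        now = self._clock()
        # Drop entries whose retention window has passed.
        self._expires_at = {k: exp for k, exp in self._expires_at.items() if exp > now}
        if event_id in self._expires_at:
            return False
        self._expires_at[event_id] = now + self._ttl
        return True


def trigger_headers(api_key: str, event_id: str) -> dict:
    """Headers for a workflow trigger request, reusing the event ID as the key."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Idempotency-Key": event_id,
    }
```

The two layers cover different failures: the local check drops a webhook that was delivered twice, while the header protects against the trigger request itself being retried.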

Inbound and outbound lessons

This use case starts with inbound Orb webhooks, but Knock also emits high-volume outbound message event webhooks (sent, delivered, opened, clicked, bounced, and more).

At scale, outbound events can multiply quickly relative to initial sends, so producer-side controls matter:

  • Retries with exponential backoff
  • Strong observability of message and provider states
  • A hard "disable webhook" control to protect consumers during incidents

That last one is underrated. A producer-side off switch can prevent retry storms from turning a downstream outage into a larger platform incident.
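The interaction between retries and the off switch can be sketched as below. This is a generic illustration, not Knock's delivery code: the `Endpoint` type, the `send` callback, and the default delays are all hypothetical.

```python
import time
from dataclasses import dataclass


def backoff_delays(max_retries: int = 5, base_seconds: float = 1.0, cap_seconds: float = 60.0) -> list:
    """Exponential delays (base * 2^attempt), capped at cap_seconds."""
    return [min(cap_seconds, base_seconds * 2 ** attempt) for attempt in range(max_retries)]


@dataclass
class Endpoint:
    url: str
    enabled: bool = True  # the producer-side "disable webhook" switch


def deliver(endpoint: Endpoint, send, max_retries: int = 5, sleep=time.sleep) -> str:
    """Attempt delivery with backoff, re-checking the off switch before every attempt."""
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        if not endpoint.enabled:
            return "skipped"  # disabled endpoints receive no further attempts
        if send(endpoint.url):
            return "delivered"
        if attempt < max_retries:
            sleep(delays[attempt])
    return "failed"
```

Checking the flag inside the retry loop, not just before the first attempt, is what lets an operator stop an in-flight retry sequence during an incident.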

For teams handling webhook volume at scale, Hookdeck Event Gateway helps with reliable ingestion and processing through durable queuing, retries with backoff, replay, and end-to-end observability.

Key takeaways from episode one

This episode shows that handling the webhook is one part of the system; teams also invest significant effort in enrichment, message decisioning, and operational guardrails to keep customer communication accurate as usage changes quickly.

  • Treat webhook payloads as triggers; refetch when data freshness matters.
  • Design for both slow-ramp and burstable usage profiles.
  • Prioritize message-state mapping and copy quality, not just webhook plumbing.
  • Validate payload schema at the workflow boundary.
  • Use provider event IDs as stable idempotency keys for downstream workflow triggers.
  • Build environment and promotion guardrails for non-engineer collaborators.
  • For outbound webhooks, include retries, observability, and a disable or pause mechanism.

Further reading

Watch the full conversation on YouTube. More Webhooks by Example episodes are coming soon.