Deduplication
Deduplication removes redundant events by comparing them against previously processed events within a configurable time window. You control what makes events "identical" - from exact payload matching to comparing only specific fields that matter to your use case.
Deduplication addresses common scenarios where redundant events cause unnecessary processing:
- Multi-app installations: Deduplicate by request ID when multiple apps send the same webhook
- Noisy updates: Deduplicate by everything except volatile fields like inventory or timestamps
- Request redelivery: Remove duplicate webhooks and events that are sent multiple times by the producer
Events identified as duplicates are ignored and not delivered to the destination. You can view events that were ignored as part of your Requests.
Deduplication strategies
Hookdeck offers two deduplication strategies within a 1 second to 1 hour time window:
- Exact deduplication: The entire event is the key
- Field-based deduplication: Choose which fields define the key (via inclusion or exclusion)
Deduplication is a best-effort feature and is not guaranteed. Always implement idempotent request handling in your destination.
Exact deduplication
The entire payload serves as the deduplication key. Events must be identical to be considered duplicates.
{
  "type": "deduplicate",
  "window": 60000
}
Use when: You want to drop perfectly identical webhooks, such as retry storms.
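As a rough illustration of what "the entire event is the key" can mean, the sketch below hashes a canonical JSON form of the whole event. The event shape, the use of SHA-256, and the key-order normalization are assumptions made for this example; the documentation does not describe how Hookdeck computes its hash internally.

```python
import hashlib
import json


def exact_key(event: dict) -> str:
    """Hash the whole event into a single deduplication key (illustrative only)."""
    # sort_keys yields a canonical form, so key order alone does not change the hash.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical events: a retry with the same content, and one with a changed total.
original = {"headers": {"x-topic": "orders/create"}, "body": {"id": 1, "total": "9.99"}}
retry = {"body": {"total": "9.99", "id": 1}, "headers": {"x-topic": "orders/create"}}
changed = {"headers": {"x-topic": "orders/create"}, "body": {"id": 1, "total": "10.99"}}

print(exact_key(original) == exact_key(retry))    # same content, same key
print(exact_key(original) == exact_key(changed))  # any difference changes the key
```

Any single differing value produces a different key, which is why this strategy only catches perfectly identical deliveries.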
Field-based deduplication
Define the deduplication key by either including specific fields or excluding volatile ones.
Include fields
Only specified fields serve as the deduplication key. Events with matching values in these fields are considered duplicates.
{
  "type": "deduplicate",
  "window": 300000,
  "include_fields": ["headers.x-request-id"]
}
Use when: You have a unique identifier that defines duplicate events, like request IDs or composite keys.
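The sketch below shows one way an include-based key could be derived: only the listed paths are read from the event, and everything else is ignored. The helper names `resolve` and `include_key` and the event shape are hypothetical and are not part of Hookdeck's API.

```python
import json


def resolve(event: dict, path: str):
    """Walk a dotted path such as 'headers.x-request-id'; missing fields become ''."""
    current = event
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return ""
        current = current[part]
    return current


def include_key(event: dict, include_fields: list) -> str:
    """Build the key from the included fields only (illustrative only)."""
    picked = {path: resolve(event, path) for path in include_fields}
    return json.dumps(picked, sort_keys=True)


FIELDS = ["headers.x-request-id"]
first = {"headers": {"x-request-id": "req_123"}, "body": {"app": "A"}}
second = {"headers": {"x-request-id": "req_123"}, "body": {"app": "B"}}
other = {"headers": {"x-request-id": "req_456"}, "body": {"app": "A"}}

print(include_key(first, FIELDS) == include_key(second, FIELDS))  # bodies differ, key matches
print(include_key(first, FIELDS) == include_key(other, FIELDS))   # different request ID
```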
Exclude fields
Everything except specified fields serves as the deduplication key.
{
  "type": "deduplicate",
  "window": 300000,
  "exclude_fields": ["body.updated_at", "body.inventory_quantity"]
}
Use when: You want to ignore events where only non-essential fields changed.
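A minimal sketch of the exclusion idea follows: copy the event, remove the volatile paths, and build the key from what remains. The helpers `drop_path` and `exclude_key` and the sample product payload are assumptions for illustration, not Hookdeck's implementation.

```python
import copy
import json


def drop_path(event: dict, path: str) -> None:
    """Remove a dotted path from the event in place; do nothing if it is absent."""
    parts = path.split(".")
    current = event
    for part in parts[:-1]:
        current = current.get(part) if isinstance(current, dict) else None
        if current is None:
            return
    if isinstance(current, dict):
        current.pop(parts[-1], None)


def exclude_key(event: dict, exclude_fields: list) -> str:
    """Build the key from everything except the excluded fields (illustrative only)."""
    remaining = copy.deepcopy(event)  # never mutate the caller's event
    for path in exclude_fields:
        drop_path(remaining, path)
    return json.dumps(remaining, sort_keys=True)


FIELDS = ["body.updated_at", "body.inventory_quantity"]
before = {"body": {"id": 1, "title": "Mug", "updated_at": "2025-01-01T00:00:00Z", "inventory_quantity": 5}}
after = {"body": {"id": 1, "title": "Mug", "updated_at": "2025-01-01T00:05:00Z", "inventory_quantity": 4}}
renamed = {"body": {"id": 1, "title": "Cup", "updated_at": "2025-01-01T00:05:00Z", "inventory_quantity": 4}}

print(exclude_key(before, FIELDS) == exclude_key(after, FIELDS))    # only volatile fields changed
print(exclude_key(before, FIELDS) == exclude_key(renamed, FIELDS))  # title changed
```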
Time window
The time window defines how long Hookdeck remembers previously seen events when applying deduplication rules.
When an event arrives, Hookdeck computes a hash based on your deduplication strategy. Hookdeck then checks whether the same hash has been seen within the configured time window:
- If a match is found: The new event is discarded and marked as a duplicate
- If no match is found: The event is delivered and the hash is stored for the duration of the window
Events are automatically evicted from the deduplication cache once their time window has elapsed.
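The check-then-store flow described above can be sketched with a small in-memory cache. This is a simplified, single-process model with a hypothetical `DedupWindow` class; it assumes the window is measured from the first stored event and that a discarded duplicate does not extend it, which the documentation does not specify.

```python
class DedupWindow:
    """Toy model of a deduplication window keyed by hash (illustrative only)."""

    def __init__(self, window_ms: int):
        self.window_ms = window_ms
        self._seen = {}  # key -> time the key was stored, in milliseconds

    def is_duplicate(self, key: str, now_ms: int) -> bool:
        # Evict entries whose window has elapsed.
        self._seen = {k: t for k, t in self._seen.items() if now_ms - t < self.window_ms}
        if key in self._seen:
            return True  # match found: discard as a duplicate
        self._seen[key] = now_ms  # no match: deliver and remember the key
        return False


window = DedupWindow(window_ms=60_000)
print(window.is_duplicate("abc", now_ms=0))       # first sighting is delivered
print(window.is_duplicate("abc", now_ms=30_000))  # inside the window, discarded
print(window.is_duplicate("abc", now_ms=60_000))  # window elapsed, delivered again
```

With a 60-second window, a retry arriving 30 seconds later is suppressed, while one arriving after the window has elapsed is treated as a new event.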
Configuration
You can set the time window between 1 second and 1 hour.
Choosing a window size
Shorter windows (e.g., 1 minute)
Allow legitimate retries through after a short period. However, retries that happen after the window expires will be delivered again.
Longer windows (e.g., 1 hour)
Better suppress retries that might occur well after the original event. Be cautious: if your source system legitimately emits multiple events with the same identifiers in that timeframe, they may be discarded as duplicates.
Changing the deduplication configuration resets the deduplication cache for the connection, so all events are treated as new from that point forward.
Field path resolution
When using field-based deduplication:
- Field paths must start with headers, body, query, or path
- Fields not present in the payload are treated as empty strings
- Objects and arrays are converted to JSON strings for comparison
- For non-JSON bodies (e.g., XML), body resolves to the full content, but body.field resolves to an empty string
- Booleans, numbers, and strings are compared as their respective types
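The resolution rules above can be approximated with a short resolver. The function name `resolve_field`, the event shape, and the exact JSON string formatting are assumptions for this sketch; only the rules themselves come from the list above.

```python
import json

ROOTS = ("headers", "body", "query", "path")


def resolve_field(event: dict, field_path: str):
    """Resolve a field path to the value used for comparison (illustrative only)."""
    root, _, rest = field_path.partition(".")
    if root not in ROOTS:
        raise ValueError(f"field path must start with one of {ROOTS}: {field_path!r}")
    value = event.get(root, "")
    if rest:
        for part in rest.split("."):
            # Missing fields, and sub-fields of a non-JSON body, resolve to "".
            if not isinstance(value, dict) or part not in value:
                return ""
            value = value[part]
    if isinstance(value, (dict, list)):
        return json.dumps(value, sort_keys=True)  # objects and arrays become JSON strings
    return value  # booleans, numbers, and strings keep their own type


json_event = {"headers": {"x-request-id": "req_1"}, "body": {"id": 7, "tags": ["a", "b"]}}
xml_event = {"headers": {}, "body": "<order id='7'/>"}

print(resolve_field(json_event, "body.id"))       # a number, not the string "7"
print(resolve_field(json_event, "body.tags"))     # JSON string of the array
print(repr(resolve_field(json_event, "body.missing")))
print(resolve_field(xml_event, "body"))           # full raw content
print(repr(resolve_field(xml_event, "body.id")))  # empty string for non-JSON bodies
```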
Limitations
- Time windows are limited to 1 second to 1 hour
- Changing the deduplication configuration resets the deduplication cache for the connection
Deduplication is not guaranteed due to distributed system constraints. Always implement idempotent request handling in your destination.
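One common way to make a destination idempotent is to record an identifier for each processed event and skip repeats. The sketch below is a minimal in-memory version; the `OrderHandler` class, the `event_id` argument, and the payload shape are hypothetical, and a real destination would keep the processed IDs in a durable store such as a database with a unique constraint.

```python
class OrderHandler:
    """Destination-side handler that tolerates repeated deliveries (illustrative only)."""

    def __init__(self):
        self.processed_ids = set()  # stand-in for a durable store
        self.side_effects = []      # stand-in for real work such as charging a card

    def handle(self, event_id: str, payload: dict) -> str:
        if event_id in self.processed_ids:
            return "skipped"  # already handled: do not repeat the side effect
        self.side_effects.append(f"processed order {payload['order_id']}")
        self.processed_ids.add(event_id)
        return "processed"


handler = OrderHandler()
print(handler.handle("evt_1", {"order_id": 42}))  # first delivery does the work
print(handler.handle("evt_1", {"order_id": 42}))  # a duplicate that slipped through is skipped
```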
Create a deduplication rule
Apply deduplication rules to a connection, just like any other rule.
- Open the connection rules configuration.
- Click and select Deduplication.
- Configure the deduplication settings:
  - Set the Time Window (1 second to 1 hour)
  - Select a Strategy:
    - Exact: Compare entire payloads
    - Include fields: Specify fields to use as the key
    - Exclude fields: Specify fields to ignore
- Click to apply your changes.
POST /2025-07-01/connections
{
  "name": "shopify-products",
  "source_id": "src_xyz",
  "destination_id": "dest_abc",
  "rules": [
    {
      "type": "deduplicate",
      "window": 300000,
      "exclude_fields": ["body.updated_at", "body.inventory_quantity"]
    }
  ]
}
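For readers scripting this call, the request above could be assembled with Python's standard library roughly as follows. The base URL `https://api.hookdeck.com` and the Bearer token header are assumptions not stated on this page, and `build_create_connection_request` is a hypothetical helper; check the API reference before relying on either.

```python
import json
import urllib.request

API_BASE = "https://api.hookdeck.com"  # assumption: not stated on this page
API_VERSION = "2025-07-01"


def build_create_connection_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for creating a connection."""
    return urllib.request.Request(
        url=f"{API_BASE}/{API_VERSION}/connections",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: Bearer token auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "name": "shopify-products",
    "source_id": "src_xyz",
    "destination_id": "dest_abc",
    "rules": [
        {
            "type": "deduplicate",
            "window": 300000,
            "exclude_fields": ["body.updated_at", "body.inventory_quantity"],
        }
    ],
}

request = build_create_connection_request("YOUR_API_KEY", payload)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can run without credentials.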
Validation rules:
- window: Required, between 1000ms (1 second) and 3600000ms (1 hour)
- Cannot specify both include_fields and exclude_fields
- Field paths must start with headers, body, query, or path
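These checks can also be run client-side before submitting a rule. The validator below is a sketch, not Hookdeck's own validation: `validate_rule` is a hypothetical helper, and the window bounds use the 1 second to 1 hour range given for the time window elsewhere on this page.

```python
ALLOWED_ROOTS = ("headers", "body", "query", "path")
MIN_WINDOW_MS = 1_000      # 1 second, per the documented time window range
MAX_WINDOW_MS = 3_600_000  # 1 hour


def validate_rule(rule: dict) -> list:
    """Return a list of problems with a deduplication rule; empty means it looks valid."""
    errors = []
    window = rule.get("window")
    if not isinstance(window, int) or isinstance(window, bool):
        errors.append("window is required and must be an integer number of milliseconds")
    elif not MIN_WINDOW_MS <= window <= MAX_WINDOW_MS:
        errors.append(f"window must be between {MIN_WINDOW_MS} and {MAX_WINDOW_MS} ms")
    if "include_fields" in rule and "exclude_fields" in rule:
        errors.append("cannot specify both include_fields and exclude_fields")
    for name in ("include_fields", "exclude_fields"):
        for path in rule.get(name, []):
            if path.split(".", 1)[0] not in ALLOWED_ROOTS:
                errors.append(f"{name}: '{path}' must start with headers, body, query, or path")
    return errors


good = {"type": "deduplicate", "window": 300000, "exclude_fields": ["body.updated_at"]}
bad = {"type": "deduplicate", "window": 500, "include_fields": ["payload.id"], "exclude_fields": ["body.x"]}

print(validate_rule(good))
for problem in validate_rule(bad):
    print(problem)
```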
Redundant events received on the connection are now removed before delivery.
Edit a deduplication rule
Edit a deduplication rule to change how redundant events are detected.
- Open the connection rules configuration.
- Click the deduplication rule to edit.
- Modify the time window or deduplication strategy.
- Click to apply your changes.
Changing the deduplication configuration resets the deduplication state. Previously seen events no longer count toward duplicate detection, and deduplication starts fresh.
Delete a deduplication rule
To delete a deduplication rule, follow the instructions for configuring connection rules and click the trash icon to remove the rule from the connection.
Example scenarios
Scenario 1: Multi-app webhook deduplication
Deduplicate webhooks from multiple app installations using request ID:
{
  "type": "deduplicate",
  "window": 60000,
  "include_fields": ["headers.x-request-id"]
}
Only the request ID determines if events are redundant, regardless of other payload differences.
Scenario 2: Filtering noisy Shopify updates
Ignore product updates where only inventory or timestamps changed:
{
  "type": "deduplicate",
  "window": 300000,
  "exclude_fields": [
    "body.variants[].inventory_quantity",
    "body.updated_at",
    "headers.x-shopify-webhook-id"
  ]
}
Events are considered redundant if everything except these volatile fields matches within 5 minutes.
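To make the effect of this configuration concrete, the sketch below strips the three excluded paths from two hypothetical product updates and compares what is left. It assumes that `variants[]` means "the named field inside every element of the variants array"; that reading of the `[]` syntax, the `drop` helper, and the sample payloads are all assumptions for illustration.

```python
import copy
import json

EXCLUDE = [
    "body.variants[].inventory_quantity",
    "body.updated_at",
    "headers.x-shopify-webhook-id",
]


def drop(node, parts: list) -> None:
    """Remove the path given by parts from node in place, fanning out over 'name[]' arrays."""
    if not parts:
        return
    head, rest = parts[0], parts[1:]
    if head.endswith("[]"):
        name = head[:-2]
        items = node.get(name) if isinstance(node, dict) else None
        if isinstance(items, list):
            if not rest:
                node.pop(name, None)  # path ends at the array: drop the whole array
            for item in items:
                drop(item, rest)
        return
    if not isinstance(node, dict) or head not in node:
        return
    if rest:
        drop(node[head], rest)
    else:
        del node[head]


def scenario_key(event: dict) -> str:
    remaining = copy.deepcopy(event)
    for path in EXCLUDE:
        drop(remaining, path.split("."))
    return json.dumps(remaining, sort_keys=True)


update_a = {
    "headers": {"x-shopify-webhook-id": "wh_1"},
    "body": {"id": 9, "title": "Mug", "updated_at": "t1", "variants": [{"id": 1, "inventory_quantity": 5}]},
}
update_b = {
    "headers": {"x-shopify-webhook-id": "wh_2"},
    "body": {"id": 9, "title": "Mug", "updated_at": "t2", "variants": [{"id": 1, "inventory_quantity": 4}]},
}
update_c = copy.deepcopy(update_b)
update_c["body"]["title"] = "Large Mug"

print(scenario_key(update_a) == scenario_key(update_b))  # only volatile fields differ
print(scenario_key(update_a) == scenario_key(update_c))  # title changed
```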
Scenario 3: Multi-tenant composite keys
Use multiple fields to create tenant-specific deduplication:
{
  "type": "deduplicate",
  "window": 600000,
  "include_fields": [
    "body.store_id",
    "body.product_id",
    "body.action"
  ]
}
Events are redundant only when the same store, product, and action combination occurs within 10 minutes.
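The composite key can be pictured as a tuple of the three included values, so two events collide only when all three match. The `composite_key` helper and the sample bodies below are hypothetical and only illustrate the comparison.

```python
KEY_FIELDS = ("store_id", "product_id", "action")


def composite_key(body: dict) -> tuple:
    """Combine the included body fields into one key; missing fields become ''."""
    return tuple(body.get(field, "") for field in KEY_FIELDS)


store_a = {"store_id": "s_1", "product_id": "p_9", "action": "update", "note": "first"}
store_a_again = {"store_id": "s_1", "product_id": "p_9", "action": "update", "note": "second"}
store_b = {"store_id": "s_2", "product_id": "p_9", "action": "update", "note": "first"}

print(composite_key(store_a) == composite_key(store_a_again))  # same tenant, product, action
print(composite_key(store_a) == composite_key(store_b))        # different tenant
```

Including the store ID in the key keeps one tenant's events from suppressing another tenant's events for the same product.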
Deduplication Patterns ->
Explore common deduplication strategies and their use cases.