Deduplication

Deduplication ignores redundant events by comparing them against previously processed events within a time window. You control what makes events "identical" - from exact payload matching to comparing only specific fields that matter to your use case.

Deduplication addresses common scenarios where redundant events cause unnecessary processing:

  • Multi-app installations: Deduplicate by request ID when multiple apps send the same webhook
  • Noisy updates: Deduplicate by everything except volatile fields like inventory or timestamps
  • Request redelivery: Remove duplicate webhooks and events that are sent multiple times by the producer

Events identified as duplicates are ignored and not delivered to the destination. You can view events that were ignored as part of your Requests.

Deduplication strategies

Two deduplication strategies are available, each applied within a configurable window of 1 second to 1 hour:

  • Exact deduplication: The entire event is the key
  • Field-based deduplication: Choose which fields define the key (via inclusion or exclusion)

Deduplication is a best-effort feature and is not guaranteed. Always implement idempotent request handling in your destination.
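
Because a duplicate can still reach the destination, the handler there should tolerate seeing the same event twice. The following is a minimal Python sketch of that idea, not part of the product: `IdempotentHandler` and the `event_id` argument are hypothetical, and the in-memory set stands in for whatever durable store a real service would use.

```python
class IdempotentHandler:
    """Processes each event ID at most once (in-memory sketch)."""

    def __init__(self):
        self._seen = set()    # assumption: a real service would use a durable store
        self.processed = []   # side effects are recorded here for illustration

    def handle(self, event_id, payload):
        # Skip the side effect if this ID was already handled.
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        self.processed.append(payload)
        return True
```

Calling `handle` twice with the same `event_id` performs the side effect only once, so a duplicate that slips past the rule is harmless.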

Exact deduplication

The entire payload serves as the deduplication key. Events must be identical to be considered duplicates. The window is given in milliseconds (60000 is 1 minute).

{
  "type": "deduplication",
  "window": 60000
}

Use when: You only want to drop perfectly identical webhooks, such as retry storms.
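
Conceptually, exact deduplication amounts to hashing the whole event and remembering that hash for the length of the window. The Python sketch below illustrates the idea only; it is not the service's implementation. `ExactDeduplicator` is a hypothetical name, timestamps are passed in explicitly in milliseconds, and it assumes the window is measured from the first occurrence of a key.

```python
import hashlib
import json


class ExactDeduplicator:
    def __init__(self, window_ms):
        self.window_ms = window_ms
        self._first_seen = {}  # key -> timestamp (ms) that opened the current window

    @staticmethod
    def key_for(event):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def is_duplicate(self, event, now_ms):
        key = self.key_for(event)
        first = self._first_seen.get(key)
        if first is not None and now_ms - first < self.window_ms:
            return True
        # New key, or the previous window has expired: start a new window.
        self._first_seen[key] = now_ms
        return False
```

With a 60000 ms window, an identical event arriving 30 seconds after the first is reported as a duplicate, while one arriving a full minute later is treated as new.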

Field-based deduplication

Define the deduplication key by either including specific fields or excluding volatile ones.

Include fields

Only specified fields serve as the deduplication key. Events with matching values in these fields are considered duplicates.

{
  "type": "deduplication",
  "window": 300000,
  "include_fields": ["headers.x-request-id"]
}

Use when: You have a unique identifier that defines duplicate events, like request IDs or composite keys.

Exclude fields

Everything except specified fields serves as the deduplication key.

{
  "type": "deduplication",
  "window": 300000,
  "exclude_fields": ["body.updated_at", "body.inventory_quantity"]
}

Use when: You want to ignore events where only non-essential fields changed.

Field path resolution

When using field-based deduplication:

  • A field path is valid only if it starts with headers, body, query, or path
  • Fields not present in the payload are treated as empty strings
  • Objects and arrays are converted to JSON strings for comparison
  • For non-JSON bodies (e.g., XML), body resolves to the full content, but body.field resolves to an empty string
  • Booleans, numbers, and strings are compared as their respective types
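
The resolution rules above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the actual resolver: `resolve_field` is a hypothetical helper, the event is modelled as a plain dict with `headers`, `body`, `query`, and `path` keys, and the exact JSON serialization used for objects and arrays (compact, sorted keys) is a guess.

```python
import json

ALLOWED_ROOTS = ("headers", "body", "query", "path")


def resolve_field(event, path):
    """Resolve a dotted path such as 'body.user.id' against an event dict."""
    root, *rest = path.split(".")
    if root not in ALLOWED_ROOTS:
        raise ValueError(f"invalid field path: {path}")
    value = event.get(root, "")
    for part in rest:
        # Missing fields, and lookups into a non-JSON (string) body, yield "".
        if not isinstance(value, dict) or part not in value:
            return ""
        value = value[part]
    # Objects and arrays are compared as JSON strings (format is an assumption).
    if isinstance(value, (dict, list)):
        return json.dumps(value, sort_keys=True, separators=(",", ":"))
    # Booleans, numbers, and strings keep their own type.
    return value
```

For an XML body stored as a string, `resolve_field(event, "body")` returns the full content, while `resolve_field(event, "body.id")` returns an empty string.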

Limitations

  • Time windows are limited from 1 second to 1 hour
  • Changing the time window resets deduplication state
  • Deduplication is not guaranteed due to distributed system constraints. You should still implement idempotent request handling in your destination.

Create a deduplication rule

Deduplication rules are applied to a connection, just like any other rule.

  1. Open the connection rules configuration.
  2. Click Add Rule and select Deduplication.
  3. Configure the deduplication settings:
    • Set the Time Window (1 second to 1 hour)
    • Select a Strategy:
      • Exact: Compare entire payloads
      • Include fields: Specify fields to use as the key
      • Exclude fields: Specify fields to ignore
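  4. Click Save to apply your changes.

Alternatively, create the connection through the API with the deduplication rule in its rules array:

POST /2025-07-01/connections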
{
  "name": "shopify-products",
  "source_id": "src_xyz",
  "destination_id": "dest_abc",
  "rules": [
    {
      "type": "deduplication",
      "window": 300000,
      "exclude_fields": ["body.updated_at", "body.inventory_quantity"]
    }
  ]
}

Validation rules:

  • window: Required, between 1000ms (1 second) and 3600000ms (1 hour)
  • Cannot specify both include_fields and exclude_fields
  • Field paths must start with headers, body, query, or path
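
These checks can be mirrored client-side before sending a request. The Python sketch below is one way to do so; `validate_rule` is a hypothetical helper, not an API of the service, and its window bounds default to the 1 second to 1 hour range given under Limitations (pass different bounds if your account enforces others).

```python
ALLOWED_ROOTS = ("headers", "body", "query", "path")


def validate_rule(rule, min_window_ms=1_000, max_window_ms=3_600_000):
    """Return a list of problems found in a deduplication rule dict."""
    errors = []
    window = rule.get("window")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(window, bool) or not isinstance(window, int):
        errors.append("window is required and must be an integer (ms)")
    elif not min_window_ms <= window <= max_window_ms:
        errors.append(f"window must be between {min_window_ms} and {max_window_ms} ms")
    include = rule.get("include_fields")
    exclude = rule.get("exclude_fields")
    if include and exclude:
        errors.append("cannot specify both include_fields and exclude_fields")
    for field in (include or []) + (exclude or []):
        if field.split(".", 1)[0] not in ALLOWED_ROOTS:
            errors.append(f"field path must start with one of {ALLOWED_ROOTS}: {field}")
    return errors
```

An empty list means the rule passed all three checks; otherwise each entry describes one violation.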

From this point forward, redundant events received on the connection are removed before delivery.

Edit a deduplication rule

Editing a deduplication rule changes how redundant events are detected.

  1. Open the connection rules configuration.
  2. Click the deduplication rule to edit.
  3. Modify the time window or deduplication strategy.
  4. Click Save to apply your changes.

Changing the deduplication window resets the deduplication state. Previously seen events are no longer considered, and deduplication starts fresh.

Delete a deduplication rule

To delete a deduplication rule, follow the instructions for configuring connection rules and click the trash icon to remove the deduplication rule from the connection rules.

Example scenarios

Scenario 1: Multi-app webhook deduplication

Deduplicate webhooks from multiple app installations using request ID:

{
  "type": "deduplication",
  "window": 60000,
  "include_fields": ["headers.x-request-id"]
}

Only the request ID determines if events are redundant, regardless of other payload differences.

Scenario 2: Filtering noisy Shopify updates

Ignore product updates where only inventory or timestamps changed:

{
  "type": "deduplication",
  "window": 300000,
  "exclude_fields": [
    "body.variants[].inventory_quantity",
    "body.updated_at",
    "headers.x-shopify-webhook-id"
  ]
}

Events are considered redundant if everything except these volatile fields matches within 5 minutes.
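
To make the effect concrete, the Python sketch below builds a key from everything except the excluded paths. It is an illustration only: `drop_path` and `exclusion_key` are hypothetical helpers, the sample payloads are invented, and it assumes that a `[]` suffix means "apply to every element of the array".

```python
import copy
import hashlib
import json


def drop_path(node, parts):
    """Remove the field named by `parts` from `node` in place."""
    if not parts:
        return
    head, rest = parts[0], parts[1:]
    is_array = head.endswith("[]")
    name = head[:-2] if is_array else head
    if not isinstance(node, dict) or name not in node:
        return
    if not rest:
        del node[name]
    elif is_array:
        # Assumed semantics: descend into every element of the array.
        if isinstance(node[name], list):
            for item in node[name]:
                drop_path(item, rest)
    else:
        drop_path(node[name], rest)


def exclusion_key(event, exclude_fields):
    trimmed = copy.deepcopy(event)  # leave the caller's event untouched
    for path in exclude_fields:
        drop_path(trimmed, path.split("."))
    canonical = json.dumps(trimmed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two product updates that differ only in inventory quantity, `updated_at`, and the webhook ID header produce the same key, while a change to any other field, such as the title, produces a different one.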

Scenario 3: Multi-tenant composite keys

Use multiple fields to create tenant-specific deduplication:

{
  "type": "deduplication",
  "window": 600000,
  "include_fields": [
    "body.store_id",
    "body.product_id",
    "body.action"
  ]
}

Events are redundant only when the same store, product, and action combination occurs within 10 minutes.
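
A composite key can be pictured as a tuple of the included field values. The Python sketch below shows this for the three fields above; `composite_key` is a hypothetical helper, the sample events are invented, and it handles only scalar field values.

```python
def composite_key(event, include_fields):
    """Build a tuple key from dotted paths; missing fields become ''."""
    values = []
    for path in include_fields:
        value = event
        for part in path.split("."):
            value = value.get(part, "") if isinstance(value, dict) else ""
        values.append(value)
    return tuple(values)
```

Two events for the same store, product, and action share a key even if other fields such as the price differ, whereas the same product and action in a different store yields a different key.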

Deduplication Patterns

Explore common deduplication strategies and their use cases.