Deduplicator overview

Deduplicator

Deduplicator helps users identify repeated records and reduce duplication in a dataset.

What Deduplicator is for

Use Deduplicator when the problem is primarily about finding repeated records and reducing duplication in a dataset. The goal is to solve that one focused task well rather than forcing every workflow into a generic process.

Typical value

Deduplicator can help teams:

  • reduce manual spreadsheet work
  • apply the same logic more consistently
  • make outcomes easier to review and explain
  • prepare data for downstream reporting, import, sharing, or review

Common examples

  • find repeated records caused by multiple exports or imports
  • flag records that appear to represent the same entity more than once
  • produce a cleaner working dataset before reporting or import
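
To make the second example concrete, here is a minimal Python sketch of one way to flag records that appear to represent the same entity. It is an illustration only and does not show Deduplicator's own API or matching logic. The function names (`normalize`, `find_duplicates`), the field names, and the normalization rule (lowercase plus collapsed whitespace) are all assumptions made for this example.

```python
from collections import defaultdict


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(value.lower().split())


def find_duplicates(records, key_fields):
    """Return groups of record indexes that share the same normalized key.

    Hypothetical helper for illustration; not part of Deduplicator.
    """
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(normalize(str(record.get(field, ""))) for field in key_fields)
        groups[key].append(index)
    # Only keys seen more than once indicate a possible duplicate.
    return [indexes for indexes in groups.values() if len(indexes) > 1]


# Sample data, invented for this example.
records = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "ada  lovelace", "email": "ADA@example.com"},
    {"name": "Grace Hopper", "email": "grace@example.com"},
]

duplicate_groups = find_duplicates(records, ["name", "email"])
print(duplicate_groups)  # [[0, 1]]
```

Returning groups of indexes instead of deleting rows keeps the result easy to review and explain, which matches the "flag" wording above: a person can inspect each group before anything is removed.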

Typical workflow

A common Deduplicator workflow is:

  1. review the input data
  2. create or select a configuration
  3. start the Run
  4. inspect the output
  5. adjust the configuration if needed and run again
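
The steps above can be sketched as a small loop in Python. This is a hedged illustration of the review, configure, run, inspect, adjust cycle, not Deduplicator's actual interface. The `Config` class, its `key_fields` and `ignore_case` options, and the `run` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Config:
    """Hypothetical configuration: which fields identify a record, and how to compare them."""
    key_fields: tuple
    ignore_case: bool = False


def run(records, config):
    """Split records into first occurrences (kept) and later repeats (flagged)."""
    seen = set()
    kept, flagged = [], []
    for record in records:
        values = [str(record.get(field, "")).strip() for field in config.key_fields]
        if config.ignore_case:
            values = [value.lower() for value in values]
        key = tuple(values)
        if key in seen:
            flagged.append(record)
        else:
            seen.add(key)
            kept.append(record)
    return kept, flagged


# 1. Review the input data (sample rows invented for this example).
records = [
    {"id": "1", "email": "ada@example.com"},
    {"id": "2", "email": "Ada@Example.com"},
    {"id": "3", "email": "grace@example.com"},
]

# 2. Create a configuration.
config = Config(key_fields=("email",))

# 3. Start the run.
kept, flagged = run(records, config)

# 4. Inspect the output: the case-sensitive comparison finds no repeats.
print(len(kept), len(flagged))  # 3 0

# 5. Adjust the configuration and run again.
config = replace(config, ignore_case=True)
kept, flagged = run(records, config)
print(len(kept), len(flagged))  # 2 1
```

Keeping the configuration separate from the data means each rerun changes one setting at a time, so the difference between two outputs can be traced to a specific adjustment.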

Next pages

  • When to use Deduplicator
  • Create a Deduplicator configuration
  • Run Deduplicator
  • Deduplicator examples