5,000 duplicate records at 3 minutes each to review and merge equals 250 hours of work. That is just over six weeks of full-time effort for one person doing nothing but merging CRM records, or roughly 48 hours for a specialist team with purpose-built tooling.
That maths matters because nearly every B2B CRM with more than 5,000 records has duplicates. Typical duplicate rates range from 10% to 30% depending on how many data sources feed the CRM and how long it has gone without deduplication. The question is not whether you have duplicates. It is how many, and whether fixing them manually is a reasonable use of your team's time.
How Duplicates Get into Your CRM
Knowing where duplicates come from lets you close those gaps while you fix the existing records.
- Form submissions: A contact fills in a form with their work email. Six months later, they fill in another form with a personal email. The CRM creates two records because the matching field (email) is different.
- CSV imports: A list is imported without checking for existing records. Every contact on the list gets created as new, even if they already exist in the CRM under a slightly different name or email.
- Integration syncs: A marketing tool, enrichment platform, or calling system creates records in the CRM when it cannot find a match. Slight differences in name formatting ("Jon" vs. "Jonathan") prevent matching.
- Manual entry: A rep creates a contact without searching first, or searches but does not find the existing record because the name is spelled differently.
- Company mergers and acquisitions: Two companies merge, and the same person ends up in the CRM twice, once under each company name.
Fixing Duplicates in HubSpot (Step-by-Step)
Step 1: Use the Built-In Tool
Go to Contacts > Actions > Manage Duplicates. HubSpot scans for duplicate contacts and companies and presents pairs for review.
For each pair, HubSpot shows both records side by side. You choose which record to keep as the primary and which to merge into it. The merged record keeps the activity history and associations from both records; where property values conflict, only one value survives, so the choice of primary matters.
Step 2: Choose the Right Primary Record
When merging, always keep the record with:
- More recent activity (last email, last call, last form submission)
- More complete data (more populated fields)
- Active deal associations
- A business email address rather than a personal one
If the records have conflicting property values (different phone numbers, different job titles), HubSpot lets you choose which value to keep for each field during the merge. Take the time to choose correctly - a merge cannot be undone.
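The four criteria above can be turned into a repeatable rule when you are reviewing many pairs. The following is a minimal sketch, not a HubSpot feature: the field names (`email`, `last_activity`, `open_deals`), the list of personal email domains, and the priority order of the criteria are all assumptions made for illustration.

```python
from datetime import date

# Assumed list of personal email domains; extend it for your own data.
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}


def primary_score(record: dict) -> tuple:
    """Score a record on the four criteria; higher tuples win.

    The ordering (deals, recency, completeness, business email) is an
    assumption, since the criteria are not ranked in the text.
    """
    email = record.get("email") or ""
    domain = email.rsplit("@", 1)[-1].lower()
    has_business_email = bool(email) and domain not in PERSONAL_DOMAINS
    populated = sum(1 for v in record.values() if v not in (None, ""))
    return (
        record.get("open_deals", 0) > 0,            # active deal associations
        record.get("last_activity") or date.min,    # more recent activity
        populated,                                  # more complete data
        has_business_email,                         # business over personal email
    )


def choose_primary(a: dict, b: dict) -> dict:
    """Return whichever of the two records should survive the merge."""
    return max(a, b, key=primary_score)
```

A score like this is best used to pre-select the primary for a human reviewer rather than to merge automatically, because a merge cannot be reversed.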
Step 3: Go Beyond the Built-In Tool
HubSpot's duplicate tool catches obvious matches - identical email addresses and very similar names. It misses:
- Same person with different emails (work vs. personal)
- Same person with name variations ("Rob Smith" and "Robert Smith")
- Same company with different formatting ("ABC Ltd" and "ABC Limited" and "A.B.C. Ltd")
To catch these, export your contact list to CSV and run a fuzzy match on first name + last name + company. Tools like OpenRefine (free) or Dedupe.io can cluster similar records that the CRM missed. Import the results back as a merge list.
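If you would rather script the fuzzy match than use a dedicated tool, the idea can be sketched with Python's standard library. This is an illustration under stated assumptions, not how OpenRefine or Dedupe.io work internally: the column names (`id`, `first_name`, `last_name`, `company`), the suffix table, and the 0.85 threshold are all hypothetical and need tuning against your own export.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Assumed mapping of company suffix abbreviations to a single form.
SUFFIXES = {"ltd": "limited", "inc": "incorporated", "corp": "corporation"}


def normalise(text: str) -> str:
    """Lowercase, strip punctuation, and expand common company suffixes."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(SUFFIXES.get(word, word) for word in text.split())


def match_key(row: dict) -> str:
    """Build the comparison string: first name + last name + company."""
    return normalise(f"{row['first_name']} {row['last_name']} {row['company']}")


def find_candidate_pairs(rows: list, threshold: float = 0.85) -> list:
    """Return (id_a, id_b, score) for every pair at or above the threshold."""
    keyed = [(row, match_key(row)) for row in rows]
    pairs = []
    for (a, key_a), (b, key_b) in combinations(keyed, 2):
        score = SequenceMatcher(None, key_a, key_b).ratio()
        if score >= threshold:
            pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs
```

Comparing every pair grows quadratically with the number of rows, so for large exports it is usual to compare only within a block first (for example, records sharing the same normalised last name). Treat the output as candidates for review, not as confirmed duplicates.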
Step 4: Prevent Future Duplicates
In HubSpot, go to Settings > Objects > Contacts > Deduplication. Enable automatic deduplication for new records based on email address. For form submissions, enable the "always create or update" setting to update existing records rather than creating new ones.
For imports, always use HubSpot's "Update existing contacts and create new ones" option rather than "Create contacts only". This matches incoming records against existing ones by email before creating anything new.
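The logic behind an "update existing and create new" import can be sketched as follows. This is a simplified model, not HubSpot's implementation: the in-memory `existing` dictionary, the field names, and the rule that blank incoming values never overwrite stored ones are assumptions for illustration.

```python
def normalise_email(email: str) -> str:
    """Trim whitespace and lowercase so trivial variations still match."""
    return email.strip().lower()


def upsert_contacts(existing: dict, incoming: list) -> tuple:
    """Match incoming rows to existing records by email before creating.

    `existing` maps a normalised email to a record dict and is updated
    in place. Returns (updated_count, created_count).
    """
    updated = created = 0
    for row in incoming:
        key = normalise_email(row["email"])
        if key in existing:
            # Update the match; skip blanks so good data is not erased.
            changes = {
                field: value
                for field, value in row.items()
                if field != "email" and value not in (None, "")
            }
            existing[key].update(changes)
            updated += 1
        else:
            existing[key] = {**row, "email": key}
            created += 1
    return updated, created
```

Checking the updated and created counts after each import is a quick sanity test: a list of known customers that comes back as mostly "created" suggests the match key is not lining up.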
Fixing Duplicates in Salesforce (Step-by-Step)
Step 1: Enable Duplicate Rules
Go to Setup > Duplicate Management > Duplicate Rules. Salesforce has standard duplicate rules for Leads, Contacts, and Accounts. If they are not active, enable them. If they are active, check the matching rules they use.
Standard matching rules check for exact and fuzzy matches on name and email. You can create custom matching rules that include phone number, company name, or any other field.
Step 2: Run Duplicate Reports
Create a report using the "Contacts & Accounts" report type. Group by email address or by last name + account name. Records that appear multiple times within a group are potential duplicates.
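The same grouping can be run on an exported report outside Salesforce. The sketch below assumes each exported row is a dictionary with hypothetical keys such as `id`, `email`, `last_name`, and `account_name`; adjust them to match your export's column headers.

```python
from collections import defaultdict


def duplicate_groups(records: list, key_fields: list) -> dict:
    """Group record ids by the given fields; keep groups with 2+ members."""
    groups = defaultdict(list)
    for record in records:
        key = tuple((record.get(f) or "").strip().lower() for f in key_fields)
        if all(key):  # ignore records where any grouping field is blank
            groups[key].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

For example, `duplicate_groups(rows, ["email"])` mirrors grouping by email address, while `duplicate_groups(rows, ["last_name", "account_name"])` mirrors the last name + account name grouping.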
For a more thorough check, use Salesforce's "Potential Duplicates" component on record pages, or install a deduplication app from AppExchange (DemandTools, Duplicate Check, or RingLead are well-established options).
Step 3: Merge Records
In Salesforce, navigate to the account or contact record. Click the dropdown menu and select "Find Duplicates" or "Merge Contacts".
Salesforce lets you select up to three records to merge at once. Choose the primary record (the master), then select which field values to keep from each duplicate. The merge combines all child records (activities, opportunities, cases) under the primary record.
Important Salesforce-specific considerations:
- Custom object relationships: Standard merge handles standard lookups but may not correctly reassign records in custom objects. Check custom relationships after merging.
- Campaign membership: Merging contacts combines campaign membership records. If both duplicates were members of the same campaign with different statuses, only one status survives.
- Lead conversion: If one of the duplicates is a Lead and the other is a Contact, convert the Lead first (matching to the existing Contact) rather than merging, as merge does not work cross-object.
Step 4: Prevent Future Duplicates
Configure duplicate rules to "Alert" or "Block" when a user attempts to create a record that matches an existing one. "Alert" shows a warning but lets the user proceed. "Block" prevents the save entirely.
For integrations, ensure every sync includes a matching step that checks for existing records before creating new ones. Most integration tools (Workato, Tray.io, native Salesforce connectors) support this, but it must be configured explicitly.
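The difference between the two actions, and the matching step an integration needs, can be modelled in a few lines. This is a toy model rather than Salesforce's API: `save_contact`, `Action`, `DuplicateBlocked`, and the email-only match are hypothetical names and simplifications.

```python
from enum import Enum


class Action(Enum):
    ALERT = "alert"
    BLOCK = "block"


class DuplicateBlocked(Exception):
    """Raised when a Block rule refuses to save a matching record."""


def save_contact(store: dict, contact: dict, action: Action) -> str:
    """Check for an existing match before creating a record.

    `store` maps a normalised email to the list of records saved under it.
    """
    key = contact["email"].strip().lower()
    if key in store:
        if action is Action.BLOCK:
            raise DuplicateBlocked(f"{key} already exists")
        # ALERT: warn, but let the save proceed and create a duplicate.
        store[key].append(contact)
        return "saved_with_warning"
    store[key] = [contact]
    return "saved"
```

The model makes the trade-off visible: Alert still allows a duplicate to be created, so it only helps when users act on the warning, while Block forces whatever is creating the record to handle the rejection.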
The Scale Problem
Manual deduplication works for small volumes. The maths breaks down quickly at scale:
- 500 duplicates: ~25 hours. One person, under one week. Reasonable for a DIY effort.
- 2,000 duplicates: ~100 hours. One person, about two and a half weeks of full-time work. Questionable ROI on staff time.
- 5,000 duplicates: ~250 hours. One person, six full weeks. Or a specialist team with batch processing tools in roughly 48 hours.
- 10,000+ duplicates: Manual merging is no longer feasible. This requires automated matching algorithms, batch merge tools, and a review-and-approve workflow.
The crossover point where professional deduplication becomes cheaper than DIY is usually around 1,000 to 2,000 duplicates. Below that, a competent CRM admin can handle it during quiet periods. Above that, the staff time cost exceeds the cost of professional CRM cleanup.
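The arithmetic behind these figures is easy to reproduce for your own volumes. The sketch below uses the 3 minutes per merge quoted in this article; the 40-hour working week is an assumption, and the function names are illustrative.

```python
def manual_hours(duplicates: int, minutes_each: float = 3.0) -> float:
    """Hours needed to review and merge the given number of duplicates."""
    return duplicates * minutes_each / 60


def working_weeks(hours: float, hours_per_week: float = 40.0) -> float:
    """Convert hours into full-time weeks (40-hour week is an assumption)."""
    return hours / hours_per_week


for count in (500, 2_000, 5_000):
    hours = manual_hours(count)
    print(f"{count:>5} duplicates: {hours:.0f} hours, {working_weeks(hours):.2f} weeks")
```

Changing `minutes_each` is the quickest way to test the estimate: records with conflicting values or custom object relationships often take longer than a clean pair.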
Measuring Success
After deduplication, track these metrics monthly:
- New duplicates created per month: If this number is not near zero, your prevention rules need tightening.
- Duplicate rate: Total duplicate pairs divided by total records. Target: under 2%.
- Reps flagging duplicate contacts: The front-line signal that duplicates are re-emerging.
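The duplicate rate defined above is a single division, which makes it simple to track in a monthly script. In this sketch the function names are illustrative, and the 2% default comes from the target stated in the list.

```python
def duplicate_rate(duplicate_pairs: int, total_records: int) -> float:
    """Total duplicate pairs divided by total records."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return duplicate_pairs / total_records


def within_target(duplicate_pairs: int, total_records: int, target: float = 0.02) -> bool:
    """True when the duplicate rate is under the target (2% by default)."""
    return duplicate_rate(duplicate_pairs, total_records) < target
```

For example, 150 duplicate pairs in a 10,000-record CRM is a rate of 1.5%, which is inside the target; 300 pairs in the same CRM is 3%, which is not.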
Deduplication without prevention is a temporary fix. The same sources that created the current duplicates - forms, imports, integrations, manual entry - will create new ones at the same rate unless you close the gaps.
When to Call for Help
If your CRM has more than 2,000 duplicate pairs, if the duplicates involve complex custom object relationships, or if previous merge attempts have created data inconsistencies, manual cleanup is not the right approach. Systematic deduplication using batch tools, fuzzy matching algorithms, and automated conflict resolution handles in roughly 48 hours what would take your team six weeks.
The maths is simple: 5,000 duplicates at 3 minutes each equals 250 hours of your team's time, against roughly 48 hours for a specialist. The records end up the same. The cost and timeline do not. If the duplicate problem in your CRM has grown past what manual effort can manage, the most efficient path is to fix it properly once and then put prevention in place so it does not happen again.