CRM Data Cleanup Before Migration: Practical Guide | Aaxonix

CRM data cleanup before migration is the step most teams underestimate, and it is also the step that most often determines whether the project succeeds. Moving records from one platform to another does not fix structural problems buried in the source system. Duplicates, inconsistent field formats, empty required fields, and mismatched picklist values all travel with the data unless you address them first. This guide walks through the full pre-migration cleanup process: auditing your current data, deduplicating records, mapping fields to the destination schema, standardising values, and validating the cleaned dataset before you cut over. Examples throughout reference Zoho CRM as the destination platform, though the methodology applies to any CRM switch.

Why Data Quality Determines Migration Success

A CRM migration moves records, but it does not improve them: the quality of what arrives in the destination depends entirely on what you put in. Teams that skip or rush the cleanup phase typically encounter three categories of failure after go-live. Dirty data is also a common reason CRMs fail in the months after a project ships, long after the technical migration is signed off.

First, duplicate contacts and accounts fragment the customer view. Sales reps work against each other when one rep’s territory includes records that are exact copies of another rep’s active deals. A single customer might appear as three contacts with slightly different email addresses, each with a partial activity history. Merging them after migration is far more expensive than deduplicating before.

Second, field mapping errors corrupt data silently. If the source system stores “Company Size” as a free-text field and the destination expects a fixed picklist, every free-text value that does not match a picklist option either gets dropped or gets written to a catch-all field. Neither outcome is acceptable for a sales team that uses company size to segment its pipeline.

Third, missing required fields block records from being created. Most modern CRMs enforce required fields at the API level. If your source data has 40% of records without a valid email address and the destination treats email as required, those records will fail to import. The team discovers the gap only when someone reports that 3,000 accounts are missing.

Starting the migration with a clean, validated, fully mapped dataset addresses all three failure modes before they reach the new system. The cleanup phase typically takes two to four weeks for a mid-size CRM with 10,000 to 50,000 records. It is not the exciting part of the project, but it is the part that protects every other investment the organisation makes in the new platform.

Auditing Your Existing CRM Data

Before you can clean data, you need to know exactly what you have. A data audit produces three outputs: a record count by object type, a field completeness report, and a data quality issue log.

Step 1: Export a full data snapshot

Export every object you plan to migrate, including contacts, leads, accounts, deals, activities, and any custom objects. Export in CSV or Excel format so you can work with the data outside the source CRM. Include all fields, even ones you suspect are empty. Create a separate export for each object type and date-stamp the files. You will reference these exports throughout the cleanup process.

Step 2: Build a field completeness report

For each exported file, calculate the percentage of non-empty values for every column. A simple pivot table or a short Python script can do this in minutes. Flag any field below 60% completeness for review. Critical fields such as email, phone, or company name that fall below 20% completeness need immediate action: enrich the data, source the missing values, or decide the records are not worth migrating.
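As a rough illustration of the completeness calculation, the sketch below uses only the Python standard library. The column names and sample rows are hypothetical stand-ins for a real contacts export, and the 60% cut-off is the review threshold mentioned above.

```python
import csv
import io


def field_completeness(rows, fieldnames):
    """Return {field: percentage of rows holding a non-blank value}."""
    total = 0
    filled = {name: 0 for name in fieldnames}
    for row in rows:
        total += 1
        for name in fieldnames:
            # Whitespace-only cells count as empty.
            if (row.get(name) or "").strip():
                filled[name] += 1
    if total == 0:
        return {name: 0.0 for name in fieldnames}
    return {name: round(100 * count / total, 1) for name, count in filled.items()}


# Hypothetical sample standing in for a date-stamped contacts export.
sample = io.StringIO(
    "Name,Email,Phone\n"
    "Ana Ruiz,ana@example.com,\n"
    "Ben Cho,,\n"
    "Cy Dee,cy@example.com,555-0100\n"
    "Di Eze,, \n"
)
reader = csv.DictReader(sample)
report = field_completeness(reader, reader.fieldnames)
for name, pct in report.items():
    flag = "REVIEW" if pct < 60 else "ok"
    print(f"{name}: {pct}% {flag}")
```

For a real export, replace the in-memory sample with an open file handle; the function itself does not change.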

Step 3: Document data quality issues

Create a data quality issue log with five columns: object type, field name, issue type (blank, inconsistent format, invalid value, duplicate), estimated record count affected, and recommended action. This log becomes the working document for the cleanup team and provides an audit trail for the migration project.

Object	Field	Issue	Records Affected	Action
Contact	Email	Blank	1,240	Enrich or exclude
Account	Industry	Inconsistent values	3,800	Standardise to picklist
Lead	Phone	Invalid format	620	Reformat or blank
Contact	Name	Duplicate records	890	Deduplicate before export

Deduplication: Finding and Merging Duplicate Records

CRM data cleanup before migration must include a systematic deduplication pass. Duplicates accumulate over years from manual data entry, multiple import rounds, and API integrations that do not check for existing records before creating new ones.

Identify duplicates using multiple match keys

No single field reliably identifies a duplicate. Email addresses change. Company names have variations (“Acme Corp”, “Acme Corporation”, “ACME Corp”). Phone numbers include country codes inconsistently. Use a combination of match keys and apply fuzzy matching rather than exact match only. Effective match key combinations include:

Email address (exact match) for contacts
First name + last name + company name (fuzzy, 85%+ similarity) for contacts
Company domain derived from email for accounts
Phone number with formatting stripped (digits only) for contacts and leads
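The match keys above could be sketched in Python roughly as follows. This is a minimal illustration, not a production matcher: the record layout (first, last, company, email, phone) is an assumption, and the standard library's difflib.SequenceMatcher stands in for whichever fuzzy-matching method your tool of choice uses.

```python
import re
from difflib import SequenceMatcher


def email_key(email):
    return email.strip().lower()


def domain_key(email):
    # Company domain derived from the email; empty if there is no "@".
    local, at, domain = email_key(email).rpartition("@")
    return domain if at else ""


def phone_key(phone):
    # Digits only, so "(415) 555-2671" and "415.555.2671" compare equal.
    return re.sub(r"\D", "", phone)


def name_company_key(rec):
    text = f"{rec['first']} {rec['last']} {rec['company']}"
    return " ".join(text.lower().split())


def match_reason(a, b, threshold=0.85):
    """Return which key flagged the pair as a candidate duplicate, or None."""
    if a["email"] and email_key(a["email"]) == email_key(b["email"]):
        return "email"
    if phone_key(a["phone"]) and phone_key(a["phone"]) == phone_key(b["phone"]):
        return "phone"
    score = SequenceMatcher(None, name_company_key(a), name_company_key(b)).ratio()
    return "name+company" if score >= threshold else None


# Hypothetical records for illustration only.
jon = {"first": "Jon", "last": "Smith", "company": "Acme Corp",
       "email": "jon.smith@acme.example", "phone": "(415) 555-2671"}
john = {"first": "John", "last": "Smith", "company": "Acme Corp",
        "email": "jsmith@acme.example", "phone": ""}
raj = {"first": "Raj", "last": "Patel", "company": "Globex",
       "email": "raj@globex.example", "phone": "415.555.2671"}

print(match_reason(jon, john))
print(match_reason(jon, raj))
```

A phone-only match such as the second pair is a candidate for review rather than proof of a duplicate, since colleagues often share a switchboard number.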

Choose a deduplication tool

For datasets under 5,000 records, a spreadsheet with conditional formatting and manual review is sufficient. For larger datasets, use a dedicated deduplication tool. OpenRefine is a free open-source option that handles fuzzy matching well. Python’s dedupe library suits teams that prefer a scripted, repeatable workflow. Some CRM migration tools like Trujay or Data2CRM include deduplication as part of the migration workflow.

Merge strategy: which record wins

When two records are confirmed duplicates, you need a rule for which values to keep. A common approach is “most recently updated field wins” combined with “most data wins for empty fields.” Concretely: if record A has a phone number and record B does not, keep the phone from A. If both records have a phone number, keep the one from the record updated more recently. Document your merge rules before running any automated merge to ensure consistency.
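One way to express that rule in code is sketched below. It assumes each record carries a record-level "updated" date and that empty strings mean "no value"; both are illustrative assumptions about the export, not a feature of any particular CRM.

```python
from datetime import date


def merge_records(a, b):
    """Merge two confirmed duplicates field by field.

    Rule 1: if only one record has a value, keep that value.
    Rule 2: if both have a value, keep the one from the record
            that was updated more recently.
    """
    newer, older = (a, b) if a["updated"] >= b["updated"] else (b, a)
    merged = {}
    for field in sorted(set(a) | set(b)):
        # Falsy values ("" or None) are treated as empty here.
        merged[field] = newer.get(field) or older.get(field)
    return merged


# Hypothetical duplicate pair.
rec_a = {"id": "A", "email": "jane@example.com", "phone": "+14155552671",
         "title": "", "updated": date(2023, 1, 10)}
rec_b = {"id": "B", "email": "jane@example.com", "phone": "",
         "title": "CTO", "updated": date(2024, 3, 2)}

merged = merge_records(rec_a, rec_b)
print(merged)
```

Here the surviving record takes its ID and title from the newer record B, while the phone number is filled in from A because B had none.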

Handle surviving records

After merging duplicates, audit the surviving records for completeness. A merged contact should have consolidated activity history, not just the data from the winning record. If your source CRM does not merge activity history automatically, export activities for both records and tag them with the surviving record ID before import.

Field Mapping: Aligning Source to Destination Schema

Field mapping is the technical translation layer between your old CRM schema and the new one. Every source field needs an explicit decision: map to a destination field, transform and then map, or exclude.

Build a field mapping document

Create a spreadsheet with one row per source field. Columns should include: source field name, source field type, sample values, destination field name, destination field type, transformation required, and migration status. Work through every object type systematically. This document is the single source of truth for the import configuration.

Source Field	Source Type	Destination Field	Dest Type	Transformation
Company Size	Free text	No. of Employees	Picklist	Map text ranges to picklist values
Created Date	Timestamp	Created Time	DateTime	Convert timezone to UTC
Deal Value	Number (USD)	Amount	Currency	Set currency code; no format change
Lead Source	Picklist (12 values)	Lead Source	Picklist (8 values)	Map values with no destination match to “Other”
Notes (free text)	Long text	Description	Long text	None — direct map

Handle picklist mismatches

Picklist mismatches are the most common source of data loss during migration. The source system may have 15 industry values; the destination has 10. Map each source value to a destination value explicitly. Never let unmapped values default silently to blank. If a source value has no clear destination match, create a new picklist option in the destination CRM or map to a catch-all value like “Other” and document it so the team can review later.
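A minimal sketch of explicit picklist mapping follows. The industry values and the mapping table are hypothetical; the point is that every value either hits an explicit mapping or is sent to the catch-all and logged, never silently blanked.

```python
# Hypothetical source-to-destination mapping, keyed on lower-cased source values.
INDUSTRY_MAP = {
    "software": "Technology",
    "it services": "Technology",
    "saas": "Technology",
    "manufacturing": "Manufacturing",
    "industrial": "Manufacturing",
}


def map_picklist(value, mapping, catch_all="Other"):
    """Return (destination_value, needs_review)."""
    key = (value or "").strip().lower()
    if key in mapping:
        return mapping[key], False
    # No explicit mapping: use the catch-all and flag the record for review.
    return catch_all, True


review_log = []
source_values = ["Software", " SaaS ", "Agriculture", "industrial"]
mapped = []
for row_number, value in enumerate(source_values, start=1):
    dest, needs_review = map_picklist(value, INDUSTRY_MAP)
    mapped.append(dest)
    if needs_review:
        review_log.append((row_number, value))

print(mapped)
print(review_log)
```

The review log doubles as the documentation of catch-all decisions that the team can revisit after go-live.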

Custom fields in the destination

If the source system has fields that do not exist in the destination, decide before migration whether to create custom fields. In Zoho CRM, custom fields can be added to any module and are available immediately in the import process. Creating custom fields ahead of the migration is far simpler than trying to add them mid-import or after go-live. Document every custom field you create: field name, data type, module, and the business reason for keeping it.

Relationship fields and lookups

Lookup fields, which link a contact to an account or a deal to a contact, require special handling. The import process needs to resolve these relationships using a unique identifier, typically the record ID or email address. If the source system uses integer IDs and the destination generates its own IDs, you need a cross-reference table that maps source IDs to destination IDs. Build this table during the import run and use it to populate lookup fields in a second import pass after all parent records are created.
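The two-pass pattern might look roughly like the sketch below. The create_record function is a hypothetical stand-in for whatever create or import call the destination exposes, and the module and field names are illustrative only; the part that carries over is the cross-reference dictionary built in pass one and consumed in pass two.

```python
import itertools

_next_id = itertools.count(1001)
destination = {}  # in-memory stand-in for the destination CRM


def create_record(module, fields):
    """Hypothetical stand-in for the destination's create call; returns the new ID."""
    new_id = f"{module}-{next(_next_id)}"
    destination[new_id] = dict(fields)
    return new_id


source_accounts = [{"id": 17, "name": "Acme Corp"}, {"id": 42, "name": "Globex"}]
source_contacts = [
    {"id": 5, "name": "Jane Doe", "account_id": 17},
    {"id": 6, "name": "Raj Patel", "account_id": 42},
    {"id": 7, "name": "Lee Park", "account_id": 99},  # parent not in the export
]

# Pass 1: create parent records and remember source ID -> destination ID.
account_xref = {}
for acct in source_accounts:
    account_xref[acct["id"]] = create_record("Accounts", {"Account_Name": acct["name"]})

# Pass 2: create child records, resolving the lookup through the cross-reference.
unresolved = []
for contact in source_contacts:
    dest_account = account_xref.get(contact["account_id"])
    if dest_account is None:
        unresolved.append(contact["id"])  # log it rather than import an orphan
        continue
    create_record("Contacts", {"Last_Name": contact["name"], "Account": dest_account})

print(account_xref)
print(unresolved)
```

Persisting the cross-reference to a file after pass one makes the second pass repeatable if it has to be rerun.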

Standardising Formats and Normalising Values

Even when fields are correctly mapped, inconsistent formatting within a field causes problems downstream. Phone numbers, dates, country names, and currency values each have common standardisation issues.

Phone numbers

Normalise every number to one canonical format for storage and matching. A common choice is E.164: a plus sign, the country code, then the subscriber number, with no spaces or punctuation and at most 15 digits in total (+14155552671). If the country of origin is unknown for a record, flag it for manual review rather than guessing. Local numbers written without country codes are especially common in older CRM data.
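A simplified sketch of that normalisation is shown below. It assumes a number either already starts with "+" or belongs to a record whose country code is known and passed in; it does not handle national trunk prefixes or validate numbering plans, for which a dedicated library such as phonenumbers would be a better fit.

```python
import re


def to_e164(raw, default_country_code=None):
    """Return an E.164 string, or None when the number needs manual review."""
    raw = (raw or "").strip()
    digits = re.sub(r"\D", "", raw)
    if not digits:
        return None
    if raw.startswith("+"):
        candidate = digits  # country code already present
    elif default_country_code:
        candidate = default_country_code + digits
    else:
        return None  # country unknown: flag rather than guess
    if len(candidate) > 15:  # E.164 allows at most 15 digits
        return None
    return "+" + candidate


print(to_e164("+1 (415) 555-2671"))
print(to_e164("(415) 555-2671"))
print(to_e164("(415) 555-2671", default_country_code="1"))
```

Records that come back as None go onto the manual review list instead of being imported with a guessed country code.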

Date and time fields

Convert all dates to a consistent format before import. ISO 8601 is the safest format across migration tools and CRM APIs: YYYY-MM-DD for dates, and a full timestamp with an explicit UTC offset (for example 2024-03-07T21:30:00+00:00) for date-time fields. Pay particular attention to timezone handling for activity timestamps. If the source CRM stored times in local timezone and the destination expects UTC, convert during the extract phase, not during import.
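The conversion during the extract phase could be sketched as follows. The source format string and the fixed UTC-5 offset are assumptions for the example; for real data you would more likely pass a zoneinfo.ZoneInfo object for the source region so that daylight saving changes are applied correctly.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso(local_text, source_tz, fmt="%m/%d/%Y %H:%M"):
    """Parse a local timestamp and return it as an ISO 8601 string in UTC."""
    naive = datetime.strptime(local_text, fmt)
    aware = naive.replace(tzinfo=source_tz)  # attach the source timezone
    return aware.astimezone(timezone.utc).isoformat()


def to_iso_date(text, fmt="%d/%m/%Y"):
    """Parse a date-only value and return YYYY-MM-DD."""
    return datetime.strptime(text, fmt).date().isoformat()


# Assumed source timezone: a fixed offset of UTC-5, for illustration only.
source_tz = timezone(timedelta(hours=-5))

print(to_utc_iso("03/07/2024 16:30", source_tz))
print(to_iso_date("07/03/2024"))
```

Making the source format an explicit parameter forces a decision on ambiguous values such as 07/03/2024, which is day-first in some exports and month-first in others.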

Country and region fields

Country names have dozens of valid variants: “United States”, “USA”, “US”, “United States of America”. Many CRM picklists use a fixed country list, often based on the ISO 3166 country names. Map every source variant to the exact name the destination expects. The same applies to state and province fields: full names versus abbreviations vary by market and by CRM vendor.

Currency fields

If the source system stored amounts in multiple currencies without explicit currency codes, flag those records for manual review before migration. Migrating an amount as a number without its currency context makes the value meaningless in the destination. Check whether the source CRM stored exchange rates historically and whether the destination needs those rates or just the converted amount in a base currency.
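Separating records that are safe to migrate from those that need review can be as simple as the sketch below. The field names and the set of accepted currency codes are hypothetical; in practice the set would come from the currencies enabled in the destination.

```python
from decimal import Decimal

KNOWN_CODES = {"USD", "EUR", "GBP"}  # assumed set of currencies in use


def split_by_currency(deals):
    """Return (ready, review): deals with a recognised currency code, and the rest."""
    ready, review = [], []
    for deal in deals:
        code = (deal.get("currency") or "").strip().upper()
        if code in KNOWN_CODES:
            # Decimal avoids the rounding surprises of binary floats for money.
            ready.append({**deal, "currency": code, "amount": Decimal(deal["amount"])})
        else:
            review.append(deal)
    return ready, review


deals = [
    {"id": 1, "amount": "1200.00", "currency": "usd"},
    {"id": 2, "amount": "950", "currency": ""},
    {"id": 3, "amount": "400", "currency": None},
]
ready, review = split_by_currency(deals)
print([d["id"] for d in ready], [d["id"] for d in review])
```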

Data Validation and Pre-Migration Testing

Validation confirms that the cleaned, mapped dataset will import successfully before you run the production migration. A proper validation pass catches errors that are invisible in spreadsheet review but fatal during import.

Schema validation

Run the cleaned dataset against the destination CRM’s API validation rules before the migration. Most CRM APIs return descriptive errors for invalid values, missing required fields, and type mismatches. Write a validation script that attempts to create records via the API in dry-run mode or uses a sandbox environment. Log every error with the record identifier and field name so the data team can fix issues in the cleaned export rather than in the live system.
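Before spending sandbox or API calls, a local pre-check can catch the most common errors. The sketch below is not the destination's validation logic: the SCHEMA dictionary is a hand-written, hypothetical description of a few destination rules, and the field names are illustrative. It shows the logging shape described above, one entry per record identifier and field.

```python
import re

# Hypothetical description of destination rules, maintained by hand.
SCHEMA = {
    "Last_Name": {"required": True},
    "Email": {"required": True, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
    "Industry": {"required": False, "allowed": {"Technology", "Manufacturing", "Other"}},
}


def validate(record_id, record, schema=SCHEMA):
    """Return a list of (record_id, field, problem) tuples; empty means clean."""
    errors = []
    for field, rule in schema.items():
        value = (record.get(field) or "").strip()
        if not value:
            if rule.get("required"):
                errors.append((record_id, field, "missing required value"))
            continue
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append((record_id, field, "invalid format"))
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append((record_id, field, "not a valid picklist value"))
    return errors


good = {"Last_Name": "Doe", "Email": "jane@example.com", "Industry": "Technology"}
bad = {"Last_Name": "", "Email": "not-an-email", "Industry": "Tech"}

print(validate("C-1", good))
for error in validate("C-2", bad):
    print(error)
```

A clean local pass does not replace the sandbox import, since the destination may enforce rules that the hand-written schema misses.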

Count reconciliation

Before and after each import run, reconcile record counts. For example: the source system had 12,400 contacts, the cleaned dataset has 11,200 (1,200 excluded as duplicates or blanks), and the destination has 11,198 after import (2 failed). Reconciliation gives the migration team confidence in what went in and a clear action list for what failed. Document the reconciliation numbers in the migration log.
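The reconciliation arithmetic is simple enough to script so that it runs after every import. The function below is an illustrative sketch using the contact counts from the example in this section; any non-zero "unexplained" figure means records went missing without a logged exclusion or failure.

```python
def reconcile(source, excluded, imported, failed):
    """Check that every source record is accounted for after an import run."""
    cleaned = source - excluded
    unexplained = cleaned - imported - failed
    return {
        "source": source,
        "cleaned": cleaned,
        "imported": imported,
        "failed": failed,
        "unexplained": unexplained,
    }


result = reconcile(source=12_400, excluded=1_200, imported=11_198, failed=2)
print(result)
```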

Spot-check critical records

After the test import into a sandbox, manually verify 20 to 30 records across different object types. Include records that required transformation (picklist mapping, phone reformatting) and records that had complex relationships (contacts with multiple associated deals). Check that all fields migrated correctly and that related records link properly. Spot-checking after every test run builds confidence before the production cut-over.
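To keep the spot-check from drifting towards easy records, the sample can be drawn programmatically. This sketch assumes each record in the import log carries its object type and a flag for whether it was transformed; both field names are hypothetical. It draws a fixed number from every combination so transformed records are always represented.

```python
import random


def pick_spot_checks(records, per_group=5, seed=0):
    """Sample up to per_group records from each (object, transformed) group."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    groups = {}
    for rec in records:
        groups.setdefault((rec["object"], rec["transformed"]), []).append(rec)
    sample = []
    for key in sorted(groups):
        members = groups[key]
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample


# Hypothetical import log: 40 records alternating between two object types.
records = [
    {"id": i, "object": obj, "transformed": i % 3 == 0}
    for i, obj in enumerate(["Contact", "Deal"] * 20)
]
picks = pick_spot_checks(records)
print(len(picks))
```

With four groups and five records from each, this yields 20 records, in line with the 20 to 30 suggested above.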

Run a parallel period if possible

For migrations where business continuity is critical, run the old and new CRM in parallel for one to two weeks after go-live. During this period, continue entering data in both systems and reconcile at the end of the parallel period. This is expensive in terms of effort but provides a safety net if the migration missed something significant. It is most practical for smaller teams or for organisations where a subset of users can run the validation.

Post-Cleanup Checklist Before You Go Live

Once the cleanup, deduplication, field mapping, and validation are complete, run through this checklist before the production migration.

All duplicate records merged and surviving records audited for completeness
Field mapping document finalised and signed off by the business stakeholder for each object type
All picklist values in source data mapped to valid destination values
Phone numbers, dates, and country fields formatted to destination standard
Required fields in the destination CRM populated for all records in the cleaned dataset
Relationship cross-reference table built and validated for all lookup fields
Test import completed in sandbox with zero critical errors
Record count reconciliation documented and discrepancies explained
Spot-check of 20-plus records completed across all object types
Rollback plan documented: what to do if the production migration fails mid-run
Go-live window agreed, legacy system read-only access planned, users notified

A well-structured data migration to Zoho or any other CRM starts with exactly this kind of systematic preparation. Skipping any item on this list adds risk proportional to the volume and complexity of your data.

Frequently Asked Questions

What does CRM data cleanup before migration involve?

CRM data cleanup before migration covers four main activities: auditing existing records for completeness and quality issues, deduplicating contact and account records, building a field mapping document that translates every source field to its destination equivalent, and standardising formats like phone numbers, dates, and picklist values. The goal is a validated, clean dataset that imports without errors into the new CRM.

How long does CRM data cleanup take before a migration?

For a mid-size CRM with 10,000 to 50,000 records across contacts, accounts, and deals, the cleanup phase typically takes two to four weeks. Smaller datasets with good existing data hygiene can be done in one week. Larger or messier datasets, particularly those with many custom objects or years of accumulated duplicates, can take six to eight weeks. Budget time for at least two rounds of test imports before the production cut-over.

What is the best way to find duplicate records in a CRM export?

Use a combination of match keys rather than a single field. For contacts, match on email address (exact), then on first name plus last name plus company name using fuzzy matching at 85% similarity or above. For accounts, match on company domain derived from the email field. Free tools like OpenRefine support fuzzy clustering for datasets up to several hundred thousand records. For programmatic deduplication, the Python dedupe library handles large datasets with configurable match thresholds.

What happens to picklist values that do not exist in the destination CRM?

Unmatched picklist values are handled in one of three ways: create the missing value in the destination CRM before migration, map the source value to the closest existing destination value, or map to a catch-all option like “Other” and flag those records for review after go-live. Never allow unmapped values to silently default to blank. Document every mapping decision in the field mapping document so the business can review and correct after migration if needed.

Should you migrate all historical data or only active records?

Migrating only active records reduces migration scope and cleanup time significantly. A common approach is to migrate all accounts and contacts created in the past three to five years, all open deals regardless of age, and closed-won deals from the past two years. Older inactive records can be archived in a read-only format or kept in the legacy system with a sunset date. Document the data scope decision and get stakeholder sign-off before starting cleanup so the scope does not expand mid-project.

Aaxonix plans and executes CRM data migrations to Zoho CRM, handling audit, cleanup, field mapping, and go-live with a structured methodology that keeps projects on schedule. Book a free consultation to get a no-obligation review of your current CRM data and a migration readiness assessment.

Clean data is the foundation every CRM migration depends on. The time invested in deduplication, field mapping, and validation before go-live pays back immediately: faster user adoption, accurate reporting from day one, and no emergency data fixes in the weeks after launch. The opposite is just as true, since bad data kills user adoption when reps stop trusting what they see on screen. Start with the audit, work through each cleanup layer methodically, and validate in a sandbox before touching the production system. If your team needs support scoping or executing the cleanup phase, see how Aaxonix approaches CRM setup after migration for a view of what comes next.
