CRM data cleanup before migration is the step most teams underestimate, and it is also the step that most often determines whether the project succeeds. Moving records from one platform to another does not fix structural problems buried in the source system. Duplicates, inconsistent field formats, empty required fields, and mismatched picklist values all travel with the data unless you address them first. This guide walks through the full pre-migration cleanup process: auditing your current data, deduplicating records, mapping fields to the destination schema, standardising values, and validating the cleaned dataset before you cut over. Examples throughout reference Zoho CRM as the destination platform, though the methodology applies to any CRM switch.
A CRM migration moves records, but the quality of those records depends entirely on what you put in. Teams that skip or rush the cleanup phase typically encounter three categories of failure after go-live.
First, duplicate contacts and accounts fragment the customer view. Sales reps work against each other when one rep’s territory includes records that are exact copies of another rep’s active deals. A single customer might appear as three contacts with slightly different email addresses, each with a partial activity history. Merging them after migration is far more expensive than deduplicating before.
Second, field mapping errors corrupt data silently. If the source system stores “Company Size” as a free-text field and the destination expects a fixed picklist, every free-text value that does not match a picklist option either gets dropped or gets written to a catch-all field. Neither outcome is acceptable for a sales team that uses company size to segment their pipeline.
Third, missing required fields block records from being created. Most modern CRMs enforce required fields at the API level. If your source data has 40% of records without a valid email address and the destination treats email as required, those records will fail to import. The team discovers the gap only when someone reports that 3,000 accounts are missing.
Starting the migration with a clean, validated, fully mapped dataset eliminates all three failure modes. The cleanup phase typically takes two to four weeks for a mid-size CRM with 10,000 to 50,000 records. It is not the exciting part of the project, but it is the part that protects every other investment the organisation makes in the new platform.

Before you can clean data, you need to know exactly what you have. A data audit produces three outputs: a record count by object type, a field completeness report, and a data quality issue log.
Export every object you plan to migrate, including contacts, leads, accounts, deals, activities, and any custom objects. Export in CSV or Excel format so you can work with the data outside the source CRM. Include all fields, even ones you suspect are empty. Create a separate export for each object type and date-stamp the files. You will reference these exports throughout the cleanup process.
For each exported file, calculate the percentage of non-empty values for every column. A simple pivot table or a short Python script can do this in minutes. Flag any field below 60% completeness for review. Fields below 20% completeness for critical data like email, phone, or company name need immediate action: either enrich the data, source the missing values, or decide the records are not worth migrating.
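As a rough illustration of the completeness calculation, the sketch below uses only the Python standard library. The sample CSV, the column names, and the `field_completeness` and `flag_fields` helpers are hypothetical stand-ins for a real export; the 60% and 20% thresholds are the ones described above.

```python
import csv
import io


def field_completeness(rows, fieldnames):
    """Return {field: percentage of rows with a non-empty value}."""
    total = len(rows)
    report = {}
    for field in fieldnames:
        filled = sum(1 for row in rows if (row.get(field) or "").strip())
        report[field] = round(100 * filled / total, 1) if total else 0.0
    return report


def flag_fields(report, review_below=60.0, critical=(), critical_below=20.0):
    """Return (fields to review, critical fields needing immediate action)."""
    review = [f for f, pct in report.items() if pct < review_below]
    urgent = [f for f in review if f in critical and report[f] < critical_below]
    return review, urgent


# Hypothetical sample standing in for a date-stamped contacts export.
SAMPLE_CSV = """Email,Phone,Company
a@example.com,,Acme
,,Acme
,+14155552671,
,,
,,Globex
,,
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = list(reader)
report = field_completeness(rows, reader.fieldnames)
review, urgent = flag_fields(report, critical=("Email", "Phone", "Company"))
print(report)
print("Review:", review, "Urgent:", urgent)
```

For a real export, replace the in-memory sample with `open()` on the exported file and run the same two functions once per object type.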
Create a data quality issue log with five columns: object type, field name, issue type (blank, inconsistent format, invalid value, duplicate), estimated record count affected, and recommended action. This log becomes the working document for the cleanup team and provides an audit trail for the migration project.
| Object | Field | Issue | Records Affected | Action |
|---|---|---|---|---|
| Contact | Email | Blank | 1,240 | Enrich or exclude |
| Account | Industry | Inconsistent values | 3,800 | Standardise to picklist |
| Lead | Phone | Invalid format | 620 | Reformat or blank |
| Contact | Name | Duplicate records | 890 | Deduplicate before export |
CRM data cleanup before migration must include a systematic deduplication pass. Duplicates accumulate over years from manual data entry, multiple import rounds, and API integrations that do not check for existing records before creating new ones.
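No single field reliably identifies a duplicate. Email addresses change. Company names have variations (“Acme Corp”, “Acme Corporation”, “ACME Corp”). Phone numbers include country codes inconsistently. Use a combination of match keys and apply fuzzy matching rather than exact match only. Effective match key combinations include an exact match on email address, a fuzzy match on first name plus last name plus company name, and, for accounts, a match on the company domain derived from the email field.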
For datasets under 5,000 records, a spreadsheet with conditional formatting and manual review is sufficient. For larger datasets, use a dedicated deduplication tool. OpenRefine is a free open-source option that handles fuzzy matching well. Python’s dedupe library works programmatically. Some CRM migration tools like Trujay or Data2CRM include deduplication as part of the migration workflow.
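To make the idea of combined match keys concrete, here is a minimal sketch using `difflib.SequenceMatcher` from the Python standard library. It is not how OpenRefine or the dedupe library work internally; the record layout, the `candidate_duplicates` helper, and the 0.85 threshold are assumptions for illustration. It compares every pair of records, so it only suits small exports.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join((text or "").lower().split())


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def candidate_duplicates(contacts, threshold=0.85):
    """Return (id_a, id_b, reason) tuples that a person should review."""
    pairs = []
    for a, b in combinations(contacts, 2):
        email_a, email_b = normalise(a["email"]), normalise(b["email"])
        if email_a and email_a == email_b:
            pairs.append((a["id"], b["id"], "exact email"))
            continue
        key_a = f'{a["first"]} {a["last"]} {a["company"]}'
        key_b = f'{b["first"]} {b["last"]} {b["company"]}'
        if similarity(key_a, key_b) >= threshold:
            pairs.append((a["id"], b["id"], "fuzzy name+company"))
    return pairs


# Hypothetical records; field names are illustrative only.
contacts = [
    {"id": 1, "first": "Jane", "last": "Doe", "company": "Acme Corp", "email": "jane.doe@acme.com"},
    {"id": 2, "first": "Jane", "last": "Doe", "company": "Acme Corp.", "email": "j.doe@acme.com"},
    {"id": 3, "first": "John", "last": "Smith", "company": "Globex", "email": "john@globex.com"},
    {"id": 4, "first": "J.", "last": "Doe", "company": "ACME", "email": "JANE.DOE@acme.com"},
]

pairs = candidate_duplicates(contacts)
for pair in pairs:
    print(pair)
```

The output is a review list, not a merge instruction: a pair flagged by a fuzzy match still needs a human decision before the records are merged.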
When two records are confirmed duplicates, you need a rule for which values to keep. A common approach is “most recently updated field wins” combined with “most data wins for empty fields.” Concretely: if record A has a phone number and record B does not, keep the phone from A. If both records have a phone number, keep the one from the record updated more recently. Document your merge rules before running any automated merge to ensure consistency.
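The merge rule described above can be sketched in a few lines. The record layout, the `updated` key, and the `merge_records` helper are hypothetical; the logic is simply "take the value from the more recently updated record, and fall back to the other record when that value is empty".

```python
from datetime import date


def merge_records(a, b, fields):
    """Merge two confirmed duplicates into one surviving record.

    If both records have a value, the more recently updated record wins.
    If only one record has a value, that value is kept.
    """
    newer, older = (a, b) if a["updated"] >= b["updated"] else (b, a)
    merged = {"updated": newer["updated"]}
    for field in fields:
        merged[field] = newer.get(field) or older.get(field)
    return merged


# Hypothetical duplicate pair.
record_a = {"updated": date(2023, 1, 10), "phone": "+14155552671", "email": "old@example.com"}
record_b = {"updated": date(2024, 3, 2), "phone": "", "email": "new@example.com"}

merged = merge_records(record_a, record_b, fields=["phone", "email"])
print(merged)
```

Here the phone number survives from record A because record B has none, while the email comes from record B because it was updated more recently.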
After merging duplicates, audit the surviving records for completeness. A merged contact should have consolidated activity history, not just the data from the winning record. If your source CRM does not merge activity history automatically, export activities for both records and tag them with the surviving record ID before import.

Field mapping is the technical translation layer between your old CRM schema and the new one. Every source field needs an explicit decision: map to a destination field, transform and then map, or exclude.
Create a spreadsheet with one row per source field. Columns should include: source field name, source field type, sample values, destination field name, destination field type, transformation required, and migration status. Work through every object type systematically. This document is the single source of truth for the import configuration.
| Source Field | Source Type | Destination Field | Dest Type | Transformation |
|---|---|---|---|---|
| Company Size | Free text | No. of Employees | Picklist | Map text ranges to picklist values |
| Created Date | Timestamp | Created Time | DateTime | Convert timezone to UTC |
| Deal Value | Number (USD) | Amount | Currency | Set currency code; no format change |
| Lead Source | Picklist (12 values) | Lead Source | Picklist (8 values) | Map unmapped values to “Other” |
| Notes (free text) | Long text | Description | Long text | None — direct map |
Picklist mismatches are the most common source of data loss during migration. The source system may have 15 industry values; the destination has 10. Map each source value to a destination value explicitly. Never let unmapped values default silently to blank. If a source value has no clear destination match, create a new picklist option in the destination CRM or map to a catch-all value like “Other” and document it so the team can review later.
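One way to enforce explicit picklist mapping is a lookup table plus a log of everything that fell through to the catch-all. The sketch below is illustrative only: the mapping values, the `map_picklist` helper, and the sample data are assumptions, not values from any particular CRM.

```python
from collections import Counter

# Hypothetical mapping from source "Lead Source" values to destination options.
LEAD_SOURCE_MAP = {
    "web": "Website",
    "website": "Website",
    "trade show": "Event",
    "webinar": "Event",
    "referral": "Referral",
}


def map_picklist(values, mapping, fallback="Other"):
    """Map each source value explicitly and log anything sent to the fallback."""
    mapped, unmapped = [], Counter()
    for value in values:
        key = (value or "").strip().lower()
        if not key:
            mapped.append("")  # a blank source value stays blank
        elif key in mapping:
            mapped.append(mapping[key])
        else:
            mapped.append(fallback)
            unmapped[value] += 1  # recorded for review, never dropped silently
    return mapped, unmapped


source_values = ["Web", "trade show ", "Cold Call", "Referral", "Cold Call"]
mapped, unmapped = map_picklist(source_values, LEAD_SOURCE_MAP)
print(mapped)
print(dict(unmapped))
```

The `unmapped` counter feeds directly into the field mapping document, so the business can decide later whether a frequent unmapped value deserves its own picklist option.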
If the source system has fields that do not exist in the destination, decide before migration whether to create custom fields. In Zoho CRM, custom fields can be added to any module and are available immediately in the import process. Creating custom fields ahead of the migration is far simpler than trying to add them mid-import or after go-live. Document every custom field you create: field name, data type, module, and the business reason for keeping it.
Lookup fields, which link a contact to an account or a deal to a contact, require special handling. The import process needs to resolve these relationships using a unique identifier, typically the record ID or email address. If the source system uses integer IDs and the destination generates its own IDs, you need a cross-reference table that maps source IDs to destination IDs. Build this table during the import run and use it to populate lookup fields in a second import pass after all parent records are created.
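The two-pass approach can be sketched as follows. This is a simulation, not a real import: `fake_create` is a hypothetical stand-in for whatever call the destination CRM uses to create a record and return its new ID, and the field names are assumptions.

```python
from itertools import count

_next_id = count(1000)


def fake_create(record):
    """Hypothetical stand-in for the destination's create call; returns a new ID."""
    return f"dest-{next(_next_id)}"


def import_parents(accounts, create_record):
    """Pass 1: create parent records and build the source-to-destination ID table."""
    xref = {}
    for account in accounts:
        xref[account["source_id"]] = create_record(account)
    return xref


def resolve_lookups(contacts, xref):
    """Pass 2: swap each source account ID for the destination ID."""
    resolved, orphans = [], []
    for contact in contacts:
        dest_id = xref.get(contact["account_source_id"])
        if dest_id is None:
            orphans.append(contact["source_id"])  # parent missing: review manually
        else:
            resolved.append({**contact, "account_dest_id": dest_id})
    return resolved, orphans


accounts = [{"source_id": 17, "name": "Acme"}, {"source_id": 42, "name": "Globex"}]
contacts = [
    {"source_id": 501, "account_source_id": 17},
    {"source_id": 502, "account_source_id": 99},  # points at an account that was not migrated
]

xref = import_parents(accounts, fake_create)
resolved, orphans = resolve_lookups(contacts, xref)
print(xref, resolved, orphans)
```

Keeping the orphan list separate makes it obvious which child records lost their parent during cleanup or scoping, before they reach the second import pass.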
Even when fields are correctly mapped, inconsistent formatting within a field causes problems downstream. Phone numbers, dates, country names, and currency values each have common standardisation issues.
Strip all formatting characters such as spaces, brackets, and dashes, then store every number in one consistent format. A common approach is E.164 format: a plus sign, the country code, and the subscriber number with no spaces or punctuation (+14155552671). If country of origin is unknown for a record, flag it for manual review rather than guessing. Phone numbers written as local numbers without country codes are especially common in older CRM data.
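A minimal sketch of this normalisation is shown below. It is deliberately simple: the `to_e164` helper is hypothetical, it does not handle national trunk prefixes or validate number ranges, and it returns `None` whenever the country is unknown so the record can be flagged for review. For production data, a dedicated library such as `phonenumbers` is likely a better fit.

```python
import re


def to_e164(raw, country_code=None):
    """Return an E.164-style string, or None if the number needs manual review.

    country_code is the record's known country calling code as digits
    (for example "1"); pass None when the country is unknown.
    """
    if not raw:
        return None
    text = raw.strip()
    digits = re.sub(r"\D", "", text)  # drop spaces, brackets, dashes, dots
    if not digits:
        return None
    if text.startswith("+"):
        candidate = digits  # country code already present
    elif country_code:
        candidate = country_code + digits
    else:
        return None  # unknown country: do not guess
    if len(candidate) > 15:  # E.164 allows at most 15 digits
        return None
    return "+" + candidate


print(to_e164("(415) 555-2671", country_code="1"))
print(to_e164("+1 415-555-2671"))
print(to_e164("555-2671"))
```

Records that come back as `None` belong in the data quality issue log rather than in the import file.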
Convert all dates to a consistent format before import. ISO 8601 (YYYY-MM-DD) is the safest format across migration tools and CRM APIs. Pay particular attention to timezone handling for activity timestamps. If the source CRM stored times in local timezone and the destination expects UTC, convert during the extract phase, not during import.
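The conversion can be done during extract with the standard `datetime` module. In this sketch the input formats, the `to_iso_date` and `to_utc_iso` helpers, and the fixed UTC-5 offset are assumptions for illustration; a real source system may need a named timezone (for example via `zoneinfo.ZoneInfo`) so that daylight saving changes are handled.

```python
from datetime import datetime, timedelta, timezone


def to_iso_date(text, fmt):
    """Parse a date in the source format and return YYYY-MM-DD."""
    return datetime.strptime(text, fmt).date().isoformat()


def to_utc_iso(local_text, source_tz, fmt="%m/%d/%Y %H:%M"):
    """Interpret a naive timestamp in the source timezone and return it in UTC."""
    naive = datetime.strptime(local_text, fmt)
    aware = naive.replace(tzinfo=source_tz)
    return aware.astimezone(timezone.utc).isoformat()


# Assumed source timezone: a fixed offset of UTC-5.
source_tz = timezone(timedelta(hours=-5))

print(to_iso_date("07/03/2024", "%d/%m/%Y"))
print(to_utc_iso("03/07/2024 18:30", source_tz))
```

Stating the source format explicitly for each field avoids the classic day/month swap, where 07/03/2024 is read as July in one system and March in another.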
Country names have dozens of valid variants: “United States”, “USA”, “US”, “United States of America”. Many CRM picklists use a fixed country list, often based on ISO 3166 names. Map every source variant to the name the destination picklist expects. The same applies to state and province fields: full names versus abbreviations vary by market and by CRM vendor.
If the source system stored amounts in multiple currencies without explicit currency codes, flag those records for manual review before migration. Migrating an amount as a number without its currency context makes the value meaningless in the destination. Check whether the source CRM stored exchange rates historically and whether the destination needs those rates or just the converted amount in a base currency.

Validation confirms that the cleaned, mapped dataset will import successfully before you run the production migration. A proper validation pass catches errors that are invisible in spreadsheet review but fatal during import.
Run the cleaned dataset against the destination CRM’s API validation rules before the migration. Most CRM APIs return descriptive errors for invalid values, missing required fields, and type mismatches. Write a validation script that attempts to create records via the API in dry-run mode or uses a sandbox environment. Log every error with the record identifier and field name so the data team can fix issues in the cleaned export rather than in the live system.
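Before sending anything to a sandbox, it can help to mirror the destination's rules locally and log failures in the same shape. The sketch below does not call any CRM API; the field names, the `RULES` table, and the simplified email pattern are all assumptions that would need to be replaced with the destination's actual requirements.

```python
import re

# Deliberately loose email check; an assumption, not a full validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical rules mirroring what the destination might enforce.
RULES = {
    "Last_Name": {"required": True},
    "Email": {"required": True, "pattern": EMAIL_RE},
    "Lead_Source": {"allowed": {"Website", "Event", "Referral", "Other"}},
}


def validate_record(record, rules):
    """Return a list of (record id, field, message) tuples for one record."""
    errors = []
    for field, rule in rules.items():
        value = (record.get(field) or "").strip()
        if not value:
            if rule.get("required"):
                errors.append((record["id"], field, "missing required value"))
            continue
        if "pattern" in rule and not rule["pattern"].match(value):
            errors.append((record["id"], field, "invalid format"))
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append((record["id"], field, "value not in picklist"))
    return errors


records = [
    {"id": "C-1", "Last_Name": "Doe", "Email": "jane@example.com", "Lead_Source": "Website"},
    {"id": "C-2", "Last_Name": "", "Email": "not-an-email", "Lead_Source": "Cold Call"},
]

error_log = [err for rec in records for err in validate_record(rec, RULES)]
for entry in error_log:
    print(entry)
```

A local check like this does not replace a sandbox import, but it removes the obvious failures cheaply so that sandbox runs surface only the problems that depend on the destination's real behaviour.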
Reconcile record counts before and after each import run. For example: the source system had 12,400 contacts, the cleaned dataset has 11,200 contacts (1,200 excluded as duplicates or blanks), and the destination has 11,198 contacts after import (2 failed). Reconciliation gives the migration team confidence in what went in and a clear action list for what failed. Document the reconciliation numbers in the migration log.
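The reconciliation arithmetic is simple enough to script so that it runs identically after every test import. The `reconcile` helper below is a hypothetical sketch; the numbers are the ones from the example above.

```python
def reconcile(source_count, excluded_count, imported_count):
    """Compare what should have been imported with what actually arrived."""
    expected = source_count - excluded_count
    return {
        "expected": expected,
        "imported": imported_count,
        "failed": expected - imported_count,
    }


result = reconcile(source_count=12_400, excluded_count=1_200, imported_count=11_198)
print(result)
```

A non-zero `failed` value is the trigger to look up the specific records in the import error log before the next run.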
After the test import into a sandbox, manually verify 20 to 30 records across different object types. Include records that required transformation (picklist mapping, phone reformatting) and records that had complex relationships (contacts with multiple associated deals). Check that all fields migrated correctly and that related records link properly. Spot-checking after every test run builds confidence before the production cut-over.
For migrations where business continuity is critical, run the old and new CRM in parallel for one to two weeks after go-live. During this period, continue entering data in both systems and reconcile at the end of the parallel period. This is expensive in terms of effort but provides a safety net if the migration missed something significant. It is most practical for smaller teams or for organisations where a subset of users can run the validation.
Once the cleanup, deduplication, field mapping, and validation are complete, run through this checklist before the production migration.
A well-structured data migration to Zoho or any other CRM starts with exactly this kind of systematic preparation. Skipping any item on this list adds risk proportional to the volume and complexity of your data.
What does CRM data cleanup before migration involve?
CRM data cleanup before migration covers four main activities: auditing existing records for completeness and quality issues, deduplicating contact and account records, building a field mapping document that translates every source field to its destination equivalent, and standardising formats like phone numbers, dates, and picklist values. The goal is a validated, clean dataset that imports without errors into the new CRM.
How long does CRM data cleanup take before a migration?
For a mid-size CRM with 10,000 to 50,000 records across contacts, accounts, and deals, the cleanup phase typically takes two to four weeks. Smaller datasets with good existing data hygiene can be done in one week. Larger or messier datasets, particularly those with many custom objects or years of accumulated duplicates, can take six to eight weeks. Budget time for at least two rounds of test imports before the production cut-over.
What is the best way to find duplicate records in a CRM export?
Use a combination of match keys rather than a single field. For contacts, match on email address (exact), then on first name plus last name plus company name using fuzzy matching at 85% similarity or above. For accounts, match on company domain derived from the email field. Free tools like OpenRefine support fuzzy clustering for datasets up to several hundred thousand records. For programmatic deduplication, the Python dedupe library handles large datasets with configurable match thresholds.
What happens to picklist values that do not exist in the destination CRM?
Unmatched picklist values are handled in one of three ways: create the missing value in the destination CRM before migration, map the source value to the closest existing destination value, or map to a catch-all option like “Other” and flag those records for review after go-live. Never allow unmapped values to silently default to blank. Document every mapping decision in the field mapping document so the business can review and correct after migration if needed.
Should you migrate all historical data or only active records?
Migrating only active records reduces migration scope and cleanup time significantly. A common approach is to migrate all accounts and contacts created in the past three to five years, all open deals regardless of age, and closed-won deals from the past two years. Older inactive records can be archived in a read-only format or kept in the legacy system with a sunset date. Document the data scope decision and get stakeholder sign-off before starting cleanup so the scope does not expand mid-project.
Clean data is the foundation every CRM migration depends on. The time invested in deduplication, field mapping, and validation before go-live pays back immediately: faster user adoption, accurate reporting from day one, and no emergency data fixes in the weeks after launch. Start with the audit, work through each cleanup layer methodically, and validate in a sandbox before touching the production system. If your team needs support scoping or executing the cleanup phase, see how Aaxonix approaches CRM setup after migration for a view of what comes next.