Data migration used to be a long and complicated ordeal. Data loss was a common, very real concern. However, the evolution of data migration tools has made the process faster, easier, and less risky. Let’s take a closer look at data migration and how this once-complex journey has changed for the better.

Data migration is a one-time process of transferring internal data from one storage system to another. Projects that require data migration range from upgrading a server to moving to a new data center and from launching a new application to integrating the resources of a newly acquired company.

Data migration can sometimes be confused with data replication or data integration, but each process is a different kind of data management. Data replication is the periodic copying of data from a data source on one platform to a destination on another, while data integration combines data from disparate sources in a data warehouse destination or analysis tool.

Types of data migration

There are six types of data migration:

  • Storage migration occurs when an organization moves data from one physical storage location to another
  • Application migration is called for when a business changes software applications or vendors, which requires the data to be moved to a new computing environment
  • Business process migration occurs when business applications and their associated data are moving to a new environment; this is often driven by a company reorganization, merger, or acquisition
  • Data center migration involves existing infrastructure and the data it holds being moved to a new location, or the data being moved onto new infrastructure

The remaining two types of data migration — cloud migration and database migration — merit deeper explanations.

What is cloud migration?

Cloud migration is the fastest-growing type of data migration. It involves moving on-premises data or applications to a cloud environment — common options include public clouds, private clouds, and hybrid clouds, although some organizations also use multi-cloud environments. IT experts predict that the majority of large businesses will be operating in the cloud by 2030.

What is database migration?

Database migration is an example of specialized workload migration. Simple database migration might involve moving from one version of a database management system (DBMS) to a newer version. More complex database migrations involve a move where the source DBMS and the target DBMS have different data structures, also known as schemas.
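
To make the schema-change case concrete, here is a minimal sketch using SQLite from Python's standard library. The table and column names (customers_v1, customers, full_name) are hypothetical, and a real DBMS-to-DBMS migration would use each system's own drivers:

```python
# Hypothetical schema-changing migration: the source stores one free-form
# name column, while the target splits it into first and last name.
import sqlite3

src = sqlite3.connect("source.db")
dst = sqlite3.connect("target.db")

# Illustrative source data.
src.execute("CREATE TABLE IF NOT EXISTS customers_v1 (id INTEGER, full_name TEXT)")
src.execute("INSERT INTO customers_v1 VALUES (1, 'Ada Lovelace')")
src.commit()

# The target schema differs from the source schema.
dst.execute("CREATE TABLE IF NOT EXISTS customers (id INTEGER, first_name TEXT, last_name TEXT)")

for row_id, full_name in src.execute("SELECT id, full_name FROM customers_v1"):
    first, _, last = full_name.partition(" ")  # transform step: split the name
    dst.execute("INSERT INTO customers VALUES (?, ?, ?)", (row_id, first, last))
dst.commit()
```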

The data migration process

Ideally, moving data to a new platform, location, or architecture can be completed with no data loss, minimal manual data manipulation or re-creation, and little-to-no downtime. The ETL process (extracting, transforming, and then loading the data) can be especially helpful with complex migrations that involve huge datasets.
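
As a rough illustration of that pattern, the sketch below extracts rows in fixed-size chunks, applies a placeholder transform, and loads each batch into the destination, so a huge dataset never has to fit in memory at once. It assumes a users(id, email) table already exists on both sides; all names are illustrative:

```python
# Hedged ETL sketch: extract in chunks, transform each row, load in batches.
# Assumes a users(id, email) table exists in both databases (illustrative).
import sqlite3

CHUNK = 10_000  # rows per batch; tune to available memory

def transform(row):
    # Placeholder transform: trim and lowercase the email field.
    row_id, email = row
    return row_id, email.strip().lower()

src = sqlite3.connect("source.db")
dst = sqlite3.connect("target.db")

cursor = src.execute("SELECT id, email FROM users")        # extract
while batch := cursor.fetchmany(CHUNK):
    dst.executemany(
        "INSERT INTO users (id, email) VALUES (?, ?)",     # load
        (transform(row) for row in batch),                 # transform
    )
    dst.commit()  # small per-batch transactions limit rework on failure
```
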
There are three phases to every data migration project:

  • Planning is the most important part of any data migration effort. Key considerations when developing your data migration strategy include data sources and destinations, security, and cost.
  • Migration is the active stage of the process, when data is moved from the source to the destination.
  • Post-migration is the last step, when you check to confirm that the migration was executed correctly.
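
A post-migration check can be as simple as comparing row counts and a cheap checksum between source and destination. The sketch below assumes the same hypothetical SQLite pair as above and a numeric id column in each table:

```python
# Minimal post-migration validation: row counts plus a naive checksum.
# Table names and the id key are illustrative assumptions.
import sqlite3

def fingerprint(conn, table, key="id"):
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    checksum = conn.execute(f"SELECT TOTAL({key}) FROM {table}").fetchone()[0]
    return count, checksum

src = sqlite3.connect("source.db")
dst = sqlite3.connect("target.db")

# Map each source table to its destination counterpart.
for src_table, dst_table in {"customers_v1": "customers"}.items():
    if fingerprint(src, src_table) != fingerprint(dst, dst_table):
        raise RuntimeError(f"{dst_table}: source and destination do not match")
```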

Data migration strategy

When creating your data migration plan, you can consider either the big bang or the trickle approach to data migration:

  • Big bang data migration. This method moves all the migrated data in one effort. Advantages of this approach include lower cost, a quicker move, and less complexity. However, this type of data migration requires all systems to be offline for the duration of the migration. Additionally, big bang data migration carries a failure risk that increases in relation to the amount of migrated data. The combination of these two factors makes the big bang approach best for smaller businesses with less data to move and the ability to be fully offline during the migration.
  • Trickle data migration. This method migrates data in incremental phases, with the old and new systems running in parallel until the entire migration process is complete. Advantages of this approach include a reduced risk of error or failures, and no requirement for system-wide downtime. However, due to its piecemeal nature, this type of data migration strategy is more complicated and requires more time to plan and execute. Because big businesses with large amounts of data typically can’t afford downtime — but do have the resources for a complex process — trickle data migration is likely the best path for them to pursue.
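
To make the trickle approach concrete, here is a hedged sketch of a single incremental pass: it finds the newest row already copied (a high-water mark) and moves only what is newer, so both systems can stay online between passes. The events table and its id/payload columns are hypothetical:

```python
# One trickle-migration pass using a high-water mark on an id column.
# Run repeatedly (e.g., on a schedule) until the two systems converge.
import sqlite3

def trickle_pass(src, dst, table="events"):
    # Highest id the destination already holds (0 if it's empty).
    high_water = dst.execute(f"SELECT COALESCE(MAX(id), 0) FROM {table}").fetchone()[0]
    # Copy only rows the destination hasn't seen yet.
    rows = src.execute(f"SELECT id, payload FROM {table} WHERE id > ?", (high_water,))
    dst.executemany(f"INSERT INTO {table} (id, payload) VALUES (?, ?)", rows)
    dst.commit()

src = sqlite3.connect("source.db")
dst = sqlite3.connect("target.db")
trickle_pass(src, dst)
```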

Data migration tools

Organizations can write their own data migration scripts or use off-the-shelf on-premises or cloud-based tools. Self-scripted data migration is a do-it-yourself, in-house solution that may suit small projects, but it doesn't scale well. On-premises tools work well if all the data storage is contained within a single site. Cloud-based data migration tools may be a better choice for organizations moving data to a cloud-based destination.
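
For comparison, a self-scripted migration often just wraps a database's native tooling. The sketch below drives PostgreSQL's pg_dump and pg_restore from Python; the host names, database name, and user are placeholders, and both utilities must be installed and authenticated (for example, via ~/.pgpass):

```python
# Hypothetical self-scripted migration wrapping PostgreSQL's native tools.
# Hosts, database name, and user below are placeholders.
import subprocess

DUMP_FILE = "appdb.dump"

# Dump the source database in PostgreSQL's custom archive format.
subprocess.run(
    ["pg_dump", "-h", "old-db.example.com", "-U", "migrator",
     "-F", "c", "-f", DUMP_FILE, "appdb"],
    check=True,  # raise if the dump fails
)

# Restore the archive into the destination database.
subprocess.run(
    ["pg_restore", "-h", "new-db.example.com", "-U", "migrator",
     "-d", "appdb", DUMP_FILE],
    check=True,
)
```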

Use cases, pros, and cons by tool type:

Self-scripted
  Use cases:
  • Small projects
  • Quick fixes
  • A specific source or destination is unsupported by other tools
  Pros:
  • Can be quick to develop
  • May be inexpensive if requirements are simple
  Cons:
  • Coding skills required
  • Changing needs can increase cost
  • Diverts engineers from more strategic tasks
  • Changes can be difficult if code is not well-documented

On-premises
  Use cases:
  • Compliance requirements prohibit cloud-based or multitenant solutions
  • All data sources and destinations are located at a single site
  • Static data requirements with no plans to scale
  • A capex model is preferred over opex
  Pros:
  • IT team has control of the full stack, from the physical to the application layer
  • Low latency
  Cons:
  • IT team must manage security and software updates
  • IT team must keep the tools up and running

Cloud-based
  Use cases:
  • Data sources and/or destinations are at multiple sites
  • Need to scale up and down to meet dynamic data requirements
  • Data scientists and business analysts/users at different sites need access to common data warehouses and tools
  • An opex model is preferred over capex
  Pros:
  • Agile and scalable enough to handle changing business needs
  • Pay-as-you-go pricing eliminates spending on unused resources
  • On-demand compute power and storage handle demand caused by temporary or bursty events
  • Geographically dispersed users can access data tools
  • Redundant architecture provides the best reliability
  Cons:
  • Security concerns – real or perceived – may lead to internal resistance
  • Solution may not support all required data sources and destinations

IT pros can write software to migrate data, but hand-coding can be taxing, time-consuming, and therefore not cost-efficient. Hand-coding big data integrations can also mean performing integration tasks manually and re-implementing functionality, such as machine learning algorithms, that existing tools already provide.

Using data migration software is a better way to go. The software does the heavy lifting, although it’s important that data engineers still understand what data they are migrating, how much will be migrated, and the differences between the source and destination platforms and schemas. In addition to defining the migration strategy and running the migration, they must also test the results and resolve any issues.

How to select the right data migration tool

Selecting a data migration tool is a key part of the planning process and should be based on the organization's use case and business requirements.

Data sources and destinations

The number and kinds of data sources and destinations are an important consideration. Self-scripting may be able to support any source and destination, but it is simply not scalable. It may work for small projects, but coding data extraction scripts for hundreds of sources is inefficient and wastes precious IT resources.

One caveat for on-premises tools is that the supported sources and destinations may vary depending on the operating system on which the tool runs. Most on-premises and cloud-based data migration tools are compatible with a variety of data sources, as well as popular destinations such as AWS and Microsoft Azure. Cloud-based SaaS tools, moreover, have no OS limitations, and vendors automatically upgrade them to support new versions of sources and destinations.

Reliability

Cloud-based data migration tools have little to no downtime due to their highly redundant architectures. Matching that reliability with on-premises tools is a difficult — if not impossible — ask.

Performance and scalability

Cloud-based migration tools perform exceptionally well. Compute power and storage in the cloud can scale to meet dynamic data migration requirements. On-premises tools cannot automatically scale up and down as needed because they're limited by the hardware on which they run.

Security

Data migration tools may have to meet security and compliance requirements. This may rule out some cloud-based tools, but many are compliant with SOC 2, HIPAA, GDPR, and other standards and regulations. Some also offer valuable related features such as disaster recovery services.

Pricing

Many factors affect pricing, including the amount of data, number and types of sources and destinations, and service level. No particular type of data migration tool will always be the lowest-cost solution for any given data migration project.

Cloud-based data migration tools typically have pay-as-you-go pricing. For most data migration projects, a cloud solution provides the best pricing — and some cloud services even offer a free tier of service for some organizations. Because some of the pricing models can be a bit confusing, however, it’s important to be sure that you’re comparing apples to apples when it comes to cloud-based solutions.

Start your cloud data migration

Planning a data migration or replication? Stitch offers an easy-to-use, cloud-first ETL tool that can replicate or migrate data from sources to destinations without compromising data quality. Automation makes the job of getting data for analysis faster, easier, and more reliable. Stitch streams all your data directly to your analytics warehouse so that business stakeholders can get the most value from their data analysis and from business intelligence programs that draw on a variety of datasets.

Sign up now for a free Stitch trial and complete your data transfer from source to destination in minutes. It’s fast, easy to get started, and there are no limits on data volume during your trial.
