Replication Methods define the approach Stitch takes when extracting data from a source during a replication job. They can also affect how data is loaded into your destination and your overall row usage.


Replication Method types

For any table you set to replicate, Stitch will use one of three methods to replicate your data:

Log-based Incremental Replication

Log-based Incremental Replication is a replication method in which Stitch identifies modifications to records - including inserts, updates, and deletes - using a database’s binary log files.

Note: This Replication Method is available only for Amazon DynamoDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL-backed databases that support binary log replication, and requires manual intervention when table structures change. Learn more about Log-based Incremental Replication here.
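
To make the idea concrete, here is a minimal Python sketch of replaying logged changes against a copy of a table. The `LogEvent` type, the `apply_log` function, and the event format are hypothetical illustrations; they are not Stitch's implementation or any database's actual log format, and the sketch assumes a delete simply removes the row from the copy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogEvent:
    """One hypothetical change record read from a database log."""
    op: str                      # "insert", "update", or "delete"
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new row contents (None for deletes)


def apply_log(replica: dict, events: list) -> dict:
    """Replay log events, in order, against an in-memory copy of a table."""
    for event in events:
        if event.op in ("insert", "update"):
            replica[event.key] = event.row
        elif event.op == "delete":
            # Assumption of this sketch: a delete removes the row outright.
            replica.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return replica


events = [
    LogEvent("insert", 1, {"id": 1, "name": "Ada"}),
    LogEvent("insert", 2, {"id": 2, "name": "Grace"}),
    LogEvent("update", 1, {"id": 1, "name": "Ada L."}),
    LogEvent("delete", 2),
]
replica = apply_log({}, events)
print(replica)
```

Because every insert, update, and delete appears as its own event, only the changes since the last job need to be read, rather than the whole table.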

Key-based Incremental Replication

Key-based Incremental Replication is a replication method in which Stitch identifies new and updated data using a column called a Replication Key.

For example: Stitch will use a column like updated_at to identify records that have been updated since a specified time, and then only replicate those records.
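
The selection step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Stitch's actual code: `extract_incremental` and the `bookmark` variable are hypothetical names, and the inclusive `>=` comparison is an assumption of the sketch.

```python
from datetime import datetime


def extract_incremental(rows, replication_key, bookmark):
    """Select rows whose Replication Key value is at or after the bookmark.

    Returns the selected rows and the bookmark to use for the next job.
    """
    # Assumption: the comparison is inclusive, so rows sharing the
    # bookmark value are selected again rather than skipped.
    selected = [r for r in rows if r[replication_key] >= bookmark]
    new_bookmark = max((r[replication_key] for r in selected), default=bookmark)
    return selected, new_bookmark


rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]

# Value saved at the end of the previous (hypothetical) replication job.
bookmark = datetime(2024, 1, 5)

selected, new_bookmark = extract_incremental(rows, "updated_at", bookmark)
print([r["id"] for r in selected], new_bookmark)
```

Note that a row removed from the source never shows up in this selection, because there is no longer a Replication Key value to compare.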

If Log-based Incremental Replication isn’t feasible or available for a data source, Key-based Incremental Replication is the next best option. Learn more about Key-based Incremental Replication here.

Full Table Replication

Full Table Replication is a replication method in which all rows in a table - including new, updated, and existing - are replicated during every replication job.

If a table doesn’t have a column suitable for Key-based Incremental Replication, or if Log-based Incremental Replication is unavailable, this method will be used to replicate data. Learn more about Full Table Replication here.
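
The effect on row usage can be illustrated with some simple arithmetic in Python. The table size and job frequency below are hypothetical numbers chosen for the example, and `full_table_job` is an illustrative name rather than anything Stitch provides.

```python
def full_table_job(table):
    """Full Table Replication: every row is extracted on every job."""
    return list(table)


# Hypothetical 1,000-row table that does not change during the day.
table = [{"id": i} for i in range(1, 1001)]

# Hypothetical schedule: one replication job every 30 minutes.
jobs_per_day = 48

rows_replicated = sum(len(full_table_job(table)) for _ in range(jobs_per_day))
print(rows_replicated)  # 48000
```

Even though no row changed, every job replicates the full table, so the replicated row count grows with both table size and job frequency.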


Compare Replication Methods

The table below contains a high-level look at each of Stitch’s Replication Methods and how they compare to each other.

Note: This is not intended as a substitute for the full documentation for each Replication Method. Stitch recommends reading the full documentation for each Replication Method before selecting one, as defining replication incorrectly can lead to data discrepancies, latency, and increased row usage.

| | Full Table | Key-based Incremental | Log-based Incremental |
|---|---|---|---|
| Availability | All integrations except MongoDB v11-01-2016 | All integrations | Select Amazon DynamoDB, Microsoft SQL Server, MongoDB, MySQL, Oracle, and PostgreSQL-backed database integrations |
| Soft deletes | Supported | Supported | Supported |
| Hard deletes | Sometimes supported | Not supported | Sometimes supported |
| View support | Supported | Supported | Unsupported |
| Structural changes | Detected and automatically handled by Stitch | Detected and automatically handled by Stitch | Require manual intervention |
| Configuration requirements | None | A column suitable as a Replication Key with one of the following data types: DATETIME, INTEGER, TIMESTAMP, FLOAT, INT64, NUMBER, OBJECTID, UUID. Note: Refer to the MongoDB Replication Keys guide for considerations specific to MongoDB. | An Amazon DynamoDB, Microsoft SQL Server, MongoDB, MySQL, Oracle, or PostgreSQL-backed database that (1) supports database log replication and (2) can be configured to use Stitch’s required database settings |

Define a table's Replication Method

How Replication Methods are defined depends on the type of integration being used:

  • Database integrations: Replication Methods are defined by you when tables are set to replicate. A table’s Replication Method can be changed at any time in the Table Settings page.

  • SaaS integrations: Stitch pre-defines the Replication Methods used for every table set to replicate, with the exception of the following integrations:

    To learn more about the Replication Methods used by a particular SaaS integration, refer to the Schema section in the integration’s documentation.

  • Webhook integrations: Because webhook data is sent to Stitch in real time, only new records are ever replicated from a webhook source. This can be thought of as using Key-based Incremental Replication with a Replication Key of created_at.


