Full Table Replication is a replication method in which all rows in a table - including new, updated, and existing - are replicated during every replication job. In this guide, we’ll cover:

  1. How it works (with examples),
  2. When it should be used, and
  3. Limitations of this Replication Method

How Full Table Replication works

Tables that use Full Table Replication are replicated in full during each replication job. Regardless of whether a record is new or simply modified, all records in the table will be selected for extraction.

If Full Table Replication were a SQL query, it would look like this:

SELECT column_you_selected_1,
       column_you_selected_2,
       [...]
  FROM schema.table
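
As a rough illustration, the sketch below runs that kind of query against an in-memory SQLite table across two simulated jobs. The `orders` table, its columns, and the `full_table_extract` helper are hypothetical and are not part of Stitch. The point is only that every job returns every row, whether or not the row changed.

```python
import sqlite3

def full_table_extract(conn, table, columns):
    """Return every row of `table` for the selected columns (no WHERE clause)."""
    # Identifiers are interpolated directly; acceptable for a sketch with trusted names.
    query = f"SELECT {', '.join(columns)} FROM {table}"
    return conn.execute(query).fetchall()

# Hypothetical source table with three records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "new"), (2, "new"), (3, "new")])

job_1 = full_table_extract(conn, "orders", ["id", "status"])

# Only one record changes between jobs...
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 2")

# ...but the next job still extracts all three rows.
job_2 = full_table_extract(conn, "orders", ["id", "status"])

print(len(job_1), len(job_2))
```

Both jobs extract the same number of rows, even though only a single record was modified between them.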

When Full Table Replication should be used

Full Table Replication may be a good fit if:

  1. Records are hard deleted from the source.
  2. The table doesn’t contain a suitable column for Key-based Incremental Replication.
  3. Log-based Incremental Replication is unavailable for the source.
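
To see why hard deletes point towards Full Table Replication, consider the sketch below. It compares a simplified key-based incremental query with a full table query after a record is deleted. The `customers` table, the `updated_at` column, and the bookmark logic are assumptions made for illustration, not Stitch's actual implementation.

```python
import sqlite3

# Hypothetical source table; updated_at stands in for a replication key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, updated_at INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, 100), (2, 110), (3, 120)])

# State after a first job: all three ids are known, bookmark is the max updated_at.
previously_seen = {row[0] for row in conn.execute("SELECT id FROM customers")}
bookmark = 120

# A record is hard deleted from the source.
conn.execute("DELETE FROM customers WHERE id = 2")

# Simplified key-based incremental query: only rows at or past the bookmark.
incremental_ids = {row[0] for row in conn.execute(
    "SELECT id FROM customers WHERE updated_at >= ?", (bookmark,))}

# Full table query: every row that still exists in the source.
full_table_ids = {row[0] for row in conn.execute("SELECT id FROM customers")}

# The incremental result says nothing about id 2; the full snapshot shows it is gone.
missing_from_snapshot = previously_seen - full_table_ids
print(incremental_ids, missing_from_snapshot)
```

The incremental query returns only the row at the bookmark and carries no signal that a record disappeared, while the full snapshot makes the deletion visible by omission. How the missing record is then handled in the destination is outside the scope of this sketch.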

Limitations of Full Table Replication

Before you select Full Table Replication as the Replication Method for a table, you should be aware of the limitations this method can have. Being aware of these limitations can help prevent data discrepancies and ensure your data is replicated in the most efficient manner possible.

The limitations of Full Table Replication are:

Limitation 1: Can cause latency

How large a source table is - that is, how many records the table contains - can affect how quickly Stitch is able to extract data from a source.

In the case of large tables using Full Table Replication, Stitch can only extract data as quickly as it is returned. This means that if a database or SaaS application returns data slowly, especially for a large table, latency in the replication process may increase. Because Full Table Replication extracts every record during every job, this is more likely than with incremental Replication Methods.

Limitation 2: Increased row consumption

Tables using Full Table Replication are replicated fully during every replication job, regardless of whether individual records were updated or not.

The more records a table contains, the more rows will count towards usage. When paired with a high Replication Frequency, a single table can quickly consume an entire month’s row quota.

For example: A table contains 10,000 records and is using Full Table Replication. The integration’s Replication Frequency is every 30 minutes. The table below shows the number of rows replicated for the table per job as well as the total number used since the first job:

Job name   Start time   Rows replicated this job   Total rows replicated
Job 1      00:00        10,000                     10,000
Job 2      00:30        10,000                     20,000
Job 3      01:00        10,000                     30,000
Job 4      01:30        10,000                     40,000
Job 5      02:00        10,000                     50,000

If the integration were to continue replicating every 30 minutes until 23:59:59, it would run 48 jobs and this table would use 480,000 rows in 24 hours. Depending on the Stitch plan you’re using, this type of usage can quickly use up your quota and cause potential overages.
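
The arithmetic behind that figure can be sketched as a small helper. The function name `rows_replicated` and the 30-day month are assumptions for illustration. The estimate also assumes the table size stays constant and that every scheduled job runs.

```python
def rows_replicated(table_rows, frequency_minutes, period_hours=24):
    """Estimate rows replicated by Full Table Replication over a period.

    Every job replicates the whole table, so usage is jobs * table_rows.
    """
    jobs = (period_hours * 60) // frequency_minutes
    return jobs * table_rows

# 10,000 records, replicated every 30 minutes, over 24 hours (48 jobs).
daily = rows_replicated(10_000, 30)

# The same table over an assumed 30-day month.
monthly = rows_replicated(10_000, 30, period_hours=24 * 30)

print(daily, monthly)  # 480000 14400000
```

Doubling the interval to 60 minutes halves the job count and therefore the row usage, which is why Replication Frequency matters so much for tables using this method.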

Limitation 3: Unavailable for some integrations

Currently, Full Table Replication is unavailable for MongoDB integrations. MongoDB only supports Key-based Incremental Replication.

Full Table Replication is supported for all other database and SaaS integrations.


