When you add a new column to a table in your data source, what happens in Stitch? What about syncing additional columns on already-syncing tables? Depending on the type of integration and the Replication Method the table uses, there are a few possible outcomes.
New Columns in the Data Source & Stitch
When a new column is added in a data source, Stitch will first need to perform a structure sync to detect it.
- It may take some time for Stitch to detect the new column and display it in the Integration Details page.
- The following applies to both database and non-webhook SaaS integrations.
After the structure sync completes, the column will be automatically synced and, if the integration supports whitelisting columns, display in the Integration Details page. Data for the column will then replicate based on the table’s Replication Method.
If the table isn’t currently selected to sync, the new column will display inside Stitch (if the integration supports whitelisting columns) but will have to be manually set to sync.
Syncing Additional Columns on Already-Syncing Tables
In this case, we aren’t talking about brand new columns in the data source, but previously existing columns that have never been set to sync in Stitch. How Stitch handles the syncing of additional columns depends on the table’s Replication Method.
Full Table Replication
For tables using Full Table Replication, data in the newly-synced column will be available for all rows - including new and existing - the next time the table is successfully replicated.
Incremental Replication
For tables using Incremental Replication, data in the newly-synced column will be available only for rows added AFTER the column is synced. Existing rows must be backfilled to make the data available.
Getting newly-synced column data into existing rows requires a full re-sync of the table. Because this can significantly impact your row count and we don’t want to re-replicate data without your say-so, we leave inserting newly-synced column data into existing rows up to you.
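The difference between the two Replication Methods can be illustrated with a small simulation. This is only a toy model, not Stitch's implementation: the function names, the `id` primary key, the `updated_at` Replication Key, and the `email` column are all hypothetical, and the bookmark comparison is simplified.

```python
from typing import Any, Optional

Row = dict[str, Any]


def replicate_full(source: list[Row], selected: list[str]) -> dict[int, Row]:
    """Toy Full Table run: rebuild every destination row from the source."""
    return {row["id"]: {col: row[col] for col in selected} for row in source}


def replicate_incremental(
    source: list[Row],
    destination: dict[int, Row],
    selected: list[str],
    key: str,
    bookmark: Optional[int],
) -> Optional[int]:
    """Toy Incremental run: copy only rows past the saved bookmark."""
    for row in source:
        # Simplification: strict comparison; a real pipeline may compare inclusively.
        if bookmark is None or row[key] > bookmark:
            destination[row["id"]] = {col: row[col] for col in selected}
    # The highest key value seen becomes the new bookmark.
    return max((row[key] for row in source), default=bookmark)


source = [
    {"id": 1, "updated_at": 10, "name": "a", "email": "a@example.com"},
    {"id": 2, "updated_at": 20, "name": "b", "email": "b@example.com"},
]
columns = ["id", "updated_at", "name"]  # "email" exists but is not yet synced
dest: dict[int, Row] = {}
bookmark = replicate_incremental(source, dest, columns, "updated_at", None)

# Set "email" to sync, then a new row arrives in the source.
columns = columns + ["email"]
source.append({"id": 3, "updated_at": 30, "name": "c", "email": "c@example.com"})
bookmark = replicate_incremental(source, dest, columns, "updated_at", bookmark)
missing_before = [pk for pk, row in dest.items() if "email" not in row]

# Clearing the bookmark forces every row to be copied again (the backfill).
bookmark = replicate_incremental(source, dest, columns, "updated_at", None)
missing_after = [pk for pk, row in dest.items() if "email" not in row]

# A Full Table run always carries the new column for every row.
full_copy = replicate_full(source, columns)
```

In this sketch, rows 1 and 2 lack `email` after the incremental run because they fall at or below the saved bookmark. Only clearing the bookmark, which stands in for a reset here, brings the column into those rows.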
Backfilling existing rows
If you need to backfill already-replicated rows with data from the newly selected column, you can reset the table’s Replication Key. This will force a full re-replication of the table, populate the column in existing rows, and replicate new records.
Important: Before resetting Replication Keys, note that a reset:
- Will delete and re-create your destination tables with a full re-replication of your source data.
- Will lead to increased row counts which will count towards your limit.
- Cannot be interrupted or reversed once confirmed.
If you have questions or concerns about resetting Replication Keys, reach out to support before proceeding.
Resetting database integration Replication Keys
Replication Keys in database integrations can be reset at the integration or the table level.
- At the integration level, the reset will clear the replication key value for ALL tables AND queue a full re-replication for all tables in the integration.
- At the table level, the reset will clear the replication key value AND queue a full re-replication for that table only.
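The two reset scopes can be pictured as operations on a map of saved replication key values. The sketch below is illustrative only and assumes a hypothetical in-memory `bookmarks` dictionary; it does not reflect how Stitch stores this state internally.

```python
from typing import Optional


def reset_table(bookmarks: dict[str, Optional[str]], table: str) -> list[str]:
    """Clear the saved key value for one table; return the tables queued for re-replication."""
    bookmarks[table] = None
    return [table]


def reset_integration(bookmarks: dict[str, Optional[str]]) -> list[str]:
    """Clear the saved key value for every table; return the tables queued for re-replication."""
    for table in bookmarks:
        bookmarks[table] = None
    return list(bookmarks)


# Hypothetical saved state: table name -> last replicated key value.
bookmarks = {"customers": "2017-03-01", "orders": "2017-03-02"}

queued_table = reset_table(bookmarks, "orders")
state_after_table_reset = dict(bookmarks)

queued_all = reset_integration(bookmarks)
state_after_integration_reset = dict(bookmarks)
```

After the table-level call only `orders` loses its saved value, while the integration-level call clears every table. This is why the integration-level reset has the larger impact on row counts.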
To reset Replication Keys, do the following:
1. Click into the integration from the Stitch Dashboard page.
2. Choose what to reset:
   - To reset the entire integration: Click the Settings link and proceed to step 3.
   - To reset a table: Locate the table you want and click into it. Click the Table Settings link, located near the top right corner, and proceed to step 3.
3. Scroll down to the Reset Replication Keys section.
4. Click the Reset Keys button.
5. When prompted, click OK to confirm. A Success! message will display at the top of the page.
At this point, a full re-replication of the integration or table will be queued. Note: If there is a large volume of data to be replicated, it may take some time before you see the changes in your data warehouse.
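One way to check whether the backfill has reached your data warehouse is to count the rows where the newly-synced column is still NULL. The sketch below uses an in-memory SQLite database as a stand-in for a warehouse; the `customers` table, the `email` column, and the `count_unfilled` helper are hypothetical, and the `UPDATE` statement only imitates the effect of a completed re-replication.

```python
import sqlite3


def count_unfilled(conn: sqlite3.Connection, table: str, column: str) -> int:
    """Count rows where the given column is still NULL."""
    # Identifiers cannot be bound as parameters, so pass trusted names only.
    query = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" IS NULL'
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO customers (id, email) VALUES (?, ?)",
    [(1, None), (2, None), (3, "c@example.com")],
)

# Before the backfill: rows replicated earlier have no value in the column.
unfilled_before = count_unfilled(conn, "customers", "email")

# Stand-in for the re-replication populating the existing rows.
conn.execute("UPDATE customers SET email = 'backfilled@example.com' WHERE email IS NULL")
unfilled_after = count_unfilled(conn, "customers", "email")
```

A count that stays above zero does not necessarily mean the backfill failed, since the column may legitimately be empty for some rows in the source.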
Resetting SaaS integration Replication Keys
Resetting the Replication Keys for a SaaS integration is done by changing the Historical Sync date in the Integration Settings page. When this date is changed, all saved values will be overwritten AND a full re-replication of the integration will be queued.
Note: This feature may not be available for some integrations. Because this approach uses date-based replication, some integrations may be incompatible. For example: Pardot doesn’t support date-based replication, meaning this feature will not be available for Pardot connections.
Changing the Historical Sync date has its own set of considerations and gotchas. Please refer to the Syncing Historical SaaS Data guide for more info.