Getting Started with Stitch Enterprise

Part Four

Replication

From selecting the data you want, to detailed replication reports, Stitch’s flexibility puts replication control into the hands of you and your team.

With Stitch, we have complete control over our data consolidation process. We’re able to view the schema used by Stitch to sync NetSuite data, and select exactly which tables get synced.

Saqib Bedi

Product Manager, charity: water

Select only the data you want

For many of our integrations, you can configure exactly the data Stitch replicates by selecting the tables, fields, and endpoints you want in your data warehouse. Choose full or incremental loads and define how often you want a replication job to run — from every minute to once every 24 hours.

For databases, Stitch automatically detects all of the databases, schemas, and tables we have access to and displays them in-app for selection. To see the available data for our SaaS integrations, refer to the Schema section of our SaaS integration documentation.

You can find more information about selecting data, Stitch’s replication methods, and how to define the replication frequency in our documentation.
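To make the difference between full and incremental loads concrete, here is a minimal sketch in Python. It is not Stitch's implementation: the function names, the `updated_at` column, and the inclusive bookmark comparison are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def full_load(source_rows: List[Row]) -> List[Row]:
    """Full load: every row is replicated on every run."""
    return list(source_rows)


def incremental_load(
    source_rows: List[Row], key: str, bookmark: Optional[Any]
) -> Tuple[List[Row], Optional[Any]]:
    """Incremental load: replicate rows at or after the saved bookmark.

    Returns the selected rows and the bookmark to save for the next run.
    The inclusive comparison (>=) is an assumption of this sketch.
    """
    if bookmark is None:
        picked = list(source_rows)  # first run: nothing saved yet
    else:
        picked = [row for row in source_rows if row[key] >= bookmark]
    new_bookmark = max((row[key] for row in picked), default=bookmark)
    return picked, new_bookmark


# Hypothetical source table with an "updated_at" column.
rows = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-02"},
]
first_run, saved = incremental_load(rows, "updated_at", None)
rows.append({"id": 3, "updated_at": "2024-01-03"})
second_run, saved_again = incremental_load(rows, "updated_at", saved)
```

In this sketch the second run picks up only the rows at or after the saved bookmark, while a full load would return all three rows again.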

Complete historical data

After an integration is initially connected, Stitch will automatically detect and replicate all historical data. For SaaS integrations, you can define the date from which Stitch should begin replicating historical data during the initial setup of the integration. Any data newer than this date will be automatically selected for replication, eliminating the need for manually backfilling data.

Transformations

Our goal is to get your data from source to destination in a useful, raw format. We define useful as data types and structures that are easy to work with, and raw as staying as close to the original representation of the data as possible. We believe that what you do with raw data once you have it depends on your needs.

For these reasons, Stitch’s data pipeline performs only the transformations necessary to load data into a given destination. While the specific transformations will depend on the destination you choose, they may include translating one database’s data type into another’s supported type or breaking nested structures into subtables.

You can find more information about Stitch’s de-nesting of nested structures and detailed loading behavior guides for each of our destinations in our documentation.
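As a rough illustration of breaking nested structures into subtables, the Python sketch below flattens nested objects into prefixed columns and moves each nested list into its own table. The naming scheme (double underscores, `parent_key`, `position`, `value`) is an assumption for this example and may not match what a given destination actually receives. Only one level of lists is handled.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def denest(table: str, record: Row, pk: str) -> Dict[str, List[Row]]:
    """Split one record into a flat parent row plus one subtable per nested list."""
    tables: Dict[str, List[Row]] = {table: [{}]}
    parent = tables[table][0]

    def flatten(prefix: str, value: Row) -> None:
        for name, item in value.items():
            column = f"{prefix}{name}"
            if isinstance(item, dict):
                # Nested object: fold its fields into prefixed parent columns.
                flatten(f"{column}__", item)
            elif isinstance(item, list):
                # Nested list: each element becomes a row in a subtable,
                # linked back to the parent by its primary key.
                sub = tables.setdefault(f"{table}__{column}", [])
                for position, element in enumerate(item):
                    row: Row = {"parent_key": record[pk], "position": position}
                    if isinstance(element, dict):
                        row.update(element)
                    else:
                        row["value"] = element
                    sub.append(row)
            else:
                parent[column] = item

    flatten("", record)
    return tables


# Hypothetical source record.
order = {
    "id": 7,
    "customer": {"name": "Ada", "address": {"city": "Paris"}},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}],
}
result = denest("orders", order, "id")
```

Here `result` holds two tables: `orders`, with one flat row, and `orders__items`, with one row per list element.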

Automatic handling of schema changes

We built Stitch to resiliently handle schema changes and minimize any interruptions to replication. Should a change cause issues with loading your data, Stitch will notify you so the problem can be quickly addressed.

Our documentation goes into more detail about how Stitch handles specific schema change scenarios.

Detailed replication logs and reports

You can keep an eye on Stitch’s replication progress using the Extraction Logs and Loading Reports, which are located in every integration.

While Stitch will notify you in the event of errors, the granularity of the Extraction Logs and Loading Reports may simplify your troubleshooting.

Enterprise plans include 60 days’ worth of extraction and loading data. You can find more info about Extraction Logs and Loading Reports in our documentation.

Handling of deleted records

Stitch can identify only soft deletes of source records. Unlike a hard delete, where the record is removed entirely from the source, a soft delete uses a specific column like deleted_at to indicate whether a record has been deleted.

If a record is hard deleted, Stitch will not detect the change, meaning that the record will remain in the data warehouse. If you need to account for hard deletes, you can queue a full re-replication of the integration or table.
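The distinction above can be sketched in a few lines of Python. This is an illustration only, not a Stitch feature: the helper names are hypothetical, and the key comparison assumes you can list primary keys from both the source and the warehouse.

```python
from typing import Any, Dict, Iterable, List, Set

Row = Dict[str, Any]


def active_rows(rows: Iterable[Row], deleted_column: str = "deleted_at") -> List[Row]:
    """Keep only rows whose soft-delete column is empty."""
    return [row for row in rows if row.get(deleted_column) is None]


def hard_deleted_keys(source_keys: Iterable[Any], warehouse_keys: Iterable[Any]) -> Set[Any]:
    """Keys still present in the warehouse but gone from the source."""
    return set(warehouse_keys) - set(source_keys)


# Hypothetical warehouse contents: record 2 was soft deleted at the source,
# record 3 was hard deleted at the source and therefore still looks active.
warehouse = [
    {"id": 1, "deleted_at": None},
    {"id": 2, "deleted_at": "2024-02-10"},
    {"id": 3, "deleted_at": None},
]
source_ids = [1, 2]

still_active = active_rows(warehouse)
stale = hard_deleted_keys(source_ids, [row["id"] for row in warehouse])
```

Filtering on the soft-delete column hides record 2, but record 3 can only be found by comparing keys against the source or by re-replicating the table.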

Pinpoint troublesome records

From time to time, Stitch may run into problems when attempting to load data into your destination. Data may be rejected by a destination for a variety of reasons, including table names that exceed the supported character limit, integer values that fall outside the supported range, object naming collisions, and so on. Each destination handles data differently, so the reasons for rejection vary from warehouse to warehouse.

If Stitch is unable to load data due to destination incompatibilities, the rejection will be logged and a record created in a table named _sdc_rejected. This table acts as a log for a specific integration’s rejected records, and includes information about why a record was rejected, the date it occurred, and the name of the destination table.

While Stitch will notify you if a loading error occurs, the rejected record table can help you quickly pinpoint the root of the problem. You can find more information about resolving rejected record issues in our documentation.
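One way to use a rejected records table is to group its rows by destination table and reason. The sketch below uses an in-memory SQLite database as a stand-in for a data warehouse. The column names (`table_name`, `reason`, `rejected_at`) and the sample rows are assumptions for illustration, so check the actual columns of _sdc_rejected in your destination before adapting the query.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE _sdc_rejected (table_name TEXT, reason TEXT, rejected_at TEXT)"
)
conn.executemany(
    "INSERT INTO _sdc_rejected VALUES (?, ?, ?)",
    [
        ("orders", "integer out of range", "2024-03-01"),
        ("orders", "integer out of range", "2024-03-02"),
        ("customers", "column name collision", "2024-03-02"),
    ],
)


def rejection_summary(connection: sqlite3.Connection) -> list:
    """Count rejected records per destination table and reason, most frequent first."""
    query = """
        SELECT table_name, reason, COUNT(*) AS rejected, MAX(rejected_at) AS latest
        FROM _sdc_rejected
        GROUP BY table_name, reason
        ORDER BY rejected DESC, table_name
    """
    return connection.execute(query).fetchall()


summary = rejection_summary(conn)
```

Each row of `summary` gives a destination table, a rejection reason, how many records were rejected for that reason, and the most recent rejection date.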
