Much like the data part of a cell phone plan, each Stitch plan is allotted a certain number of replicated rows per month. For detailed info on pricing and what’s included in each plan, refer to the pricing page on our website.
In this guide, we’ll cover the basics of Stitch billing, how to check and understand your usage, and some tips for keeping your row count low.
Stitch counts the following as a ‘replicated row’:
- A new row, or a never-before-replicated row replicated through Stitch,
- An updated row, or an existing row that’s been changed,
- A sub-row created from de-nesting nested data structures, and
- A copy of an existing row. For example: Rows in tables that are replicated fully during each replication job or rows replicated as a result of resetting Replication Keys.
Check your usage in Stitch
On the Stitch Dashboard page, you can view the total number of replicated rows for all of your integrations for the current billing period.
The reset date - or the day your row count will reset to zero - can be found in the Plan Details section of your Billing page, accessed by clicking the User menu (your icon) > Billing.
Understand your usage
When viewing the number of replicated rows in Stitch, you may be surprised by the totals. You may ask yourself: “How did Stitch replicate this many rows? There aren’t that many in my source or destination!”
Keep in mind that the total reported by Stitch is the number of replicated rows. The number of rows Stitch replicates is directly impacted by:
- The number of tables set to replicate. The more tables you select, the more rows Stitch is likely to replicate.
- The Replication Methods used by the replicating tables. Tables using Full Table Replication can increase your row usage.
- Integrations’ replication schedules. Integrations scheduled to replicate on a frequent basis can lead to increased row usage.
- The volume and structure of the data in the selected tables. Some Stitch destinations - like Redshift and PostgreSQL - will break apart nested records and count each sub-record as a row. Refer to the Nested data structures guide for more info and examples.
Because Stitch counts updated rows, copies of existing rows, and rows created from de-nesting towards your total usage, the total of replicated rows and the total number of rows in your data sources or destination may not be equal.
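As a rough illustration of why the two totals can differ, the sketch below adds up the four kinds of replicated rows described earlier. The class name and all of the figures are hypothetical and chosen only for the arithmetic; this is not how Stitch computes usage internally.

```python
from dataclasses import dataclass


@dataclass
class UsageBreakdown:
    """Hypothetical tally of the four kinds of replicated rows."""
    new_rows: int      # never-before-replicated rows
    updated_rows: int  # existing rows that changed
    sub_rows: int      # rows created by de-nesting
    copied_rows: int   # re-replicated copies (Full Table, key resets)

    def total(self) -> int:
        return (self.new_rows + self.updated_rows
                + self.sub_rows + self.copied_rows)


# Made-up example: a source that holds only 1,000 distinct rows.
usage = UsageBreakdown(new_rows=1_000, updated_rows=250,
                       sub_rows=3_000, copied_rows=5_000)
print(usage.total())  # 9250
```

In this made-up case the source contains 1,000 rows, but the replicated-row total is 9,250 because updates, sub-rows, and copies all count.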
Reduce your usage
While you can change your plan at any time to accommodate changing volume needs, below are some tried-and-true tips for reducing your row usage and staying within your plan’s row allotment:
Reduce your integrations' Replication Frequency
Replication Frequency refers to how often - based on the time of the last completed attempt - Stitch will attempt to replicate data from a data source. Generally, the more often an integration is scheduled to replicate, the higher the number of rows Stitch replicates for the integration.
If you’re able to get by without the freshest data, consider changing your integrations’ Replication Frequency to a less frequent interval, such as every hour or every six hours.
Keep in mind that the Replication Frequency setting applies to the entire integration, not individual tables. This is especially important if there are a lot of tables that use Full Table Replication in the integration.
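To get a feel for how frequency interacts with Full Table Replication, here is a back-of-the-envelope estimate. It assumes a 30-day billing period, a table whose size stays constant, and that every scheduled job runs and completes; the function names and numbers are illustrative only.

```python
MINUTES_PER_DAY = 24 * 60


def jobs_per_period(frequency_minutes: int, period_days: int = 30) -> int:
    """Number of replication jobs that fit in the period (assumed)."""
    return (period_days * MINUTES_PER_DAY) // frequency_minutes


def full_table_rows_per_period(table_rows: int,
                               frequency_minutes: int,
                               period_days: int = 30) -> int:
    """Every job re-replicates every row, so usage scales with job count."""
    return table_rows * jobs_per_period(frequency_minutes, period_days)


# A 10,000-row table replicated every 30 minutes vs. every six hours.
print(full_table_rows_per_period(10_000, 30))   # 14400000
print(full_table_rows_per_period(10_000, 360))  # 1200000
```

Under these assumptions, moving from a 30-minute to a six-hour schedule cuts the estimated usage for this one table by a factor of twelve.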
Understand Stitch's Replication Methods
Stitch uses one of three Replication Methods to replicate data from your data sources. To keep your row usage down, Stitch recommends familiarizing yourself with each of these methods before selecting tables for replication. This will ensure you set Stitch up to accurately and efficiently replicate your data.
- Key-based Incremental Replication - Key-based Incremental Replication is a replication method in which Stitch identifies new and updated data using a column called a Replication Key.
- Log-based Incremental Replication - Log-based Incremental Replication is a replication method in which Stitch identifies modifications to records - including inserts, updates, and deletes - using a database’s binary log files.
- Full Table Replication - Full Table Replication is a replication method in which all rows in a table - including new, updated, and existing - are replicated during every replication job.
For integrations that allow you to configure Replication Methods for selected tables, Stitch recommends using an incremental method whenever possible. This can significantly reduce the amount of redundant data that’s replicated by Stitch.
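The difference between Full Table and Key-based Incremental Replication can be sketched with a toy in-memory table. This is a simplified model, not Stitch's implementation: the function names are hypothetical, the Replication Key is an integer `updated_at` column, and the sketch uses a strict greater-than comparison against the saved value, which may not match how Stitch compares keys.

```python
from typing import Any, Optional

Row = dict[str, Any]


def full_table_job(rows: list[Row]) -> list[Row]:
    """Full Table: every row is replicated on every job."""
    return list(rows)


def key_based_job(rows: list[Row], replication_key: str,
                  bookmark: Optional[int]) -> tuple[list[Row], Optional[int]]:
    """Key-based Incremental: only rows past the saved key value.

    Assumption: strict '>' comparison, for simplicity.
    """
    selected = [r for r in rows
                if bookmark is None or r[replication_key] > bookmark]
    new_bookmark = max((r[replication_key] for r in selected),
                       default=bookmark)
    return selected, new_bookmark


table = [
    {"id": 1, "updated_at": 100},
    {"id": 2, "updated_at": 110},
    {"id": 3, "updated_at": 120},
]

# First job: nothing has been replicated yet, so all rows are selected.
first, bookmark = key_based_job(table, "updated_at", None)

# Between jobs: one row is updated and one row is added.
table[0]["updated_at"] = 130
table.append({"id": 4, "updated_at": 140})

second, bookmark = key_based_job(table, "updated_at", bookmark)
print(len(first), len(second))     # 3 2
print(len(full_table_job(table)))  # 4
```

In the second job, the incremental method replicates only the two changed rows, while Full Table Replication would replicate all four again.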
Get to know your SaaS integrations
While we try to use Key-based Incremental Replication for SaaS integrations whenever possible, replicating high numbers of rows is sometimes unavoidable. This can be because:
- The integration generates massive amounts of data. Mixpanel, for example, typically contains large amounts of data.
- Some tables require Full Table Replication or querying for a time range (attribution window) during each replication job to ensure accuracy.
- The integration contains nested data structures. If you’re using a destination that doesn’t natively support nested structures, Stitch will de-nest these structures and create sub-rows, which will result in a higher number of replicated rows.
For an in-depth walkthrough of how JSON arrays are deconstructed in Stitch, as well as what arrays are in the first place, check out the Handling of Nested Data Structures & Row Count Impact guide.
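As a rough sketch of the row-count effect of de-nesting, the function below counts one row for a record plus one sub-row for every array element, recursing into nested arrays. It assumes that nested objects are flattened into the parent row and only arrays produce sub-rows; the function name and sample record are hypothetical, and the guide mentioned above remains the reference for actual behavior.

```python
from typing import Any


def count_rows(record: dict[str, Any]) -> int:
    """Count the parent row plus sub-rows created from arrays (assumed model)."""
    total = 1  # the record itself
    for value in record.values():
        if isinstance(value, dict):
            # Assumed: a nested object stays in the same row;
            # only arrays inside it add sub-rows.
            total += count_rows(value) - 1
        elif isinstance(value, list):
            for item in value:
                total += count_rows(item) if isinstance(item, dict) else 1
    return total


order = {
    "id": 1,
    "customer": {"name": "Ada"},
    "line_items": [
        {"sku": "A", "tax_lines": [{"rate": 0.10}, {"rate": 0.05}]},
        {"sku": "B", "tax_lines": []},
        {"sku": "C"},
    ],
}

print(count_rows(order))  # 6
```

Here a single source record counts as six rows under this model: one for the order, three for its line items, and two for the tax lines nested inside the first line item.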
To find out more about your SaaS integrations’ data structure and replication methods, we recommend checking out our extensive SaaS integration docs. Every SaaS integration has detailed info about the tables Stitch will replicate and the methods used to do so.
De-select unnecessary data
To keep your row count down and your destination tidy, you can also de-select any tables or columns you don’t need.
For example: If a column contains nested data, additional sub-rows may be created to accommodate loading the data to certain destination types. This will increase the total row count, as Stitch counts sub-rows towards usage. If this column is no longer needed, you could de-select it and lower your usage.
Note: This is only applicable to the integrations that support table and/or column selection.
If all else fails, you can temporarily pause the integration to keep from going over your row limit.
Note: Pausing an integration will only prevent the extraction of additional records. Loading will continue for records that have been extracted prior to the pause.
For example: If there are records currently in Preparing when an integration is paused, Stitch will continue to load these records, complete the current replication job, and count them towards your usage.