In this guide, we’ll cover the basics of Stitch billing, how to check and understand your usage, and some tips for keeping your row count low.


Stitch billing basics

How does Stitch billing work?

Much like the data part of a cell phone plan, each Stitch plan is allotted a certain number of replicated rows per month. For detailed info on pricing and what’s included in each plan, refer to the pricing page on our website.

What's a replicated row?

Stitch counts the following as a ‘replicated row’:

  • A new row, or a never-before-replicated row replicated through Stitch,
  • An updated row, or an existing row that’s been changed,
  • A sub-row created from de-nesting nested data structures, and
  • A copy of an existing row. For example: Rows in tables that are replicated fully during each replication job or rows replicated as a result of resetting Replication Keys.

Understand your usage

Source rows ≠ replicated rows

When viewing the number of replicated rows in Stitch, you may be surprised by the totals. You may ask yourself:

“How did Stitch replicate this many rows? There aren’t that many in my source or destination!”

We understand that this can be confusing. Keep in mind that row usage in Stitch is the total number of replicated rows. This means that the number of rows in the source won’t necessarily be equal to your row usage in Stitch.

Because Stitch counts updated rows, copies of existing rows, and rows created from de-nesting towards your total usage, the total of replicated rows and the total number of rows in your data sources or destination may not be equal.

Take the following example. As the second replication job shows, the total replicated rows reported by Stitch are cumulative: the count covers every row replicated across all replication jobs, not just the rows currently in the source.

Example demonstrating how source, Stitch-reported, and destination rows are not always equal.
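
The cumulative counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ReplicationJob` and `cumulative_usage` are hypothetical names made up for this sketch, not part of any Stitch API, and the row counts are invented.

```python
from dataclasses import dataclass


@dataclass
class ReplicationJob:
    """Hypothetical summary of one replication job (not a Stitch API object)."""
    new_rows: int = 0      # never-before-replicated rows
    updated_rows: int = 0  # existing rows that changed
    sub_rows: int = 0      # rows created by de-nesting nested data
    copied_rows: int = 0   # existing rows replicated again (e.g. Full Table)

    def replicated_rows(self) -> int:
        # Every category counts towards usage.
        return self.new_rows + self.updated_rows + self.sub_rows + self.copied_rows


def cumulative_usage(jobs):
    """Return the running total of replicated rows after each job."""
    totals, running = [], 0
    for job in jobs:
        running += job.replicated_rows()
        totals.append(running)
    return totals


# Job 1 replicates 3 new rows. Job 2 replicates 1 new row and 2 updated rows.
jobs = [ReplicationJob(new_rows=3), ReplicationJob(new_rows=1, updated_rows=2)]
usage = cumulative_usage(jobs)
print(usage)  # [3, 6]
```

After the second job the source holds four rows, yet the cumulative usage is six, because the two updated rows were counted again.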

Impacts on usage

The number of rows Stitch replicates is directly impacted by:

  1. The number of tables set to replicate. The more tables you select, the higher the potential number of rows.

  2. The Replication Methods used by the replicating tables. Tables using Full Table Replication can increase your row usage.

  3. Integrations’ replication schedules. Integrations scheduled to replicate on a frequent basis can lead to increased row usage.

  4. The volume and structure of the data in the selected tables. For some destinations, like Redshift and PostgreSQL, Stitch will break apart nested records and count each sub-record as a row. Refer to the Nested data structures guide for more info and examples.

Usage examples

Next, we’ll look at some examples of how certain factors can affect row usage in Stitch.

When looking at the examples, note the differences between the source rows and row totals reported in Stitch.

For these examples, assume that:

  • The tables are using Full Table Replication. This means the table is replicated in full during every replication job.
  • Extraction has completed without issue for every scheduled replication job.

In this example, we’ll look at how different replication frequencies can affect the total number of replicated rows.

Below is the total number of replicated rows for a table with 100 rows using Full Table Replication:

Replication Frequency   30 minutes   1 hour   6 hours   12 hours   24 hours
Daily total             4,800        2,400    400       200        100
Billing period total    144,000      72,000   12,000    6,000      3,000

As you can see, slightly reducing the Replication Frequency can greatly reduce the number of replicated rows overall.

While this example only demonstrates row usage for a single table, think about how row usage will increase when there are multiple tables like this one set to replicate.
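
The arithmetic behind these totals can be reproduced with a short Python sketch. It assumes a 30-day billing period (which matches the figures in the table), a constant table size, and that every scheduled job completes. The function name `full_table_rows` and the row counts in `table_sizes` are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def full_table_rows(table_rows, frequency_minutes, days=30):
    """Estimate (daily_total, period_total) for a Full Table Replication table.

    Assumes every scheduled job completes and the table size stays constant.
    `days=30` is an assumed billing period length.
    """
    jobs_per_day = MINUTES_PER_DAY // frequency_minutes
    daily_total = table_rows * jobs_per_day
    return daily_total, daily_total * days


frequencies = [("30 minutes", 30), ("1 hour", 60), ("6 hours", 360),
               ("12 hours", 720), ("24 hours", 1440)]
for label, minutes in frequencies:
    daily, period = full_table_rows(100, minutes)
    print(f"{label:>10}: {daily:>6,} per day, {period:>8,} per billing period")

# Tables in one integration share its frequency, so their totals add up.
table_sizes = [100, 2_500, 40_000]  # hypothetical row counts
integration_period_total = sum(full_table_rows(n, 60)[1] for n in table_sizes)
print(f"{integration_period_total:,}")  # 30,672,000
```

With the hypothetical sizes above, three Full Table tables replicating hourly would add up to roughly 30.7 million replicated rows in a billing period.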

Note: This example is applicable only to destinations that don’t natively support nested data, such as Amazon Redshift or PostgreSQL-based destinations.

In this example, we’ll look at how data structured using JSON arrays can affect the total number of replicated rows.

For destinations that don’t natively support storing nested data, Stitch will “de-nest”, or normalize, complex JSON structures into relations. For JSON arrays, data is unpacked into subtables, where each sub-record is counted as a replicated row.

Source record in JSON format

When Stitch extracts data from a source, it converts the data into JSON format. Below is a sample record from a table named people. Note the best_friends array:

{
   "id":1,
   "name":"Finn",
   "type":"human",
   "best_friends":[
      {
         "id":2,
         "name":"Jake",
         "type":"dog"
      },
      {
         "id":3,
         "name":"Bubblegum",
         "type":"princess"
      },
      {
         "id":4,
         "name":"BMO",
         "type":"robot"
      }
   ]
}

When Stitch loads this record into the destination, two tables will be created: people and people__best_friends. This is due to Stitch unpacking the JSON arrays.

Top-level: people

The top-level table will contain a single record:

id   name   type
1    Finn   human

Subtable: people__best_friends

The subtable will contain three records, one for each item in the best_friends array:

_sdc_source_key_id   _sdc_level_0_id   id   name        type
1                    0                 2    Jake        dog
1                    1                 3    Bubblegum   princess
1                    2                 4    BMO         robot

Row usage totals

This example assumes the table is using Full Table Replication. Below is the total number of replicated rows for the people (one record) and people__best_friends (three records) tables combined:

Replication Frequency   30 minutes   1 hour   6 hours   12 hours   24 hours
Daily total             192          96       16        8          4
Billing period total    5,760        2,880    480       240        120

This example demonstrates that while there is only one record in the source, the number of rows replicated and loaded through Stitch will be four due to de-nesting.

While the example totals may not appear to be significant, think about this as it might relate to real data. Tables can contain hundreds or thousands of records, and every nested array item in every record adds another replicated row, which multiplies overall row usage.
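
To make the de-nesting count concrete, here is a simplified Python sketch that splits the sample record into a top-level table and a subtable, then counts the resulting rows. It is not Stitch's actual loading code: the `denest` function is hypothetical, handles only one level of nesting, and assumes arrays contain objects. The `_sdc_source_key_id` and `_sdc_level_0_id` column names follow the subtable shown above.

```python
def denest(table_name, records, key="id"):
    """Split records into a top-level table and one subtable per array field.

    Simplified sketch: one level of nesting, arrays of objects only.
    """
    tables = {table_name: []}
    for record in records:
        top_level = {}
        for field, value in record.items():
            if isinstance(value, list):
                subtable = tables.setdefault(f"{table_name}__{field}", [])
                for position, item in enumerate(value):
                    subtable.append({
                        "_sdc_source_key_id": record[key],  # parent's key
                        "_sdc_level_0_id": position,        # index in the array
                        **item,
                    })
            else:
                top_level[field] = value
        tables[table_name].append(top_level)
    return tables


finn = {
    "id": 1, "name": "Finn", "type": "human",
    "best_friends": [
        {"id": 2, "name": "Jake", "type": "dog"},
        {"id": 3, "name": "Bubblegum", "type": "princess"},
        {"id": 4, "name": "BMO", "type": "robot"},
    ],
}

tables = denest("people", [finn])
rows_per_job = sum(len(rows) for rows in tables.values())
print({name: len(rows) for name, rows in tables.items()})
# {'people': 1, 'people__best_friends': 3}
print(rows_per_job)  # 4
```

Multiplying `rows_per_job` by the number of jobs per day gives the daily totals in the table above, for example 4 × 48 = 192 at a 30-minute frequency.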


Check your row usage

Row usage for all integrations

On the Stitch Dashboard page, you can view the total number of replicated rows for all of your integrations for the current billing period:

Row usage graph on the Stitch Dashboard page for all integrations

Row usage for an integration

To take a closer look at an individual integration’s usage for the current billing period, click on the integration to open the Integration Details page, and check out the Rows Loaded Over Time section:

Rows loaded over time for an integration

Row usage reset date

The reset date - or the day your row count will reset to zero - can be found in the Your Usage section of your Billing page, accessed by clicking the User menu (your icon) > Billing:

Row usage reset date in the Your Usage section of the Stitch Billing page


Reduce your usage

While you can change your plan at any time to accommodate your data volume needs, below are some tried-and-true tips for reducing your row usage and staying within your plan’s row allotment:

Identify high usage integrations

For many of Stitch’s integrations, row usage shouldn’t be an issue. We attempt to use Key-based Incremental Replication for SaaS integrations whenever possible.

There are, however, times when high row usage may be unavoidable. For example:

  • Data contains many nested structures. This is applicable to row usage only when a destination doesn’t natively support nested JSON structures.

  • Source generates large amounts of data

  • Integration has a high number of tables using Full Table Replication

  • Integration doesn’t currently support table selection

  • Integration uses an for extraction

The list below shows integrations known to be heavy row users, and the potential reasons for their increased usage. If you’re using any of these integrations, you can use the remaining tips in this section to keep your usage down.

To find out more about your SaaS integrations' data structure and replication methods, we recommend checking out our extensive SaaS integration docs. Every SaaS integration has detailed info about the tables Stitch will replicate and the methods used to do so.

Integration Reasons
Any database
  • High data volume

Bing Ads
Club Speed
Codat
Desk
  • No table selection

Facebook Ads
Google Ads
Google Analytics
Google Analytics (AdWords)
Google ECommerce
HubSpot
Intercom
JIRA
Listrak
LivePerson
Mixpanel
MongoDB
MongoDB Atlas
NetSuite
Pardot
  • No table selection

QuickBooks
  • No table selection

Shopify
Square
Stripe
Trello
Xero
Zendesk Support

Reduce Replication Frequencies

Generally, the more often an integration is scheduled to replicate, the higher the number of rows Stitch replicates for the integration.

If you’re able to get by without the freshest data, consider changing your integrations’ Replication Frequency to something less frequent. For example: Every hour or six hours.

Keep in mind that the Replication Frequency setting applies to the entire integration, not individual tables. This is especially important if there are a lot of tables that use Full Table Replication in the integration.

Use an incremental Replication Method

For integrations that support Replication Method configuration, we recommend using either Key-based or Log-based Incremental Replication whenever possible.
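
As a rough illustration of why incremental replication usually lowers usage, the sketch below compares the two approaches for one table. It is a deliberately simplified model with hypothetical function names and made-up numbers: it assumes a 30-day billing period, a constant table size, and that incremental replication counts one initial copy of the table plus each row added or updated afterwards. Actual counts for Key-based or Log-based Incremental Replication may differ somewhat.

```python
def period_rows_full_table(table_rows, jobs_per_day, days=30):
    """Full Table Replication: every row is replicated in every job."""
    return table_rows * jobs_per_day * days


def period_rows_incremental(changed_rows_per_day, initial_rows=0, days=30):
    """Incremental (simplified): one initial copy plus new or updated rows."""
    return initial_rows + changed_rows_per_day * days


# Hypothetical table: 10,000 rows, 500 rows added or updated per day,
# replicated every hour (24 jobs per day).
full = period_rows_full_table(table_rows=10_000, jobs_per_day=24)
incremental = period_rows_incremental(changed_rows_per_day=500,
                                      initial_rows=10_000)
print(f"{full:,} vs {incremental:,}")  # 7,200,000 vs 25,000
```

In this model the Full Table total grows with the replication frequency, while the incremental total depends mainly on how much data actually changes.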

De-select unnecessary data

Note: This is only applicable to the integrations that support table and/or column selection.

To keep your row count down and your destination tidy, you can also de-select any tables or columns you don’t need.

For example: If a column contains nested data, additional sub-rows may be created to accommodate loading the data to certain destination types. This will increase the total row count, as Stitch counts sub-rows towards usage. If this column is no longer needed, you could de-select it and lower your usage.

Pause integrations

If all else fails, you can temporarily pause the integration to keep from going over your row limit.

Note: Pausing an integration will only prevent the extraction of additional records. Loading will continue for records that have been extracted prior to the pause.

For example: If there are records currently in Preparing when an integration is paused, Stitch will continue to load these records, complete the current replication job, and count them towards your usage.

