When Stitch replicates your data, we’ll load it into the destination - or data warehouse - of your choosing. A data warehouse is a central repository of integrated data from disparate sources.
As we currently only allow you to connect one destination to your account, we recommend asking yourself the questions below before making your selection. By fully assessing each choice first, you’ll decrease the likelihood of needing to switch destinations or re-replicate all of your data at a later date.
- Does it support all (or most of) your data sources?
- Will the structure of the data replicated by Stitch work with how you plan to use it?
- Do you need a fully-managed solution, or can you perform maintenance tasks on your own?
- Does the provider’s pricing fall within your budget?
At the end of this article is a visual rollup of how each of Stitch’s destinations stack up against each other. This chart includes some supported limits (ex: length of table names), what task each destination excels at, and so on.
If you’re feeling overwhelmed or you’re unsure of what to look for, don’t worry. For a solid primer on data warehouses and setting the data strategy for your organization, check out our Data Strategy Guide.
Getting Started, Fast
If you simply want to try Stitch or Redshift or if you don’t have the ability to spin up a Redshift cluster of your own in AWS, we recommend trying Panoply. With just a few clicks, you create your own fully-managed Redshift data warehouse and start replicating data in minutes.
Keep in mind that switching to a different destination at any point will require a full re-sync of your data.
Integration & Destination Compatibility
Some integrations may be partially or fully incompatible with some of the destinations offered by Stitch. For example: some destinations don’t support storing multiple data types in the same column. If a SaaS integration sends over a column with mixed data types, some destinations may “reject” the data.
For integrations that allow you to control how data is structured, you may be able to fix the problem at the source and successfully replicate the data. If this is not possible, however, Stitch may never be able to successfully replicate the incompatible data.
While the majority of your data will look the same across our destinations, there are some key differences you should be aware of.
Nested Data Structures
If you decide to use a destination that doesn’t natively support nested structures - meaning Redshift, Panoply, or PostgreSQL - Stitch will de-nest these structures.
This means that Stitch will create subtables and more rows will count against your quota, depending on the level of nesting.
Google BigQuery and Snowflake, however, natively support nested structures. This means nested records will be stored intact in these destinations.
Check out the Handling of Nested Data & Row Count Impact for an in-depth look at what we mean by nested records, how Stitch handles nested data, and what those records will look like in your data warehouse.
BigQuery & Append-Only Replication
The current release of Stitch’s BigQuery destination uses Append-Only Incremental Replication.
For SaaS and database tables that use Incremental Replication, Stitch will replicate data into BigQuery in an append-only fashion. Data that updates existing an existing row will NOT overwrite the row. Instead, a new row with the new data will be appended to the end of the table.
This means that there can be many different rows in a BigQuery table with the same Primary Key, each representing what the data was at that moment in time. These are not duplicate rows - they’re “snapshots” of the record at different points.
For more info, check out this detailed explanation on Append-Only Replication or our recommendations for querying append-only tables.
Redshift vs. PostgreSQL
If you’ve worked with PostgreSQL in the past and are considering Redshift as your data warehouse, you should note that Redshift implements some Postgres features differently. In addition, some features, data types, and functions aren’t supported at all.
Setup & Maintenance
With the exception of a self-hosted PostgreSQL instance, all the destinations offered by Stitch are cloud-hosted databases, meaning you won’t have to factor in server maintenance when making a decision.
BigQuery, Heroku, Panoply, and Snowflake are fully-managed solutions that include routine maintenance and upgrades in their plans. Setting up Snowflake requires more technical know-how than the other aforementioned destinations.
Redshift, Amazon Postgres-RDS, and self-hosted Postgres instances will require you to perform and schedule maintenance tasks on your own. Spinning up a Redshift and Amazon Postgres-RDS instance will require technical knowledge and familiarity with the Amazon Web Services (AWS) console.
Every destination offered by Stitch has its own pricing structure. Some providers charge by overall usage, others by an hourly rate or the amount of stored data. Depending on your needs, some pricing structures may fit better into your budget.
In the next section, you’ll find each destination’s pricing structure, including a link to detailed price info and whether a free option (trial or plan) is available. Here are a few things to keep in mind:
BigQuery: BigQuery’s pricing isn’t based on a fixed rate, meaning your bill can vary over time. To learn more about how Stitch may impact your BigQuery costs, click here.
Panoply: Panoply charges based on the amount of data stored and offers several plan options for your needs. Their free option includes up to 10 million stored rows per month, which is based on the daily average of the number of rows stored, not loaded.
PostgreSQL: The software for hosting your own PostgreSQL instance is open-source, meaning it’s free. Heroku and Amazon RDS have a variety of plans to choose from.
Redshift: Currently, Redshift bases their pricing on an hourly rate that varies depending on the type and number of nodes in a cluster. The type and number of nodes you choose when creating a cluster is dependent on your needs and data set, but you can scale up or down over time should your requirements change.
Snowflake: Snowflake pricing is based on two factors: the volume of data stored in your Snowflake data warehouse and the number of compute hours used. Snowflake offers several plan options, some of which include on-demand pricing - meaning your bill could vary over time - or pre-purchased, price-secure capacity options.
|Release Status||Open Beta||Released||Released||Released||Released|
|Free Option||No||Yes (plan)||Yes (self-hosted)||Yes (plan & trial)||No|
|Connection Methods||SSL||SSL||SSH, SSL||SSH, SSL||SSL|
|Table Name Length||1,024||127||63||127||255|
|Column Name Length||128||115||59||115||251|
|Max # of Columns||10,000||1,600||250-1,600||1,600||None|
|Max VARCHAR Width||None||65K||None||65K||16MB|
|Nested Structures||Native Support||None||None||None||Native Support|
|Reserved Words||Full List||Full List||Full List||Full List||Full List|
Additional Resources & Setup Tutorials
Ready to pick a destination and get started? Use the links below to check out a more in-depth look at each destination or move on to the setup and connection process.
Overview | Setup
Overview | Setup
Overview | Setup
Overview | Setup
Overview | Setup
Did this article help? If you have questions or feedback, please reach out to us.