Snowflake is a SQL data warehouse built from the ground up for the cloud, designed with a patented new architecture to handle today’s and tomorrow’s data and analytics.

A fully-managed SaaS data warehouse solution, Snowflake runs on Amazon Web Services cloud infrastructure: AWS EC2 virtual compute instances are used for compute needs, while S3 is utilized for persistent data storage.

This guide serves as a reference for version 1 of Stitch’s Snowflake destination.


Details and features

Stitch features

High-level details about Stitch’s implementation of Snowflake, such as supported connection methods, availability on Stitch plans, etc.

Release status

Released

Stitch plan availability

All Stitch plans

Supported versions

Not applicable

Connect API availability Supported

This version of the Snowflake destination can be created and managed using Stitch’s Connect API. Learn more.

SSH connections Unsupported

Stitch does not support using SSH tunnels to connect to Snowflake destinations.

SSL connections Supported

Stitch will attempt to use SSL to connect by default. No additional configuration is needed.

VPN connections Unsupported

Virtual Private Network (VPN) connections may be implemented as part of an Enterprise plan. Contact Stitch Sales for more info.

Default loading behavior

Upsert
Note: Append-Only loading will be used if all conditions for Upsert are not met. Learn more.

Nested structure support

Supported
Nested data structures will be loaded intact into VARIANT columns. Learn more.

Destination details

Details about the destination, including object names, table and column limits, reserved keywords, etc.

Note: Exceeding the limits noted below will result in loading errors or rejected data.

Maximum record size

4MB

Table name length

255 characters

Column name length

251 characters

Maximum table size

None

Maximum tables per database

100,000

Case sensitivity

Insensitive

Reserved keywords

Refer to the Reserved keywords documentation.

Snowflake pricing

Snowflake pricing is based on two factors: The volume or data stored in your Snowflake destination and the amount of compute usage (the time the server runs) in seconds.

Snowflake offers two types of plans, each with varying levels of access and features. There are On Demand plans which are commitment-free and usage-based. The alternative is a Capacity option, which guarantees secure price discounts. Learn more about Snowflake plans and pricing here.

Snowflake warehouse sizes

Snowflake data warehouses can be different sizes - X-Small, Large, and 3X-Large, for example - which defines how many servers will comprise each cluster in a warehouse.

While the size of a warehouse can impact the time required to execute queries, bigger doesn’t always mean better. Warehouse size is directly tied to the number of credits used, which will directly impact your Snowflake costs. Learn more about Snowflake warehouse sizes here.

To help you select the warehouse size that fits your needs and budget, check out Snowflake’s Warehouse Considerations guide before getting started.

Automated warehouse management

To reduce usage, you can elect to automate the management of your Snowflake warehouse. This means that you can elect to suspend the warehouse when there’s no activity after a specified period of time, and then automatically resume when there is. Note that these settings apply to the entire warehouse and not individual clusters.

Enabling these settings depends on your workload and availability needs. Learn more about the Auto Suspend and Auto Resume features here.

Note: Stitch will only ever impact your Snowflake usage when loading data.


Replication

Replication process overview

A Stitch replication job consists of three stages:

Step 1: Data extraction

Stitch requests and extracts data from a data source. Refer to the System overview guide for a more detailed explanation of the Extraction phase.

Step 2: Preparation

During this phase, the extracted data is buffered in Stitch’s durable, highly available internal data pipeline and readied for loading. Refer to the System overview guide for a more detailed explanation of the Preparation phase.

Step 3: Loading

Stitch loads the data into Snowflake.

Loading behavior

By default, Stitch will use Upsert loading when loading data into Snowflake.

If the conditions for Upsert loading aren’t met, data will be loaded using Append-Only loading.

Refer to the Understanding loading behavior guide for more info and examples.

Primary Keys

Stitch requires Primary Keys to de-dupe incrementally replicated data. When tables are created in the destination, Stitch will apply Primary Key constraints to columns used as Primary Keys. Primary Key constraints require that column values be unique and not null.

Incompatible sources

No compatibility issues have been discovered between Snowflake and Stitch's integration offerings.

See all destination and integration incompatibilities.


Transformations

System tables and columns

Stitch will create the following tables in each integration’s dataset:

Additionally, Stitch will insert system columns (prepended with _sdc) into each table.

Data typing

Stitch converts data types only where needed to ensure the data is accepted by Snowflake. In the table below are the data types Stitch supports for Snowflake destinations, and the Stitch types they map to.

  • Stitch type: The Stitch data type the source type was mapped to. During the Extraction and Preparing phases, Stitch identifies the data type in the source and then maps it to a common Stitch data type.
  • Destination type: The destination-compatible data type the Stitch type maps to. This is the data type Stitch will use to store data in Snowflake.
  • Notes: Details about the data type and/or its allowed values in the destination, if available. If a range is available, values that exceed the noted range will be rejected by Snowflake.
Stitch type Destination type Notes
BIGINT NUMBER
  • Description: Stored as NUMBER(38,0)

  • Range : Limited to 38 digits of scale

BOOLEAN BOOLEAN
DATE TIMESTAMP_TZ
  • Description: Stored in UTC as TIMESTAMP_TZ(9)

DOUBLE FLOAT
FLOAT FLOAT
INTEGER NUMBER
  • Description: Stored as NUMBER(38,0)

  • Range : Limited to 38 digits of scale

NUMBER DECIMAL
STRING VARCHAR
  • Description: Stored as VARCHAR(16777216), which is the maximum length supported by Snowflake. No widening will occur if the string length changes.

JSON structures

When Stitch replicates source data containing objects or arrays, Stitch will load the data intact into a VARIANT column. This is a Snowflake data type that can contain semi-structured data like JSON arrays and objects.

You can then use Snowflake’s functions for semi-structured data to parse the data. Refer to Snowflake’s documentation for more info.

Column names

Column names in Snowflake:

Stitch will perform the following transformations to ensure column names adhere to the rules imposed by Snowflake:

Transformation Source column Destination column
Convert lowercase and mixed case to uppercase CuStOmErId or customerid CUSTOMERID
Remove spaces CUSTOMER ID CUSTOMERID
Remove special characters CUSTOMER#ID or !CUSTOMERID CUSTOMERID and CUSTOMERID
Remove non-letter leading characters 4CUSTOMERID or _CUSTOMERID CUSTOMERID

Timezones

Snowflake will store the value as TIMESTAMP_TZ(9) and express it as UTC.


Compare destinations

Not sure if Snowflake is the data warehouse for you? Check out the Choosing a Stitch Destination guide to compare each of Stitch’s destination offerings.


Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.