Microsoft Azure Synapse Analytics Destination Reference

Stitch only supports connecting to Azure Synapse Analytics instances

Stitch’s Microsoft Azure Synapse Analytics destination only works with Microsoft’s Microsoft Azure Synapse Analytics.

Stitch doesn’t currently support using Azure SQL Server or Azure SQL Database as a destination.

Microsoft Azure Synapse Analytics is a fast, fully-managed, petabyte-scale data warehouse. It’s ideal for batch-based data warehouse workloads, and designed with a decoupled storage and compute model that allows it to scale quickly and be maintained in a cost-effective way.

This guide serves as a reference for version 1 of Stitch’s Microsoft Azure Synapse Analytics destination.

Details and features

Stitch features
Destination details
Microsoft Azure Synapse Analytics pricing

Stitch features

High-level details about Stitch’s implementation of Microsoft Azure Synapse Analytics, such as supported connection methods, availability on Stitch plans, etc.

Release status	Released
Stitch plan availability	All Stitch plans
Stitch supported regions	North America (AWS us-east-1) Europe (AWS eu-central-1) Operating regions determine the location of the resources Stitch uses to process your data. Learn more.
Supported versions	Not applicable
Connect API availability	Supported This version of the Microsoft Azure Synapse Analytics destination can be created and managed using Stitch’s Connect API. Learn more.
SSH connections	Supported Stitch supports using SSH tunnels to connect to Microsoft Azure Synapse Analytics destinations.
SSL connections	Supported Stitch will attempt to use SSL to connect by default. No additional configuration is needed.
VPN connections	Unsupported Virtual Private Network (VPN) connections may be implemented as part of a Premium plan. Contact Stitch Sales for more info.
Static IP addresses	Supported This version of the Microsoft Azure Synapse Analytics destination has static IP addresses that can be whitelisted.
Default loading behavior	Upsert Note: Append-Only loading will be used if all conditions for Upsert are not met. Learn more.
Nested structure support	Unsupported Nested data structures will be flattened into relational objects. Learn more.

Destination details

Details about the destination, including object names, table and column limits, reserved keywords, etc.

Note: Exceeding the limits noted below will result in loading errors or rejected data.

Maximum record size	1MB This limit is imposed by PolyBase, which is required to load data into Microsoft Azure Synapse Analytics destinations.
Table name length	112 characters
Column name length	128 characters
Maximum columns per table	1,024
Maximum table size	60 TB compressed on disk
Maximum tables per database	10,000
Case sensitivity	Insensitive
Reserved keywords	Refer to the Reserved keywords documentation.

Microsoft Azure Synapse Analytics pricing

Microsoft Azure Synapse Analytics bases their pricing on your compute and storage usage. Compute usage is charged using an hourly rate, meaning you’ll only be billed for the hours your data warehouse is active. Compute usage is billed in one hour increments.

Storage charges include the size of your primary database and seven days of incremental snapshots. Microsoft Azure rounds charges to the nearest terabyte (TB). For example: If the data warehouse is 1.5 TB and you have 100 GB of snapshots, your bill will be for 2 TB of data.

Refer to Microsoft’s documentation for more info and examples.

Replication process overview

Step 1: Data extraction
Step 2: Stitch’s internal pipeline
Step 3: Load data into Azure Blob Storage
Step 4: Prep data using Polybase
Step 5: Insert data into the data warehouse

Step 1: Data extraction

Stitch requests and extracts data from a data source. Refer to the System overview guide for a more detailed explanation of the Extraction phase.

Step 2: Stitch's internal pipeline

The data extracted from sources is processed by Stitch. Stitch’s internal pipeline includes the Prepare and Load phases of the replication process:

Prepare: During this phase, the extracted data is buffered in Stitch’s durable, highly available internal data pipeline and readied for loading.
Load: During this phase, the prepared data is transformed to be compatible with the destination, and then loaded. Refer to the Transformations section for more info about the transformations Stitch performs for Microsoft Azure Synapse Analytics destinations.

Refer to the System overview guide for a more detailed explanation of these phases.

Step 3: Load data into Azure Blob Storage

Stitch loads the extracted data into Azure Blob Storage.

Blob storage is intended for storing massive amounts of unstructured data. In the next step, Stitch will use Polybase to retrieve the data from Blob Storage and prepare it for loading into Microsoft Azure Synapse Analytics.

Step 4: Prep data using Polybase

Note: Polybase has its own set of limitations that may make it impossible to load certain data. Refer to the Limitations section for more info.

Polybase is a Microsoft offering that integrates Microsoft SQL products with Hadoop. Polybase is needed to query data from Azure Blob Storage.

In this step, Stitch will perform the following:

Create an external data source. This creates an external data source for the Polybase queries Stitch will run.
Create an external file format. This creates an object that defines the external (extracted) data Stitch will load. This is used in the next step to create an external table.
Create an external table. Using the external file format, this will create an external table. The external table is used to stage the data from Azure blob storage and load it into your Microsoft Azure Synapse Analytics data warehouse.

Step 5: Insert data into the data warehouse

Lastly, Stitch will insert the data from the external table in Polybase into your Microsoft Azure Synapse Analytics data warehouse.

Loading behavior

By default, Stitch will use Upsert loading when loading data into Microsoft Azure Synapse Analytics.

If the conditions for Upsert loading aren’t met, data will be loaded using Append-Only loading.

Refer to the Understanding loading behavior guide for more info and examples.

Primary Keys

Stitch requires Primary Keys to de-dupe incrementally replicated data. To ensure Primary Key data is available, Stitch creates an _sdc_primary_keystable in every integration dataset. This table contains a list of all tables in an integration’s dataset and the columns those tables use as Primary Keys.

Refer to the _sdc_primary_keys table documentation for more info.

Note: Removing or altering this table can lead to replication issues.

Incompatible sources

No compatibility issues have been discovered between Microsoft Azure Synapse Analytics and Stitch's integration offerings.

See all destination and integration incompatibilities.

Transformations

System tables and columns
Data typing
JSON structures
Column names
Timezones

System tables and columns

Stitch will create the following tables in each integration’s dataset:

Additionally, Stitch will insert system columns (prepended with _sdc) into each table.

Data typing

Stitch converts data types only where needed to ensure the data is accepted by Microsoft Azure Synapse Analytics. In the table below are the data types Stitch supports for Microsoft Azure Synapse Analytics destinations, and the Stitch types they map to.

Stitch type: The Stitch data type the source type was mapped to. During the Extraction and Preparing phases, Stitch identifies the data type in the source and then maps it to a common Stitch data type.
Destination type: The destination-compatible data type the Stitch type maps to. This is the data type Stitch will use to store data in Microsoft Azure Synapse Analytics.
Notes: Details about the data type and/or its allowed values in the destination, if available. If a range is available, values that exceed the noted range will be rejected by Microsoft Azure Synapse Analytics.

Stitch type	Destination type	Notes
BIGINT	BIGINT	Range : -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807)
BOOLEAN	BOOLEAN	Description: Values are stored as `1` (`true`), `0` (`false`), or `NULL`.
DATE	DATETIMEOFFSET	Description: Stored in UTC as `DATETIMEOFFSET(3)` Range : Dates: 0001-01-01/January 1, 0001 AD, through 9999-12-31/December 31, 9999 AD; Time: 00:00:00 through 23:59:59.997
DOUBLE	FLOAT	Description: Stored as `FLOAT(53)`
FLOAT	FLOAT	Description: Stored as `FLOAT(53)`
INTEGER	BIGINT	Range : -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807)
NUMBER	DECIMAL	Description: Precision must be between 1 and 38; scale must be between 0 and the precision value Range : Precision must be between 1 and 38; scale must be between 0 and the precision value
STRING	NVARCHAR	Description: Columns will be automatically widened if the string length exceeds the current size. Note: String values that exceed the maximum width will be truncated. Range : Maximum width is 4,000 bytes

JSON structures

Microsoft Azure Synapse Analytics destinations don’t have native support for nested data structures. To ensure nested data can be loaded, Stitch will flatten objects and arrays into columns and subtables, respectively. For more info and examples, refer to the Handling nested data structures guide.

Column names

Column names in Microsoft Azure Synapse Analytics:

Must be less than the maximum length of 128 characters. Columns that exceed this limit will be rejected by Microsoft Azure Synapse Analytics.
Must not be prefixed or suffixed with any of Microsoft Azure Synapse Analytics’s or Stitch’s reserved keyword prefixes or suffixes

Stitch will perform the following transformations to ensure column names adhere to the rules imposed by Microsoft Azure Synapse Analytics:

Transformation	Source column	Destination column
Convert spaces to underscores	`customer id`	`customer_id`
Convert special characters to underscores	`customer#id` or `!CuStOMerID`	`customer_id` and `_CuStOMerID`
Convert leading numbers to underscores	`4customerid`	`_customerid`

Timezones

Microsoft Azure Synapse Analytics will store the value in UTC with the specified offset.

Compare destinations

Not sure if Microsoft Azure Synapse Analytics is the destinationfor you? Check out the Choosing a Stitch Destination guide to compare each of Stitch’s destination offerings.

Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.

Related	Troubleshooting
Choosing a Destination Destination & Integration Compatibility Loading Data into Your Destination Switching Destinations	Destination Connection Errors