This guide describes how to connect Azure SQL Data Warehouse to Stitch using an SSH tunnel.

If your Azure SQL Data Warehouse instance is private, you can create a virtual machine to serve as an SSH bastion. This publicly accessible instance will act as an intermediary, forwarding the traffic from Stitch through an encrypted tunnel to your private Azure SQL Data Warehouse instance.

Microsoft Azure Setup Requirements

To set up Microsoft Azure in Stitch, you need:

  • Some familiarity with Linux and the command line. In this tutorial, you’ll create a Linux user for Stitch to ensure access via SSH. While we’ve provided the commands you’ll need to create the user, you should know how to access a server using the command line and feel comfortable running commands.

  • An up-and-running Azure SQL Data Warehouse instance. Instructions for creating a Azure SQL Data Warehouse data warehouse are outside the scope of this tutorial; our instructions assume that you have an instance up and running. For help getting started with Azure SQL Data Warehouse, refer to Microsoft’s documentation.

  • An existing Azure Storage account. Instructions for creating an Azure Storage account are outside the scope of this tutorial. For help getting started with an Azure Storage account, refer to Microsoft’s documentation.


Step 1: Generate storage credentials

In this step, you’ll generate the storage credentials required to access Azure Storage. Stitch uses Azure Blob Storage to stage data for Polybase before it’s finally loaded into your Azure SQL Data Warehouse.

Step 1.1: Generate a shared access signature URL

  1. In the sidenav, click Storage accounts.
  2. On the page that displays, click the name of the storage account you want to use.
  3. The middle panel contains the Storage account menu. In the Settings section, click Shared access signature.
  4. Leave the boxes for the Allowed services, Allowed resource types, and Allowed permissions checked. Because Microsoft doesn’t currently allow users to individually grant permissions on services and resources, Stitch currently requires the default configuration, which includes all permissions. Refer to Microsoft’s documentation for more info.

    Note: Despite the permissions granted here, Stitch will never delete data.

  5. Next, you’ll define the access period using the Start and End fields in the Start and expiry date/time section:

    Start and end dates for Shared Access Signature generation in Microsoft Azure settings

    To reduce the likelihood of replication interruption, we recommend setting the expiration date well into the future.

  6. In the Allowed protocols section, select HTTPS only.
  7. For the Signing key, select the name of the key you want to use. Note: Stitch doesn’t require the use of a specific key.
  8. When finished, click the Generate SAS and connection string button.

  9. A handful of fields will appear below the button. Locate the Blob service SAS URL field:

    The Blob Service SAS URL field, highlighted

  10. Copy the URL to somewhere handy - you’ll need it to complete the setup.

Step 1.2: Retrieve your storage access key

  1. In the middle panel menu, click Access keys in the Settings section.
  2. On the page that displays, locate the section for the Signing key you selected in Step 7 of the previous section. For example: If you selected key1 as the signing key, locate the section for key 1.
  3. Copy the value in the corresponding Key field to somewhere handy - you’ll need it to complete the setup.

    The Key fields are highlighted for both keys in the image below. Remember that you’ll only need the Key field value for the signing key you selected:

    Highlighted Key fields for the Storage Access Keys in Microsoft Azure


Step 2: Create and configure a virtual machine

Next, you’ll create a virtual machine to serve as the SSH bastion. This publicly accessible instance will act as an intermediary, forwarding the traffic from Stitch through an encrypted tunnel to your private Azure SQL Data Warehouse instance.

Step 2.1: Launch the virtual machine

  1. In the sidenav, click Virtual machines.
  2. On the Virtual machines page, click the + Add button.
  3. Fill in the fields in the tabs. In the list below are the fields that require specific values for Stitch:
    • Basics tab:
      • Public inbound ports: Select Allow selected ports.
      • Select inbound ports: Select SSH (22).
  4. After you’ve finished filling in the required fields, click Review + create.
  5. Click Create to create the virutal machine.

After you click Create, Azure will launch the deployment process for the virtual machine. This may take a few minutes to complete.

Step 2.2: Configure the virtual machine's inbound access

After Azure finishes deploying the virtual machine, you can move onto configuring the access rules for its security group. Inbound access rules will allow traffic from Stitch’s IP addresses to access the virtual machine.

  1. In the sidenav, click Virtual machines.
  2. On the Virtual machines page, click virtual machine you created in the previous step.
  3. The details page for the virtual machine will display. In the middle menu, click Networking.
  4. In the Inbound Port Rules section, click the Add inbound port rule button.
  5. In the Add inbound security rule tab that displays, fill in the fields as follows. If a field isn’t in this list, use the default value:
    • Source: Select IP Addresses.
    • Source IP addresses/CIDR ranges: Paste this comma-delimited list of Stitch’s IP addresses:

        52.23.137.21/32,52.204.223.208/32,52.204.228.32/32,52.204.230.227/32
      
    • Protocol: Select TCP.
    • Action: Select Allow.
    • Name: Enter stitch.
  6. When finished, click Add to create the inbound rule.

Step 3: Enable the virtual machine to access the Azure SQL Data Warehouse instance

Next, you’ll configure the Azure SQL Data Warehouse database to allow traffic forwarded from the virtual machine to access the server. This is accomplished by whitelisting the virtual machine’s public IP address in the server’s firewall settings.

Step 3.1: Retrieve the virtual server's public IP address

You should still be on the Networking page for the virtual machine - if not, navigate there before proceeding.

Locate the Public IP field, highlighted in the image below:

Virtual machine public IP address field

Keep this handy - you’ll need it in the next step and to complete the setup in Stitch.

Step 3.2: Create a server firewall rule for the virtual machine

  1. In the sidenav, click SQL databases.
  2. On the page that displays, click the name of the database you want to connect to Stitch.
  3. The details page for the database will display. Click the link in the Server name field.
  4. The details page for the server will display. Click the Show firewall settings link in the Firewalls and virutal networks field.

  5. Create a rule for the virutal machine’s IP address:
    • Rule name: Enter a name for the rule. For example: Stitch VM
    • Start IP: Paste the virtual machine’s public IP address.
    • End IP: Paste the virtual machine’s public IP address again.

    Below is what the rule should look like when finished:

    IP rule for a virtual machine

  6. Click Save.

Step 4: Create a Stitch Linux user on the virtual machine

Next, you’ll retrieve your Public Key and create a Linux user on the virtual machine for Stitch. This will create an authenticated user for Stitch, ensuring access to the virtual machine.

Step 4.1: Retrieve your Public Key

The Stitch Public Key

The Public Key is used to authorize the Stitch Linux user. If the key isn’t properly installed, Stitch will be unable to access your database.

To retrieve the key:

  1. Sign into your Stitch account.

  2. Click the Destination tab.
  3. Click the Azure SQL Data Warehouse icon.

  4. When the credentials page displays, click the Encryption Type menu and select the SSH Tunnel option.

  5. The Public Key will display, along with the other SSH fields.

Leave this page open for now - you’ll need it to wrap things up at the end.

Step 4.2: Create the Stitch Linux user

In this step, you’ll create the Linux user for Stitch. You’ll need to sign into the virtual machine as the root user before proceeding.

  1. Run the following commands as root on your Linux server to create a user named stitch:

    adduser --disabled-password stitch
    mkdir /home/stitch/.ssh
    
  2. Next, import the Public Key into authorized_keys, replacing [PASTE KEY HERE] with the Stitch Public Key:

    "[PASTE KEY HERE]" >> /home/stitch/.ssh/authorized_keys
    
  3. Alter the permissions on the /home/stitch directory to allow access via SSH:

    chown -R [stitch_username]:stitch /home/stitch
    chmod -R 700 /home/stitch/.ssh
    

Step 5: Create a Azure SQL Data Warehouse Stitch user

In this step, you’ll create a dedicated database user for Stitch. Creating a user for Stitch ensures that Stitch will be visible in any audits or logs, and that you can control the permissions granted to the user.

  1. If you haven’t already, connect to your Microsoft Azure instance using your SQL client.

  2. Navigate to the master database.

  3. In the master database, run the following commands to create a login and a user named stitch. Replace <STRONG_PASSWORD_HERE> with a strong password:

    CREATE LOGIN stitch WITH PASSWORD = '<STRONG_PASSWORD_HERE>';
    CREATE USER stitch FOR LOGIN stitch;
  4. Next, navigate to the database where you want Stitch to load data.

  5. In this database, run the following commands to create a login and a database user named stitch. Replace <DATABASE_NAME> with the name of the database:

    CREATE USER stitch FOR LOGIN stitch;
    GRANT CONTROL ON DATABASE::<DATABASE_NAME> to stitch;

In the table below are the database user privileges Stitch requires to connect to and load data into Microsoft Azure.

Privilege name Reason for requirement
CONTROL

Required to create the necessary database objects to load and store your data.

The CONTROL permission grants ownership-like capabilities to the Stitch user. This ensures that Stitch has all the necessary permissions to load data into the specified database, including ownership abilities on all schemas in the database, and all objects within all schemas in the database.


Step 6: Connect Stitch

To complete the setup, you need to enter your Azure SQL Data Warehouse connection details into the Destination Settings page in Stitch.

Step 6.1: Locate Azure SQL Data Warehouse connection details

In this step, you’ll retrieve the server address for the Azure SQL Data Warehouse you want to connect to Stitch.

This is the value you’ll enter in the Host field in Stitch in the next step.

  1. From your Azure dashboard, click SQL databases.
  2. On the page that displays, click the name of the database you want to connect to Stitch.
  3. The details page for the database will display. On this page, locate the Server name field, which is highlighted below:

    The Server name field for an Azure database, highlighted

Step 6.2: Enter connection details into Stitch

  1. If you aren’t signed into your Stitch account, sign in now.
  2. Click the Destination tab.
  3. Click the Azure SQL Data Warehouse icon.
  4. Fill in the fields as follows:

    • Host (Endpoint): Enter the host address (server name) used by the Azure SQL Data Warehouse instance. This is the Server name value you retrieved in the previous step.

    • Port: Enter the port used by the Azure SQL Data Warehouse instance. The default is 1433.

    • Username: Enter the Stitch Azure SQL Data Warehouse user’s username.

    • Password: Enter the password associated with the Stitch Azure SQL Data Warehouse user.

    • Database: Enter the name of the database in your Azure SQL Data Warehouse data warehouse you want to connect to Stitch.

    • Shared Access Signature URL: Paste the Blob service SAS URL you generated.

    • Access Key for Azure Storage: Paste the Key associated with the signing key you selected.

If you’re using an SSH Tunnel to connect, you’ll also need to fill in the SSH fields.

  1. Click the Encryption Type menu.
  2. Select SSH Tunnel.
  3. The following fields will display - fill them in as follows:

    • Remote Address: Enter the IP address of the virtual machine. This is the public IP address you retrieved in Step 3.1.
    • SSH Port: Enter the port used by the virtual machine. This should be 22.
    • SSH User: Enter the Stitch Linux (SSH) user’s username.

When finished, click Update Azure SQL Data Warehouse Settings.

Stitch will perform a connection test to the Azure SQL Data Warehouse database; if successful, a Success! mesage will display at the top of the screen. Note: This test may take a few minutes to complete.


Troubleshooting connection errors

If you receive an error message, we recommend trying these troubleshooting steps for common destination connection problems before reaching out to support.

We’ve found that the majority of the time, these resources can resolve most issues.


Learning about Stitch and Azure SQL Data Warehouse

Now that your Azure SQL Data Warehouse data warehouse is up and running, we recommend learning about how Stitch will load your data into it.


Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.

Tags: