MongoDB snapshot

A high-level look at Stitch's MongoDB integration, including release status, useful links, and the features supported in Stitch.

STITCH

  • Release Status: Released
  • Supported By: Stitch
  • Stitch Plan: Paid
  • Supported Versions: 2.4 through 3.4

CONNECTION METHODS

  • SSH Connections: Supported
  • SSL Connections: Supported

REPLICATION SETTINGS

  • Anchor Scheduling: Unsupported
  • Table-level Reset: Unsupported
  • Configurable Replication Methods: Unsupported

REPLICATION METHODS

  • Log-based Replication: Unsupported
  • Key-based Replication: Supported
  • Full Table Replication: Unsupported

DATA SELECTION

  • Table Selection: Supported
  • Column Selection: Unsupported
  • View Replication: Unsupported

TRANSPARENCY

  • Extraction Logs: Unsupported
  • Loading Reports: Supported

Connecting MongoDB

MongoDB setup requirements

To set up MongoDB in Stitch, you need:

  • A paid Stitch plan. You can also set up MongoDB during the Free Trial, but once the trial ends, replication will be paused until you select a paid plan.
  • Permissions in MongoDB that allow you to create/manage users. This is required to create the Stitch database user.

  • A MongoDB server that uses Auth mode. Auth mode requires every user who connects to Mongo to have a username and password. These credentials must be validated before the user will be granted access to the database.

  • MongoDB version 2.4 through 3.4. While older versions may be connected to Stitch, we may not be able to provide support for issues that arise from using an unsupported version.

    We recommend keeping your version current as a best practice. If you encounter connection issues or other unexpected behavior, verify that your MongoDB version is one supported by Stitch.

Additionally, note that:

  • If using SSL, your server must require SSL connections. Note: SSL is not required to connect a MongoDB database to Stitch.
  • If connecting via Atlas, Stitch can only connect to instances using a paid Atlas plan with a dedicated cluster. The Free Atlas plan and shared clusters utilize a setup that Stitch doesn’t currently support.

Step 1: Index Replication Key Fields

Before you jump into the actual setup, consider how the documents in your Mongo database are updated.

Our Mongo integration uses Incremental Replication to replicate Mongo data, which means that only new and updated data will be replicated to your data warehouse when a sync runs. Stitch uses a field you designate - called a Replication Key - to identify new and updated data.

There are two requirements for Mongo Replication Keys:

  1. The field must be indexed. Only indexed fields will display in the Replication Key drop-down.
  2. The field must exist in the root of the document.

Additionally, while this is not a strict requirement, Replication Key fields should only contain a single, auto-incrementing data type. If a field contains multiple data types or a data type that doesn’t auto-increment, Stitch may have issues with detecting new/updated data.

For a detailed look at Mongo Replication Keys, check out the Selecting & Changing Mongo Replication Keys guide before continuing.
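
As a rough sketch of the indexing requirement, the snippet below prints the mongo shell statements that would create an ascending index on a root-level field and then list the collection's indexes. The database name shop, collection orders, field updated_at, and the function name are hypothetical placeholders. Note that createIndex is available in the mongo shell from MongoDB 3.0; older shells use ensureIndex instead.

```shell
# Hypothetical placeholders: replace with your own database, collection, and field.
DB_NAME="shop"
COLLECTION="orders"
REPLICATION_KEY="updated_at"

# Print the mongo shell statements: create an ascending index on the
# root-level Replication Key field, then list the collection's indexes.
# Arguments: collection name, field name.
replication_key_index_statements() {
  printf 'db.getCollection("%s").createIndex({ "%s": 1 });\n' "$1" "$2"
  printf 'printjson(db.getCollection("%s").getIndexes());\n' "$1"
}

# Review the statements before running them.
replication_key_index_statements "$COLLECTION" "$REPLICATION_KEY"

# To run them, pass the statements to the mongo shell (add any
# authentication options your server needs), for example:
#   mongo "$DB_NAME" --eval "$(replication_key_index_statements "$COLLECTION" "$REPLICATION_KEY")"
```

Once the index exists, the field should appear in the getIndexes output and, after Stitch next checks the collection, in the Replication Key drop-down.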


Step 2: Whitelist Stitch's IP addresses

For the connection to be successful, you’ll need to configure your firewall to allow access from our IP addresses. Whitelist the following IPs before continuing onto the next step:

  • 52.23.137.21/32

  • 52.204.223.208/32

  • 52.204.228.32/32

  • 52.204.230.227/32
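
One possible way to apply the whitelist on a Linux host is sketched below. It assumes iptables is the firewall in use and that MongoDB listens on the default port 27017; both are assumptions, as is the function name stitch_firewall_rules. The sketch only prints the rules so they can be reviewed first. If Stitch will connect through an SSH tunnel, the SSH port would be the one to open instead.

```shell
# Stitch IP addresses listed above.
STITCH_IPS="52.23.137.21/32 52.204.223.208/32 52.204.228.32/32 52.204.230.227/32"

# Print one iptables ACCEPT rule per Stitch IP for the given TCP port.
stitch_firewall_rules() {
  port=$1
  for ip in $STITCH_IPS; do
    echo "iptables -A INPUT -p tcp -s $ip --dport $port -j ACCEPT"
  done
}

# Assumption: MongoDB listens on the default port 27017.
stitch_firewall_rules 27017
```

Cloud-hosted servers often use security groups or a provider firewall rather than iptables; in that case the same four addresses go into those rules.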


Step 3: Retrieve your Stitch public key

The Stitch Public Key

The Public Key is used to authorize the Stitch Linux user. If the key isn’t properly installed, Stitch will be unable to access your database.

To retrieve the key:

  1. Sign into your Stitch account.

  2. On the Stitch Dashboard page, click the Add Integration button.

  3. Click the MongoDB icon.

  4. When the credentials page displays, click the Encryption Type menu and select the SSH Tunnel option.

  5. The Public Key will display, along with the other SSH fields.

Leave this page open for now - you’ll need it to wrap things up at the end.


Step 4: Create a Stitch Linux user

  1. Run the following commands as root on your Linux server to create a user named stitch:

    adduser --disabled-password stitch
    mkdir /home/stitch/.ssh
    
  2. Next, import the Public Key into authorized_keys, replacing [PASTE KEY HERE] with the Stitch Public Key:

    echo "[PASTE KEY HERE]" >> /home/stitch/.ssh/authorized_keys
    
  3. Alter the ownership and permissions on the /home/stitch directory to allow access via SSH:

    chown -R stitch:stitch /home/stitch
    chmod -R 700 /home/stitch/.ssh
    

Step 5: Create a Stitch database user

To successfully connect and replicate your Mongo data, Stitch requires the ability to:

  • Run the listDatabases command. This permission is required so Stitch can detect the databases available for replication.
  • Run the listIndexes command. Because Stitch will only display indexed fields as Replication Key options, this permission is required to identify fields that can be used as Replication Keys.
  • Run count and query operations on all the databases you want to replicate data from. These permissions are required to replicate your data.
  • Run the dbVersion command. While this isn’t mandatory, it’s beneficial for Stitch to have access to the information this command yields to troubleshoot any connection or replication issues that may arise.

You can assign a role to the Stitch user if you like, as long as the role has the necessary permissions to perform the actions listed above.

When connecting to multiple databases, you can add the user by logging into Mongo as an admin user and running the following command. This example uses createUser, but older versions may use addUser; refer to the MongoDB documentation for addUser if that applies to your version.

Replace [authentication_database] with the name of the database where the user is created and authenticated:

use [authentication_database]
db.createUser({
  user: "[stitch_username]",
  pwd: "[secure password here]",
  roles: ["roles here", "if you want them"]
})

Note: For Atlas-based instances, the authentication_database will be admin.
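
As an illustration of the role assignment, the sketch below prints a createUser statement that grants MongoDB's built-in readAnyDatabase role. According to the MongoDB documentation, that role provides read-only access (including listing indexes) on all databases except local and config, and also permits listDatabases. It is broader than strictly necessary, so treat it as one option rather than a requirement. The username stitch and the function name are hypothetical, and the password placeholder is left for you to fill in.

```shell
# Print a createUser statement granting the built-in readAnyDatabase role.
# Arguments: authentication database, username (sample values are hypothetical).
stitch_create_user_statement() {
  printf 'db.getSiblingDB("%s").createUser({\n' "$1"
  printf '  user: "%s",\n' "$2"
  printf '  pwd: "[secure password here]",\n'
  printf '  roles: [ { role: "readAnyDatabase", db: "admin" } ]\n'
  printf '});\n'
}

# Review the statement, then paste it into a mongo shell session
# where you are logged in as an admin user.
stitch_create_user_statement "admin" "stitch"
```

To limit access to specific databases instead, the roles array could list the built-in read role once per database, plus a role that allows listDatabases.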


Step 6: Connect Stitch

In this step, you’ll complete the setup by entering the database’s connection details and defining replication settings in Stitch.

Step 6.1: Define the database connection details

  1. Sign into your Stitch account, if you haven’t already.
  2. On the Stitch Dashboard page, click the Add Integration button.
  3. Click the MongoDB icon.
  4. Fill in the fields as follows:

    • Integration Name: Enter a name for the integration. This is the name that will display on the Stitch Dashboard for the integration; it’ll also be used to create the schema in your data warehouse.

      For example, the name “Stitch MongoDB” would create a schema called stitch_mongodb in the data warehouse. Note: The schema name cannot be changed after the integration is saved.

    • Host (Endpoint): Enter the host address (endpoint) used by the MongoDB instance.

      In general, this will be 127.0.0.1 (localhost), but could also be some other network address (for example, 192.68.0.1) or your server’s public IP address. Note: This must be the actual address. Entering the word localhost into this field will cause connection issues.

    • Port: Enter the port used by the MongoDB instance. The default is 27017.

    • Username: Enter the Stitch MongoDB database user’s username.

    • Password: Enter the password for the Stitch database user.

    • Database: Enter the name of the MongoDB database where the Stitch user is to be authenticated. Stitch will ‘find’ all the databases you gave the Stitch user access to - this is needed only to complete the connection.

      Note: If you’re connecting an Atlas-based MongoDB instance, this must be the admin database. See Step 5: Create a Stitch database user for more info on this requirement.
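
Before saving, it can help to confirm that the same details work from a machine you control. The sketch below rejects the literal hostname localhost, in line with the note on the Host field, and prints a command for the legacy mongo shell using its --host, --port, -u, --authenticationDatabase, and -p options. The function name and sample values are hypothetical. A successful login from your own machine confirms the credentials only, not that Stitch's servers can reach the host.

```shell
# Print a mongo shell command for checking the connection details.
# Arguments: host, port, username, authentication database.
stitch_connection_check() {
  case "$1" in
    ""|localhost)
      echo "error: enter the actual address, not '$1'" >&2
      return 1
      ;;
  esac
  # -p with no value makes the mongo shell prompt for the password.
  echo "mongo --host $1 --port $2 -u $3 --authenticationDatabase $4 -p"
}

# Hypothetical sample values.
stitch_connection_check "127.0.0.1" 27017 "stitch" "admin"
```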

Step 6.2: Define the SSH connection details

If you’re using an SSH tunnel to connect your MongoDB database to Stitch, you’ll also need to complete the following:

  1. Click the Encryption Type menu.
  2. Select SSH Tunnel to display the SSH fields.

  3. Fill in the fields as follows:

    • Remote Address: Enter the IP address or hostname of the server Stitch will SSH into.

    • SSH Port: Enter the SSH port on your server. (22 by default)

    • SSH User: Enter the Stitch Linux (SSH) user’s username.

Step 6.3: Define the SSL connection details

Click the Connect using SSL checkbox if you’re using an SSL connection. Note: The database must support and allow SSL connections for this setting to work correctly.


Step 7: Create a replication schedule

In the Replication Frequency section, you’ll create the integration’s replication schedule. An integration’s replication schedule determines how often Stitch runs a replication job, and the time that job begins.

Stitch offers two methods of creating a replication schedule:

  • Replication Frequency: This method requires selecting the interval at which replication runs for the integration. Start times of replication jobs are based on the start time and duration of the previous job. Refer to the Replication Frequency documentation for more information and examples.
  • Anchor scheduling: Based on the Replication Frequency, or interval, you select, this method “anchors” the start times of this integration’s replication jobs to a time you select to create a predictable schedule. Anchor scheduling is a combination of the Anchor Time and Replication Frequency settings, which must both be defined to use this method. Additionally, note that:

    • A Replication Frequency of at least one hour is required to use anchor scheduling.
    • An initial replication job may not begin immediately after saving the integration, depending on the selected Replication Frequency and Anchor Time. Refer to the Anchor Scheduling documentation for more information.

    • You’ll need to contact support to request using an Anchor Time with this integration.
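
To make the idea of an anchored schedule concrete, here is a minimal sketch that lists the start times in one day. It assumes jobs start at the Anchor Time plus whole multiples of the Replication Frequency, and that the frequency divides 24 hours evenly; Stitch's actual scheduling may differ, so refer to the Anchor Scheduling documentation for the authoritative behavior. The function name and sample values are hypothetical.

```shell
# List the anchored start times (HH:MM) within a single day.
# Arguments: anchor time and frequency, both in minutes.
anchored_start_times() {
  anchor=$1
  freq=$2
  [ "$freq" -gt 0 ] || return 1
  # Earliest time of day that lines up with the anchor.
  t=$((anchor % freq))
  while [ "$t" -lt 1440 ]; do
    printf '%02d:%02d\n' $((t / 60)) $((t % 60))
    t=$((t + freq))
  done
}

# Hypothetical example: Anchor Time 02:00, Replication Frequency 6 hours.
anchored_start_times $((2 * 60)) $((6 * 60))
```

With these sample values the sketch prints 02:00, 08:00, 14:00, and 20:00, which is the kind of predictable schedule anchoring is meant to provide.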

To keep your row usage low, consider setting the integration to replicate less frequently. See the Understanding and Reducing Your Row Usage guide for tips on reducing your usage.


Step 8: Select data to replicate

The last step is to select the collections you want to replicate.

When you track a collection, you’ll also need to define its Replication Key. Note: Any table set to replicate will use Key-based Incremental Replication. Stitch doesn’t currently support other Replication Methods for MongoDB integrations.

To select collections:

  1. In the Integration Details page, click the Tables to Replicate tab.
  2. Locate a collection you want to replicate.
  3. Click the checkbox next to the object’s name. A green checkmark means the object is set to replicate.
  4. If there are child objects, they’ll automatically display and you’ll be prompted to select some.
  5. After you set a collection to replicate, the Settings page will display. Note: When you track a table, all fields are tracked by default. Tracking individual fields isn’t currently supported.

  6. In the Settings page, define the collection’s Replication Key.

  7. Repeat this process for every collection you want to replicate.

Initial and historical replication jobs

After you finish setting up MongoDB, its Sync Status may show as Pending on either the Stitch Dashboard or in the Integration Details page.

For a new integration, a Pending status indicates that Stitch is in the process of scheduling the initial replication job for the integration. This may take some time to complete.

Free historical data loads

The first seven days of replication, beginning when data is first replicated, are free. Rows replicated from the new integration during this time won’t count towards your quota. Stitch offers this as a way of testing new integrations, measuring usage, and ensuring historical data volumes don’t quickly consume your quota.



Troubleshooting

SSL Connection Errors

Prematurely reached end of file/stream

Applicable only to MongoDB integrations, this error usually means that SSL has been incorrectly configured.

Connecting a database integration to Stitch via SSL has two parts: configuration on the database’s server and in the Stitch app. For the connection to be successful, the settings in both Stitch and on the database server must align.

For example, if a MongoDB server doesn’t support SSL connections but the SSL option is checked in Stitch, the connection will fail with this error.

First, verify whether the MongoDB server is configured to support SSL connections. Then:

  • If SSL connections aren’t supported, make sure the Connect using SSL box in Stitch is unchecked and try saving the integration again.

  • If SSL connections are required, make sure the Connect using SSL box in Stitch is checked and try saving the integration again.
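
The alignment rule above can be sketched as a small lookup. It assumes the server's SSL behavior is set through MongoDB's net.ssl.mode configuration option (values disabled, allowSSL, preferSSL, and requireSSL), which you would read from your mongod configuration file. The function name is hypothetical, and an empty value is treated as SSL being disabled.

```shell
# Map a mongod net.ssl.mode value to the matching Stitch checkbox state.
stitch_ssl_checkbox() {
  case "$1" in
    requireSSL) echo "checked" ;;
    disabled|"") echo "unchecked" ;;
    allowSSL|preferSSL)
      # Per the setup requirements, the server must require SSL when SSL is used.
      echo "review: switch the server to requireSSL or leave the box unchecked" ;;
    *) echo "unknown mode: $1" ;;
  esac
}

stitch_ssl_checkbox "requireSSL"
stitch_ssl_checkbox "disabled"
```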

Fields Missing from Replication Key Menu

If fields you expect to see are missing from a collection’s Replication Key menu, it may be that the fields aren’t indexed. Refer to the Mongo Replication Keys guide for more info.

