MongoDB feature snapshot

A high-level look at Stitch's MongoDB (v1.0) integration, including release status, useful links, and the features supported in Stitch.

STITCH
Release Status: Beta
Supported By: Stitch
Stitch Plan: Standard
Supported Versions: 2.6 through 4.0

CONNECTION METHODS
SSH Connections: Supported
SSL Connections: Supported

REPLICATION SETTINGS
Anchor Scheduling: Supported
Advanced Scheduling: Supported
Table-level Reset: Supported
Configurable Replication Methods: Supported

REPLICATION METHODS
Log-based Replication: Supported
Key-based Replication: Supported
Full Table Replication: Supported

DATA SELECTION
Table Selection: Supported
Column Selection: Supported
View Replication: Unsupported

TRANSPARENCY
Extraction Logs: Supported
Loading Reports: Supported

Connecting MongoDB

MongoDB setup requirements

To set up MongoDB in Stitch, you need:

  • A Standard or higher Stitch plan. Free Trial accounts can also set up MongoDB, but once the trial ends, replication is paused until a Standard or higher plan is selected.
  • Privileges in MongoDB that allow you to create/manage users. This is required to create the Stitch database user.

  • If using Log-based Incremental Replication, the userAdmin or userAdminAnyDatabase role. This is required to configure the database server for OpLog.

  • A MongoDB server that uses Auth mode. Auth mode requires every user who connects to Mongo to have a username and password. These credentials must be validated before the user will be granted access to the database.

  • A MongoDB database using a version between 2.6 and 4.0. While older versions may be connected to Stitch, we may not be able to provide support for issues that arise due to unsupported versions.

    We recommend always keeping your version current as a best practice. If you encounter connection issues or other unexpected behavior, verify that your MongoDB version is one supported by Stitch.

  • If using SSL, your server must require SSL connections. Note: SSL isn’t required to connect a MongoDB database to Stitch.


Step 1: Configure database connection settings

In this step, you’ll configure the database server to allow traffic from Stitch. There are two ways to connect your database:

  • A direct connection will work if your database is publicly accessible.
  • An SSH tunnel is required if your database isn’t publicly accessible. This method uses a publicly accessible instance, or an SSH server, to act as an intermediary between Stitch and your database. The SSH server will forward traffic from Stitch through an encrypted tunnel to the private database.

Follow the instructions for the option you’re using.

Direct connection

For the connection to be successful, you’ll need to configure your firewall to allow access from our IP addresses. Whitelist the following IPs before continuing on to the next step:

  • 52.23.137.21/32

  • 52.204.223.208/32

  • 52.204.228.32/32

  • 52.204.230.227/32

SSH tunnel

  1. Follow the steps in the Setting up an SSH Tunnel for a database connection guide to set up an SSH tunnel for MongoDB.
  2. Complete the remaining steps in this guide after the SSH setup is complete.

Step 2: Create a Stitch database user

Step 2.1: Connect to your database

  1. Connect to your MongoDB server.
  2. Navigate to the authentication database. In this example, we’re using admin:

    mongo "mongodb://<username>@<database-host>:<port>/?authSource=admin"
    

    Replace <username>, <database-host>, and <port> with your MongoDB username, database host address, and the port used by the database, respectively.

    Note: If you’re connecting an Atlas-based instance, the authentication database will always be admin.
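If your username contains characters such as @ or :, it needs to be percent-encoded before it goes into the connection string. The following is a minimal sketch of assembling the string shown above; the helper name buildMongoUri and the sample host are hypothetical and are not part of Stitch or MongoDB.

```javascript
// Hypothetical helper: assembles the connection string used in the example above.
function buildMongoUri({ username, host, port = 27017, authSource = "admin" }) {
  // Percent-encode values so characters such as "@" or ":" stay valid in a URI.
  const user = encodeURIComponent(username);
  const source = encodeURIComponent(authSource);
  return `mongodb://${user}@${host}:${port}/?authSource=${source}`;
}

// Sample values are placeholders, not real credentials or hosts.
const uri = buildMongoUri({ username: "admin@corp", host: "db.example.com" });
console.log(uri); // mongodb://admin%40corp@db.example.com:27017/?authSource=admin
```

The resulting string can be passed to the mongo shell in place of the hand-written one.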

Step 2.2: Create the Stitch user

Next, you’ll create the Stitch user, set a password, and assign roles. This guide uses the built-in readAnyDatabase role, but you can use or create another role as long as it assigns the same privileges.

Use the command that matches the version your MongoDB database is using to create the Stitch database user.

For MongoDB versions 2.4 through 2.6, create the user with the addUser command. Replace <password> with a password:

use admin
db.addUser(
  {
    user: "stitch",
    pwd: "<password>",
    roles: ["readAnyDatabase"]
  }
)

For MongoDB versions 3.0 through 3.2, create the user with the createUser command. Replace <password> with a password:

use admin
db.createUser(
  {
    user: "stitch",
    pwd: "<password>",
    roles: ["readAnyDatabase"]
  }
)

For versions 3.4 and above, the readAnyDatabase role doesn’t include the local database. Create the user, granting the additional read role on the local database:

use admin
db.createUser(
  {
    user: "stitch",
    pwd: "<password>",
    roles: ["readAnyDatabase", {role: "read", db: "local"} ]
  }
)
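The version rules above can be summarized in a short sketch. It only restates this guide's guidance: addUser before 3.0, createUser from 3.0 onward, and the extra read role on local from 3.4 onward. The function name stitchUserSpec is hypothetical.

```javascript
// Hypothetical helper: returns the shell method and roles this guide lists
// for a given MongoDB version string such as "3.4" or "4.0.2".
function stitchUserSpec(version) {
  const [major, minor] = version.split(".").map(Number);
  const atLeast = (maj, min) => major > maj || (major === maj && minor >= min);

  // db.addUser for versions before 3.0, db.createUser from 3.0 onward.
  const method = atLeast(3, 0) ? "createUser" : "addUser";

  const roles = ["readAnyDatabase"];
  // From 3.4 onward, readAnyDatabase no longer covers the local database.
  if (atLeast(3, 4)) roles.push({ role: "read", db: "local" });

  return { method, roles };
}

console.log(JSON.stringify(stitchUserSpec("2.6")));
console.log(JSON.stringify(stitchUserSpec("4.0")));
```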

See the table below for an explanation of why Stitch requires these permissions.

The table below lists the database user privileges Stitch requires to connect to and replicate data from a MongoDB database.

Privilege name: readAnyDatabase
Reason for requirement: Required to read data from databases in the cluster.

Privilege name: read
Reason for requirement: Required to read from the local database. Note: You only need to explicitly grant this role if you’re using MongoDB version 3.4 or greater.

Step 3: Configure Log-based Incremental Replication

While Log-based Incremental Replication is the most accurate and efficient method of replication, using this replication method may, at times, require manual intervention or impact the source database’s performance. Refer to the Log-based Incremental Replication documentation for more info.

You can also use one of Stitch’s other Replication Methods, which don’t require any database configuration. Replication Methods can be changed at any time.

Step 3.1: Create a replica set

In this step, you’ll edit the /etc/mongod.conf file to add a replica set. A replica set is a group of mongod processes that maintain the same dataset.

  1. Start the MongoDB instance:

    mongod --port 27017
    
  2. Connect to the Mongo shell as a root user:

    mongo --port 27017
    
  3. Navigate to the /etc/mongod.conf file.

  4. In /etc/mongod.conf, uncomment replication and specify a name for the replica set (replSetName). In this example, we’re using rs0 as the replica set name:

    replication:
       replSetName: "rs0"
    

    Note: As /etc/mongod.conf is a protected file, you may need to use sudo to edit it.

  5. Save the changes.

Step 3.2: Initiate the replica set

Next, you’ll restart the instance and initiate the replica set.

  1. Restart mongod with the configuration file:

    sudo mongod --auth --config /etc/mongod.conf
    
  2. Connect to the Mongo shell as a root user, replacing <root_username> and <password> with the root user’s username and password:

    mongo --port 27017 -u <root_username> -p <password> --authenticationDatabase admin
    
  3. Initiate the replica set, replacing <host_address> with the IP address or endpoint used by the mongod instance:

    rs.initiate({_id: "rs0", members: [{_id: 0, host: "<host_address>:27017"}]})
    

If successful, you’ll receive a response similar to the following:

{ "ok" : 1 }
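The document passed to rs.initiate() has a small, regular shape. The _id must match the replSetName set in /etc/mongod.conf, and each member needs a unique integer _id and a host:port address. As a rough sketch, it could be generated like this; the helper name replicaSetConfig and the sample host are hypothetical.

```javascript
// Hypothetical helper: builds the configuration document passed to rs.initiate().
function replicaSetConfig(name, hosts) {
  return {
    // Must match replSetName in /etc/mongod.conf.
    _id: name,
    // Each member gets a unique integer _id and a "host:port" address.
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

// The host below is a placeholder for your mongod instance's address.
const config = replicaSetConfig("rs0", ["mongo.example.com:27017"]);
console.log(JSON.stringify(config));
// {"_id":"rs0","members":[{"_id":0,"host":"mongo.example.com:27017"}]}
```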

Step 3.3: Verify OpLog setup and access

Lastly, you’ll verify that the Stitch user can read from the OpLog.

  1. Disconnect from the Mongo shell.

  2. Reconnect as the Stitch database user you created in Step 2. Replace <stitch_username> and <password> with the Stitch user’s username and password, respectively:

    mongo --port 27017 -u <stitch_username> -p <password> --authenticationDatabase admin
    
  3. Switch to the local database:

    use local
    
  4. View OpLog rows:

    db.oplog.rs.find()
    

If successful, records from the OpLog similar to the following will be returned:

{ "ts" : Timestamp(1524038245, 63), "t" : NumberLong(1), "h" : NumberLong("-596019791399272412"), "v" : 2, "op" : "i", "ns" : "stitchTest.customers", "ui" : UUID("0e623d9c-722c-41d5-a5e6-83947cc2466e"), "wall" : ISODate("2018-04-18T07:57:25.065Z"), "o" : { "_id" : 100, "name" : "Finn" } }
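To read a record like the one above: op is the operation type (i for insert, u for update, d for delete, c for command, n for no-op), ns is the namespace in <database>.<collection> form, and o holds the document or change. The sketch below decodes those fields from a plain object; describeOplogEntry is a hypothetical name, and shell-only types such as Timestamp are left out so the example runs on its own.

```javascript
// Operation codes used in the "op" field of OpLog entries.
const OPLOG_OPERATIONS = { i: "insert", u: "update", d: "delete", c: "command", n: "no-op" };

// Hypothetical helper: summarizes the operation and namespace of an OpLog entry.
function describeOplogEntry(entry) {
  const operation = OPLOG_OPERATIONS[entry.op] || "unknown";
  const ns = entry.ns || "";
  // Split on the first dot only, since collection names may contain dots.
  const dot = ns.indexOf(".");
  return {
    operation,
    database: dot === -1 ? ns : ns.slice(0, dot),
    collection: dot === -1 ? "" : ns.slice(dot + 1),
  };
}

const entry = { op: "i", ns: "stitchTest.customers", o: { _id: 100, name: "Finn" } };
console.log(JSON.stringify(describeOplogEntry(entry)));
// {"operation":"insert","database":"stitchTest","collection":"customers"}
```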

Step 4: Connect Stitch

In this step, you’ll complete the setup by entering the database’s connection details and defining replication settings in Stitch.

Step 4.1: Define the database connection details

  1. If you aren’t signed into your Stitch account, sign in now.
  2. On the Stitch Dashboard page, click the Add Integration button.

  3. Locate and click the MongoDB icon.
  4. Fill in the fields as follows:

    • Integration Name: Enter a name for the integration. This is the name that will display on the Stitch Dashboard for the integration; it’ll also be used to create the schema in your destination.

      For example, the name “Stitch MongoDB” would create a schema called stitch_mongodb in the destination. Note: The schema name cannot be changed after the integration is saved.

    • Host (Endpoint): Enter the host address (endpoint) used by the MongoDB instance. For example, this could be a network address such as 192.68.0.1, or a server endpoint like dbname.hosting-provider.com.

    • Port: Enter the port used by the instance. The default is 27017.

    • Username: Enter the Stitch MongoDB database user’s username.

    • Password: Enter the password for the Stitch MongoDB database user.

    • Authentication Database: Enter the name of the Stitch user’s authentication database. This is the name of the database where the Stitch user was initially created.

      Note: If you’re connecting an Atlas-based MongoDB instance, this must be the admin database. See Step 2: Create a Stitch database user for more info on this requirement.

    • Replica Set: Optional. The name of the replica set to be used for Log-based Incremental Replication.

    • Include MongoDB database names in destination tables: Checking this setting will include database names from the source database in the destination table name - for example: <source_database_name>__<collection_name>.

      Stitch loads all selected collections to a single schema, preserving only the collection name. If two collections canonicalize to the same name, even if they’re in different source databases, name collision errors can arise. Checking this setting can prevent these issues.

      Note: This setting cannot be changed after the integration is saved. Additionally, this setting may create table names that exceed your destination’s limits. For more info, refer to the Database Integration Table Name Collisions guide.
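To illustrate the naming behavior described in the fields above, here is a rough sketch of how names could canonicalize and collide. The canonicalization rule used here (lowercase, with runs of non-alphanumeric characters replaced by an underscore) is an assumption chosen to reproduce the “Stitch MongoDB” to stitch_mongodb example; it is not Stitch’s documented algorithm, and all function names are hypothetical.

```javascript
// Assumed rule (not Stitch's documented algorithm): lowercase the name and
// replace each run of non-alphanumeric characters with a single underscore.
function canonicalize(name) {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, "_");
}

// Destination table name for a collection, with or without the database prefix.
function tableName(database, collection, includeDatabase) {
  const base = canonicalize(collection);
  return includeDatabase ? `${canonicalize(database)}__${base}` : base;
}

// Returns the destination names that more than one source collection maps to.
function findCollisions(collections, includeDatabase) {
  const counts = new Map();
  for (const { database, collection } of collections) {
    const name = tableName(database, collection, includeDatabase);
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  return [...counts].filter(([, count]) => count > 1).map(([name]) => name);
}

// Sample collections are placeholders.
const selected = [
  { database: "sales", collection: "Customers" },
  { database: "support", collection: "customers" },
];
console.log(canonicalize("Stitch MongoDB"));                  // stitch_mongodb
console.log(JSON.stringify(findCollisions(selected, false))); // ["customers"]
console.log(JSON.stringify(findCollisions(selected, true)));  // []
```

With the database prefix enabled, the two collections land in sales__customers and support__customers, so the collision disappears at the cost of longer table names.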

Step 4.2: Define the SSH connection details

If you’re using an SSH tunnel to connect your MongoDB database to Stitch, you’ll also need to define the SSH settings. Refer to the Setting up an SSH Tunnel for a database connection guide for assistance with completing these fields.

  1. Click the SSH Tunnel checkbox.

  2. Fill in the fields as follows:

    • SSH Host: Enter the public IP address or hostname of the server Stitch will SSH into.

    • SSH Port: Enter the SSH port on your server. The default is 22.

    • SSH User: Enter the Stitch Linux (SSH) user’s username.

Step 4.3: Define the SSL connection details

Click the Connect using SSL checkbox if you’re using an SSL connection. Note: The database must support and allow SSL connections for this setting to work correctly.

Step 4.4: Define the Log-based Replication setting

In the Log-based Replication section, you can set Log-based Incremental Replication as the integration’s default Replication Method.

When enabled, tables that are set to replicate will use Log-based Incremental Replication by default. If you don’t want a table to use Log-based Incremental Replication, you can change it in the Table Settings page for that table.

If this setting isn’t enabled, you’ll have to select a Replication Method for each table you set to replicate.

Step 4.5: Create a replication schedule

In the Replication Frequency section, you’ll create the integration’s replication schedule. An integration’s replication schedule determines how often Stitch runs a replication job, and the time that job begins.

MongoDB integrations support the following replication scheduling methods:

  • Anchor Scheduling
  • Advanced Scheduling

To keep your row usage low, consider setting the integration to replicate less frequently. See the Understanding and Reducing Your Row Usage guide for tips on reducing your usage.

Step 4.6: Save the integration

When finished, click Check and Save.

Stitch will perform a connection test to the MongoDB database; if successful, a Success! message will display at the top of the screen. Note: This test may take a few minutes to complete.

Step 5: Select data to replicate

The last step is to select the collections and fields you want to replicate.

When you track a collection, Stitch will use the default Replication Method selected in the Integration Settings page. You can choose a different Replication Method for individual collections during this process.

To select collections and fields:

  1. In the Integration Details page, click the Collections to Replicate tab.
  2. Locate a collection you want to replicate.
  3. Click the checkbox next to the object’s name. A green checkmark means the object is set to replicate.
  4. If there are child objects, they’ll automatically display and you’ll be prompted to select some. Note: When you track a table, by default all fields will also be tracked.
  5. After you set a collection to replicate, a page with the collection’s fields will display. De-select fields if needed.

  6. In the Settings page, define the collection’s Replication Method and, if using Key-based Incremental Replication, its Replication Key.

  7. Optional: Select or exclude fields by entering a projection query in the Fields to Replicate section. Refer to the Selecting MongoDB Fields Using Projection Queries guide for instructions and examples.

  8. Repeat this process for every collection you want to replicate.

  9. Click the Finalize Your Selections button to save your data selections.
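For readers new to the projection queries mentioned in step 7: a MongoDB projection is a document that either includes fields (value 1) or excludes them (value 0), and _id is kept unless it is explicitly excluded. The sketch below imitates that behavior for top-level fields only. It is a simplified model for illustration, not MongoDB’s or Stitch’s implementation, and the forms Stitch accepts are described in the guide referenced above. The function name applyProjection is hypothetical.

```javascript
// Simplified, top-level-only imitation of MongoDB projection semantics.
function applyProjection(doc, projection) {
  const fields = Object.keys(projection).filter((key) => key !== "_id");
  // A projection is either inclusive (1s) or exclusive (0s), apart from _id.
  const including = fields.some((key) => projection[key] === 1);
  const result = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key === "_id") {
      // _id is kept unless it is explicitly excluded.
      if (projection._id !== 0) result._id = value;
    } else if (including ? projection[key] === 1 : projection[key] !== 0) {
      result[key] = value;
    }
  }
  return result;
}

// Sample document with placeholder values.
const customer = { _id: 100, name: "Finn", email: "finn@example.com", notes: "vip" };
console.log(JSON.stringify(applyProjection(customer, { name: 1 })));
// {"_id":100,"name":"Finn"}
console.log(JSON.stringify(applyProjection(customer, { notes: 0 })));
// {"_id":100,"name":"Finn","email":"finn@example.com"}
```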

Initial and historical replication jobs

After you finish setting up MongoDB, its Sync Status may show as Pending on either the Stitch Dashboard or the Integration Details page.

For a new integration, a Pending status indicates that Stitch is in the process of scheduling the initial replication job for the integration. This may take some time to complete.

Free historical data loads

The first seven days of replication, beginning when data is first replicated, are free. Rows replicated from the new integration during this time won’t count towards your quota. Stitch offers this as a way of testing new integrations, measuring usage, and ensuring historical data volumes don’t quickly consume your quota.


MongoDB replication

MongoDB Replication Keys

Unlike Replication Keys for other database integrations, those for MongoDB have special considerations due to MongoDB functionality. For example: MongoDB allows multiple data types in a single field, which can cause records to be skipped during replication.

Refer to the MongoDB Replication Keys guide before you define the Replication Keys for your collections, as incorrectly defining Replication Keys can cause data discrepancies.
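One way to picture the mixed-type problem: MongoDB range comparisons are type-sensitive, so a query for values greater than a numeric bookmark does not match the same field stored as a string. The sketch below models that behavior in plain JavaScript. It is a simplified illustration under that assumption, not the query Stitch actually runs, and the function name recordsAfterBookmark and the field name updated are hypothetical.

```javascript
// Simplified model of a type-sensitive range comparison: only values of the
// same type as the bookmark are compared against it.
function recordsAfterBookmark(records, key, bookmark) {
  return records.filter(
    (record) => typeof record[key] === typeof bookmark && record[key] > bookmark
  );
}

// Sample records; the second stores the Replication Key as a string.
const records = [
  { _id: 1, updated: 1524038245 },
  { _id: 2, updated: "1524038300" },
  { _id: 3, updated: 1524038400 },
];

const replicated = recordsAfterBookmark(records, "updated", 1524038000);
console.log(JSON.stringify(replicated.map((record) => record._id))); // [1,3]
```

Record 2 is newer than the bookmark but is never selected, which is the kind of silent gap the Replication Keys guide warns about.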

Heavily nested data and destination column limits

MongoDB documents can contain heavily nested data, meaning an attribute can contain many other attributes.

If your destination doesn’t natively support nested data structures, Stitch will de-nest them to load them into the destination. Depending on how deeply nested the data is and the destination’s per-table column limit, Stitch may encounter issues when loading heavily nested data.

Refer to the Nested Data Structures guide for more info and examples.
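As a rough illustration of why nesting and column limits interact, the sketch below flattens a nested document into one column per leaf attribute and compares the column count to a limit. The double-underscore separator, the handling of arrays (left as single values), and the function names are assumptions made for this example; the Nested Data Structures guide describes what Stitch actually does.

```javascript
// Assumption: nested keys are joined with "__"; arrays are kept as single values.
// Intended for plain JSON-like documents.
function flatten(doc, prefix = "") {
  const columns = {};
  for (const [key, value] of Object.entries(doc)) {
    const name = prefix ? `${prefix}__${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      // Recurse into nested objects so each leaf becomes its own column.
      Object.assign(columns, flatten(value, name));
    } else {
      columns[name] = value;
    }
  }
  return columns;
}

// Hypothetical check against a destination's per-table column limit.
function exceedsColumnLimit(doc, limit) {
  return Object.keys(flatten(doc)).length > limit;
}

// Sample document with placeholder values.
const customer = { _id: 100, name: "Finn", address: { city: "Ooo", geo: { lat: 1.5, lon: 2.5 } } };
console.log(JSON.stringify(Object.keys(flatten(customer))));
// ["_id","name","address__city","address__geo__lat","address__geo__lon"]
console.log(exceedsColumnLimit(customer, 4)); // true
```

Three top-level attributes become five columns here; documents with many nested attributes grow the same way.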


Troubleshooting

SSL Connection Errors

Prematurely reached end of file/stream

Applicable only to MongoDB integrations, this error usually means that SSL has been incorrectly configured.

Connecting a database integration to Stitch via SSL has two parts: configuration on the database’s server and in the Stitch app. For the connection to be successful, the settings in both Stitch and on the database server must align.

For example, if a MongoDB server doesn’t support SSL connections but the SSL option is checked in Stitch, the connection will fail.

First, verify whether the MongoDB server is configured to support SSL connections. Then:

  • If SSL connections aren’t supported, make sure the Connect using SSL box in Stitch is unchecked and try saving the integration again.

  • If SSL connections are required, make sure the Connect using SSL box in Stitch is checked and try saving the integration again.

Fields Missing from Replication Key Menu

If fields you expect to see are missing from a collection’s Replication Key menu, the fields may not be indexed. Refer to the MongoDB Replication Keys guide for more info.

