The Stitch Import API is a REST API that allows you to push arbitrary data into your data warehouse. Once the data enters the Import API, it’ll be processed and sent through Stitch like data from any other integration.

The Import API accepts JSON or Transit and returns JSON for all of its methods. Each method uses a standard HTTP verb (GET/POST) and standard HTTP response codes for returning statuses.

Before you get started, consider where the data that you want to push currently lives: is it already in a database or SaaS integration we support? A few of our SaaS integrations allow you to sync data at the tabular level; all database integrations (except for Mongo) can be synced to the columnar level. In addition, you can define Replication Methods for tables that come from a database integration.

In short, unless you want complete control over data replication, it’s usually easier to use the native integration instead of using the Import API.


Security

Data submitted to the Import API is handled according to Stitch’s security standards.


Prerequisites

To use the Import API:

  1. You must have a Stitch account. The Import API integration is available to all Stitch users, whether you’re on a Free or Paid tier.
  2. You’ll need a bit of technical know-how. Anyone comfortable writing and maintaining a small Ruby or PHP script should be more than qualified.

Request Requirements

Requests sent to the Import API:

  1. Must be valid JSON or Transit and specified in the request header. See the Upsert section for more info.

    For more info on working with Transit, check out the Transit GitHub repo.

  2. Must contain all of the required request body fields. See the Upsert section for more info.
  3. Must be less than 4MB
  4. Must not contain arrays of data with more than 10,000 individual data points

More info about request requirements can be found in the Upsert section.


Authentication

Authentication with the Import API is done with a single API access token placed in the Authorization header of your request and your client ID, which will be used in the request body.

Step 1: Generate an API Access Token

You can generate an API access token by logging into your Stitch account and adding an Import API integration:

Add Import API as a Stitch Data Source

  1. On the Stitch Dashboard page, click the Add Integration button.
  2. Click the Import API icon.

  3. Enter a name for the integration. This is the name that will display on the for the integration; it’ll also be used to create the schema in your data warehouse.

    For example, the name “Stitch Import API” would create a schema called stitch_import_api in the data warehouse. Note that the schema name cannot be changed after you save the integration.

  4. Click the Save and Generate Token button.

Your API access token has write access to your Stitch account, so keep it secret, keep it safe.

Note that if someone does get ahold of your key, they won’t be able to access your data - they’ll only be able to send it. If at any time your API access token is lost or compromised, you can revoke it and generate a new one.

Example Request Header

The API access token is used to authenticate to the Import API via bearer auth, as demonstrated in the example below:

curl -X POST https://api.stitchdata.com/v2/import/push \
	-H 'Content-Type: application/json' \
	-H 'Authorization: Bearer < [access_token] >' \
	-d @filename

Step 2: Locate Your Client ID

Your client ID is used in the request body in the client_id field to perform the second authentication step for the Import API.

Log into your Stitch account and look at the URL while you’re on the Stitch Dashboard page to find your client ID. It’s the number between client and pipeline:

https://app.stitchdata.com/v2/client/XXXX/pipeline/connections    

    // XXXX is where your client ID will be

Note that your client ID can be between four and six digits.

Example Request Body

[
 {
  "client_id": 4231,
  "table_name": "users",
    "sequence": 100,
  "action": "upsert",
  "key_names": [
   "id"
  ],
  "data": {
   "id": 10,
   "status": "pending"
  }
]


Return Codes

The Import API uses standard HTTP return codes to indicate the status of a request. Generally speaking, codes in:

  • the 2xx range indicate success
  • the 4xx range indicate a bad request - for example: invalid or omitted credentials, a required field is missing, etc.
  • the 5xx range indicate an error on our end. We recommend checking our status page for reported outages if you receive a code in this range. If nothing has been reported and these errors persist, please reach out to our support team.

Your app should handle each of the following return statuses gracefully:

200 - OK Success. New data was added as a result of the request.
201 - Created The request was accepted and will be processed later; new data will be added once the request is processed.
400 - Bad Request The request includes malformed JSON/Transit, did not provide an array of records, or contains more than 20,000 records.
401 - Unauthorized The request did not have a valid API access token. You should generate a new token and re-try your request.
403 - Forbidden The request had a valid API access token, but is not permitted. You should check that the client_id field in your request contains the correct client ID.
405 - Method Not Allowed The request has an HTTP method that is not supported by the endpoint. POST and GET are supported by the Import API. Check that you’re using the correct HTTP method.
413 Request Entity Too Large The size of the request body was over 4MB.
415 Unsupported Media Type The content header of the request did not specify JSON or Transit.
422 Unprocessable Entity The request is missing a required parameter. Check that the array of records in the request contain all of the required request body fields.
503 Service Unavailable The Import API is experiencing problems. Try again later.
504 Gateway Timeout The Import API could not process the request in time. Try again later.

Methods

The Import API supports three methods:

  • Status - The status endpoint can be used to determine if the Import API is operating correctly. This endpoint doesn’t require any authentication.
  • Validate - The validate endpoint can be used to validate requests but will not persist them to Stitch. We recommend using this endpoint for development and testing.
  • Upsert - The upsert endpoint is used to push data into your data warehouse.

Status

The status endpoint can be used to determine if the Import API is operating correctly. This endpoint doesn’t require any authentication.

Example Request

curl -i https://api.stitchdata.com/v2/import/status

200 OK - The Import API is operating correctly.
503 Service Unavailable - The Import API is experiencing problems. Do not attempt to post any data.

Example Response

HTTP/1.1 200 OK
Content-Type: application/json;charset=UTF-8
Date: Thu, 07 Jan 2016 13:13:37 GMT
Server: http-kit
Vary: Accept
Content-Length: 53
Connection: keep-alive

{"name":"pipeline.gate","status":"OK"}

Validate

The validate endpoint can be used to validate requests but will not persist them to Stitch. This endpoint will validate your credentials, meaning you can check your token or client-id while developing and testing. If either credentials are invalid, a 403 Not Authenticated message will be returned.

The behavior of this endpoint mirrors that of the upsert endpoint, with two exceptions:

  • If the request is valid, a 200 OK response will be returned.
  • Regardless of whether Stitch is functional, a 503 Service Unavailable response will never be returned.

Example Request

curl -X POST https://api.stitchdata.com/v2/import/validate \
	-H 'Content-Type: application/json' \
	-H 'Authorization: Bearer < access-token >' \
	-\d @filename

[
   {
      "client_id": 4231,
      "table_name": "users",
      "sequence": 100,
      "action": "upsert"
   }
]

Upsert

The upsert endpoint is used to push data into your data warehouse. In addition to the required request body fields this endpoint will only accept requests that contain the following properties:

  • The data being sent must be valid JSON or Transit AND specified in the request header: Content-Type: application/json
  • If a Primary Key is a string, it must be 256 characters or less.
  • An array of data cannot contain more than 10,000 individual data points.
  • The request body cannot be larger than 4MB.

Example Request Header

curl -X POST https://api.stitchdata.com/v2/import/push \
	-H 'Content-Type: application/json' \
	-H 'Authorization: Bearer < access-token >' \
	-\d @filename

Required Request Body Fields

For the Import API to successfully accept and process your data, your request body must contain all of the request body fields listed below:

client_id This field must contain your client ID.
table_name This field contains the name of the destination table, ex: users. See Defining Tables for more info.
sequence This property tells the Import API the order in which data points in the request body should be considered. See Defining Sequence for more info.
action This field contains the action for the endpoint. For the Upsert endpoint, it will always be upsert.
key_names This field defines the Primary Key and will contain an array of field names that uniquely identify the row that the record belongs to.
data This field contains the data to be upserted into your data warehouse.

If the request was successful, the response will have an HTTP status code of 201 Created and an empty body.

Define Tables

When you push data to an arbitrary table name using the Import API, the table will be generated dynamically in your data warehouse. When creating requests, keep the following in mind:

  1. The Import API doesn’t enforce any limitation on the hierarchy of your tables.
  2. There aren’t any commands to create or destroy a table in the Import API.
  3. You should create one table for each type of data point that you’ll push. For example: if you have customer and product data, you should create customer and product tables.
  4. Each data point pushed to a single table should have the same data structure. For example: if a customers table contains customer_id, name, and email columns, every customer record pushed into this table should contain those columns.
  5. You can push more than one table using the same API access token. Think of it this way: one schema for each API access token. All tables pushed using the same API access token will be housed in the same schema in your data warehouse.

Define the Sequence

Sequence properties communicate the order in which data points should be considered – newer data points can replace older ones, but not vice versa.

Every data point pushed to the Import API must have a sequence property.

A simple solution is just to use the current timestamp, but before doing so, consider the following:

  1. Are the rows being considered frequently updated? Rows that are updated every few milliseconds can result in failure if records with identical key values are pushed simultaneously. This means that records with the same key values cannot be sent during the same clock resolution.

    For example: if the resolution is measured in milliseconds, records with identical key values cannot be sent during the same millisecond.

  2. Are the records coming from multiple sources? If records from multiple sources will be sent to the Import API, the time clocks of these sources must be synchronized. This is especially important if different sources are pushing rows to the same table.

Sequence Example

Take a look at the following example. The first three requests are all for one record (id 10), while the fourth is for a different record (id 22).

If the requests were received in this order:

  • Requests 1 and 2 would continue to Stitch, but not Request 3. This is because Request 3 has a lower sequence value than Request 2.
  • Request 4 would continue to Stitch.
[
 {
  "client_id": 4231,              // Request 1
  "table_name": "users",
  "sequence": 100,
  "action": "upsert",
  "key_names": [
   "id"
  ],
  "data": {
   "id": 10,
   "status": "pending"
  }
 }
]
[
 {
  "client_id": 4231,              // Request 2
  "table_name": "users",
  "sequence": 101,
  "action": "upsert",
  "key_names": [
   "id"
  ],
  "data": {
   "id": 10,
   "status": "canceled"
  }
 }
]
[
 {
  "client_id": 4231,              // Request 3
   "table_name": "users",
   "sequence": 99,
   "action": "upsert",
   "key_names": [
    "id"
   ],
   "data": {
    "id": 10,
    "status": "new"
   }
  }
 ]
[
 {
  "client_id": 4231,              // Request 4
  "table_name": "users",
  "sequence": 90,
  "action": "upsert",
  "key_names": [
   "id"
  ],
  "data": {
   "id": 22,
   "status": "new"
  }
 }
]

Revoke Import API Access Tokens

  1. On the Stitch Dashboard page, click the Import API integration that needs a new token.
  2. When the Integration Details page displays, click Settings.
  3. In the API Access Tokens section, you’ll see a list of tokens currently in use: Revoking an Import API token.
  4. If you haven’t already generated a new token, click Generate to do so now.
  5. Once the new token has been generated, click the Revoke button next to the token you need to revoke.

    Note that the tokens are listed by creation date - if you generated a new token, you’ll want to revoke the oldest token.


Questions? Feedback?

Did this article help? If you have questions or feedback, please reach out to us.

Tags: import api