data.world Destination | Stitch Documentation

Data.world helps you host and share your data, collaborate with your team, and capture context and conclusions as you work.

This guide serves as a reference for version 1 of Stitch’s data.world destination.

Details and features

Stitch features
Destination details
data.world pricing

Stitch features

High-level details about Stitch’s implementation of data.world, such as supported connection methods, availability on Stitch plans, etc.

Release status	Released
Stitch plan availability	All Stitch plans
Stitch supported regions	North America (AWS us-east-1). Operating regions determine the location of the resources Stitch uses to process your data. Learn more.
Supported versions	Not applicable
Connect API availability	Unsupported. This version of the data.world destination is not currently available in Stitch’s Connect API.
SSH connections	Unsupported. Stitch does not support using SSH tunnels to connect to data.world destinations.
SSL connections	Supported. Stitch will attempt to use SSL to connect by default. No additional configuration is needed.
VPN connections	Not applicable
Static IP addresses	Supported. This version of the data.world destination has static IP addresses that can be whitelisted.
Default loading behavior	Upsert. Note: Append-Only loading is used unless all conditions for Upsert are met. Learn more.
Nested structure support	Supported. Learn more.
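
The difference between the two loading behaviors named in the table can be illustrated with a minimal sketch. It is not Stitch's implementation: the function names and the `id` key field are hypothetical, and it assumes each record carries a field that identifies it uniquely.

```python
from typing import Any

Record = dict[str, Any]


def load_upsert(table: list[Record], batch: list[Record], key: str) -> list[Record]:
    """Replace existing rows that share a key value; add rows with new keys."""
    by_key = {row[key]: row for row in table}
    for record in batch:
        by_key[record[key]] = record  # the newer version of a record wins
    return list(by_key.values())


def load_append_only(table: list[Record], batch: list[Record]) -> list[Record]:
    """Keep every version of every record; nothing is overwritten."""
    return table + batch


# Hypothetical sample data: record 2 changes, record 3 is new.
existing = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}]
incoming = [{"id": 2, "name": "Globex Corp"}, {"id": 3, "name": "Initech"}]

upserted = load_upsert(existing, incoming, key="id")
appended = load_append_only(existing, incoming)
```

With this sample data, the upserted table holds one row per `id`, while the appended table keeps both versions of record 2.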

Destination details

Details about the destination, including object names, table and column limits, reserved keywords, etc.

Note: Exceeding the limits noted below will result in loading errors or rejected data.

Maximum record size	20MB
Table name length	None
Column name length	None
Maximum columns per table	None
Maximum table size	None
Maximum tables per database	None
Case sensitivity	Not applicable
Reserved keywords	None
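
As a rough pre-check against the maximum record size, a record's serialized size can be estimated before it is replicated. This is a sketch under assumptions: it reads "20MB" as 20 MiB and measures the record as UTF-8 encoded JSON, neither of which is confirmed by this guide. The function names are hypothetical.

```python
import json

# Assumption: "20MB" is interpreted as 20 MiB; the exact byte count
# the destination enforces is not stated in this guide.
MAX_RECORD_BYTES = 20 * 1024 * 1024


def record_size_bytes(record: dict) -> int:
    """Size of the record when serialized as UTF-8 encoded JSON."""
    return len(json.dumps(record, ensure_ascii=False).encode("utf-8"))


def fits_record_limit(record: dict, limit: int = MAX_RECORD_BYTES) -> bool:
    """Return True if the serialized record is within the size limit."""
    return record_size_bytes(record) <= limit


small = {"id": 1, "note": "hello"}
large = {"id": 2, "blob": "x" * (MAX_RECORD_BYTES + 1)}
```

Here `fits_record_limit(small)` is true and `fits_record_limit(large)` is false, because the `blob` value alone exceeds the assumed limit.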

data.world pricing

While Stitch is compatible with all of data.world’s plans, keep in mind that the number of private projects/datasets and the maximum size of those projects vary by plan.

For more information on data.world’s plans, refer to their pricing page.

Replication process overview

Steps 1 & 2: Data extraction

Stitch requests and extracts data from a data source. Refer to the System overview guide for a more detailed explanation of the Extraction phase.

Step 3: Load data into shared S3 bucket

Stitch loads the raw JSON data into an Amazon S3 bucket shared between Stitch and data.world.

Step 4: Notify data.world

After Stitch successfully finishes loading into Amazon S3, a webhook notification is sent to data.world to trigger the retrieval process.

Steps 5 & 6: data.world retrieves data from S3

Data.world retrieves the data destined for your account from the shared Amazon S3 bucket.

Step 7: data.world loads your data

Data.world loads the data into your data.world account. Refer to the Loading behavior section below for more info on how your data will be structured in data.world.
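
The ordering of steps 3 through 7 can be sketched with in-memory stand-ins. Everything here is hypothetical: `SharedBucket`, `Destination`, `replicate`, and the object key are illustrative names, not part of Stitch's or data.world's APIs, and no real S3 or webhook calls are made.

```python
import json
from typing import Callable


class SharedBucket:
    """In-memory stand-in for the shared Amazon S3 bucket."""

    def __init__(self) -> None:
        self.objects: dict[str, str] = {}

    def put(self, key: str, body: str) -> None:
        self.objects[key] = body

    def get(self, key: str) -> str:
        return self.objects[key]


class Destination:
    """Stand-in for data.world: retrieves from the bucket when notified."""

    def __init__(self, bucket: SharedBucket) -> None:
        self.bucket = bucket
        self.loaded: dict[str, list[dict]] = {}

    def on_notification(self, key: str) -> None:
        # Steps 5 & 6: retrieve the raw JSON from the shared bucket.
        records = json.loads(self.bucket.get(key))
        # Step 7: load the records into the account.
        self.loaded[key] = records


def replicate(records: list[dict], key: str, bucket: SharedBucket,
              notify: Callable[[str], None]) -> None:
    # Step 3: load the raw JSON into the shared bucket.
    bucket.put(key, json.dumps(records))
    # Step 4: notify the destination only after the load has finished.
    notify(key)


bucket = SharedBucket()
destination = Destination(bucket)
extracted = [{"id": 1, "name": "Acme"}]  # Steps 1 & 2 happen upstream
replicate(extracted, "hubspot/companies.json", bucket, destination.on_notification)
```

The point of the sketch is the sequencing: the notification is sent only once the data is in the bucket, so the destination never tries to retrieve data that has not been written yet.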

Loading behavior

When data.world retrieves an integration’s data from the Amazon S3 bucket, it loads the data into your data.world account as a project with child datasets.

For each integration you connect to Stitch, a project with the same name will be created in data.world. The tables you set to replicate will be stored as JSON datasets within the project.

For example: If you named an integration HubSpot in Stitch and selected the companies, contact_lists, and contacts tables to replicate, your data.world account would contain a project named HubSpot with one JSON dataset for each of the three tables.

The dataset schema will contain the attributes you set to replicate in Stitch along with a few _sdc columns.
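
The mapping from an integration to a project and its datasets can be written out as a small sketch. The `destination_layout` function and the dictionary shape are hypothetical, and naming each dataset after its source table is an assumption made for illustration.

```python
def destination_layout(integration_name: str, tables: list[str]) -> dict:
    """Describe the project and datasets expected for one integration."""
    return {
        # The project takes the same name as the integration.
        "project": integration_name,
        # Assumption: each dataset is named after its source table.
        "datasets": [{"name": table, "format": "JSON"} for table in tables],
    }


layout = destination_layout("HubSpot", ["companies", "contact_lists", "contacts"])
```

For the HubSpot example, `layout` describes one project named `HubSpot` with three JSON datasets.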

Incompatible sources

No compatibility issues have been discovered between data.world and Stitch's integration offerings.

See all destination and integration incompatibilities.

Transformations

System columns
JSON structures

System columns

The dataset schema will contain the attributes you set to replicate in Stitch along with a few _sdc columns. These are system columns generated by Stitch for replicating data.
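
When working with a replicated record, it can be useful to separate the attributes you selected from the system columns. The sketch below relies only on the `_sdc` prefix. The `split_columns` function is hypothetical, and the column names and values in the sample record are illustrative, since this guide does not list the exact system columns.

```python
SYSTEM_PREFIX = "_sdc"


def split_columns(record: dict) -> tuple[dict, dict]:
    """Split a record into (replicated attributes, system columns)."""
    system = {k: v for k, v in record.items() if k.startswith(SYSTEM_PREFIX)}
    attributes = {k: v for k, v in record.items() if not k.startswith(SYSTEM_PREFIX)}
    return attributes, system


# Illustrative record; the _sdc column names and values are examples only.
record = {
    "id": 1,
    "name": "Acme",
    "_sdc_received_at": "2024-01-01T00:00:00Z",
    "_sdc_sequence": 1700000000000,
}

attributes, system = split_columns(record)
```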

For information about the data available in SaaS integrations, including column descriptions and potential data values, refer to the Schema section of any of our integration docs.

JSON structures

All replicated data is stored as JSON, both in Amazon S3 and in data.world after the final load is complete. This means that nested structures are stored intact.
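
To show what "stored intact" means in practice, here is a small sketch using a hypothetical record. Serializing it to JSON and reading it back preserves the nested objects and arrays, so nested values stay reachable by path rather than being flattened into separate columns or tables.

```python
import json

# Hypothetical record with a nested object and an array.
record = {
    "id": 1,
    "properties": {"name": {"value": "Acme"}, "tags": ["customer", "emea"]},
}

stored = json.dumps(record)    # what is written as JSON
restored = json.loads(stored)  # what is read back

# Nested values are still addressed by their original path.
name = restored["properties"]["name"]["value"]
```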
