When data is loaded into your destination, Stitch will create some additional columns and tables. These columns and tables are used not only in the replication process, but to provide additional insight and transparency about your data as it moves through Stitch.

In this guide, we’ll cover:


System columns

When Stitch loads data into a destination table, two types of columns will be created: The columns you set to replicate, and Stitch system columns. Stitch system columns are prepended with _sdc.

These columns contain metadata about records as they move through Stitch’s replication process, such as when they were extracted from the source or batched for loading.

The system columns Stitch adds to a table depend on table and integration type. System columns can fall into one of the following categories:

Note: Removing or blocking access to system columns will cause replication issues. Stitch relies on these columns to correctly replicate and load your data.

All integration tables

These columns are applicable to all tables and integration types. Unless noted, every column in this list will be present in every integration table created by Stitch.

Note: These columns aren’t applicable to system tables.

Column name Data type Description
_sdc_extracted_at TIMESTAMP

Applicable only to certain Singer-backed integrations. Timestamp indicating when Stitch extracted the record from the source.

Example data: 2019-08-08 14:52:05.773+00
_sdc_received_at TIMESTAMP

Timestamp indicating when Stitch received the record for loading.

Example data: 2019-08-08 14:54:39.109+00
_sdc_batched_at TIMESTAMP

Timestamp indicating when Stitch loaded the batch the record was a part of into the destination.

Example data: 2019-08-08 14:55:08.82+00
_sdc_sequence INTEGER

A Unix epoch (in nanoseconds) that indicates the order in which data points were considered for loading.

Example data: 1565276078922000095
_sdc_table_version INTEGER

Indicates the version of the table. This column is used to determine when to issue TRUNCATE commands during loading, where applicable.

Example data: 0
__sdc_primary_key STRING

Applicable only if Stitch doesn’t detect a Primary Key in the source table. Stitch will use this column to de-dupe data.

Example data: 5d8b9a05-33cc-4d5f-8163-4474814b46c6

Subtables

These columns are applicable to integration subtables created by Stitch. Subtables are created when Stitch loads nested data into a destination that doesn’t support nested data structures. Refer to the Nested JSON Data Structures and Row Count Impact guide for more info and examples.

Column name Data type Description
_sdc_source_key_[name] VARIES

Forms part of a composite Primary Key for the table. It contains the top level table’s Primary Key and will also inherit the Primary Key’s data type.

For example: There is a top-level table (orders) with a Primary Key of order_id (INTEGER). If a subtable were created from orders, this column would be named _sdc_source_key_order_id, be typed as an INTEGER, and contain the same values.

Example data: 885035841
_sdc_level_#_id INTEGER

Forms part of a composite Primary Key for the table and can be used to associate further nested records to the parent.

Values in this column will auto-increment for each unique record in the table, beginning with 0. When used with the _sdc_source_key_[name] column(s), a unique identifier for the row is created.

Depending on the level of nesting, there may be multiple _sdc_level_id columns in a subtable.

Example data: 0

Database integrations

These columns are applicable only to database integrations.

Column name Data type Description
_sdc_replication_id STRING

Contains an auto-generated ID that Stitch injects when a Primary Key isn’t detected. Stitch will use this column to de-dupe data.

Example data: 87329b9c-c61f-4c45-847b-407ee8bab7bd
_sdc_source_file STRING

Applicable only to Intacct and Responsys integrations, this column contains the file path on the SFTP server of the table’s source file.

Example data: store/customers.csv
_sdc_source_lineno INTEGER

Applicable only to Intacct and Responsys integrations, this column contains the row number for the record in the source file.

Example data: 5
_sdc_deleted_at TIMESTAMP

Applicable only to database integrations that support Log-based Replication. Timestamp indicating when Stitch detected a DELETE event for the record while reading from the database’s binary log. Otherwise, this field will be null.

Additionally, this column will not populate until after the initial historical replication job has been completed. Stitch can only detect delete row events while reading from binary logs, which are not used during initial replication jobs.

Note: Only certain event types are supported for Log-based Replication. Refer to the Deleted record handling guide for more info.

Example data: 2019-08-08 14:49:38.593+00
_sdc_extra STRING

Applicable only to Amazon S3 CSV integrations. An array of strings containing:

  • Key-value pairs of duplicate columns, and/or
  • Extra values in a record that aren’t associated with a column

Refer to the Amazon S3 CSV documentation for more info.

Example data: [{"name":"Jake the human"},{"no_headers":["16"]}]

SaaS integrations

These columns are applicable only to SaaS integrations.

Column name Data type Description
_sdc_customer_id STRING

Applicable only to Google Ads integrations, this column contains the Ads Account ID the data in the row pertains to.

For example: The ad_performance_report table may contain data from several Ads accounts. The value in the _sdc_customer_id column allows you to associate records with the correct Ads account.

Example data: 4224806558
_sdc_record_hash STRING

Applicable only to Google Analytics (v1), Google Ads (v1) and NetSuite Suite Analytics integrations, this column is a Stitch-generated SHA 256 hash that should be used as a Primary Key.

For Google Analytics integrations, the hash consists of a UTF-8 encoded JSON list containing:

  • The account ID, web property ID, and profile ID of the associated report
  • Pairs of ga:dimension_name and dimension_value
  • The start_date and end_date values for the record, in YYYY-mm-dd format

For Google Ads integrations, the hash consists of a UTF-8 encoded JSON list containing the segments and attributes in the report.

For NetSuite Suite Analytics integrations, the hash consists of the values of the columns designated as Primary Keys by NetSuite for the table.

Example data: 070265d7b7de40082271712d39839cbde58f135e5145d919856aa6737816becc
_sdc_report_datetime DATE-TIME

Applicable only to Google Ads and Campaign Manager inetgrations, this column indicates the starting time of the Stitch job that extracted the relevant row’s data.

Example data: 2019-08-08 14:49:38.593+00

System tables

In addition to the _sdc columns, Stitch will create the following system tables in your destination:

Table name Description
_sdc_rejected

This table acts as a log for a given integration’s rejected records. Every integration schema created by Stitch will contain this table.

Note: For destinations that don’t allow leading underscores in object names, this table will be named sdc_rejected.

Refer to the _sdc_rejected documentation for more info and column descriptions.
_sdc_primary_keys

Applicable only to Google BigQuery (v2) and Microsoft Azure Synapse Analytics destinations, the _sdc_primary_keys table contains a list of the tables in an integration schema and the columns those tables use as Primary Keys. Every integration schema created by Stitch will contain this table.

Refer to the _sdc_primary_keys documentation for more info and column descriptions.


Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.