Depending on the Replication Method being used and how records are deleted in the source, deletes may not be captured during the replication process.


Deletion methods

There are two methods that can be used to delete a source record:

  • Soft deletes, which will leave a record in the source and use a flag to indicate deletion, such as is_deleted or deleted_on. If the delete event updates the record’s Replication Key value, Stitch will detect and replicate the changes.
  • Hard deletes, which completely remove records from the source. It’s as if the record never existed. If using Key-based Incremental Replication, this will remove the record’s Replication Key value, which Stitch uses to identify new and updated records. Without a Replication Key value to check, Stitch can’t identify the change and update the record in the destination.

Delete support overview

In the table below are each of Stitch’s Replication Methods and the level at which each deletion method is supported.

Click the Replication Method name to check out examples of how each deletion method works with that specific Replication Method.

Replication Method Soft deletes Hard deletes
Full Table Supported

Record will have an updated deleted flag value, if a flag is available in the source.

Sometimes supported

Record may remain or be removed.

Refer to the Full Table Replication section for more info and examples.

Key-based Incremental Supported

Record will have an updated deleted flag value, if a flag is available in the source.

Not supported

Record will have an updated deleted flag value, if a flag is available in the source.

Log-based Incremental Supported

Record will have an updated deleted flag value, if a flag is available in the source.

Sometimes supported

Record will remain in the destination. The _sdc_deleted_at column will be populated with a timestamp value and all columns except for the Primary Key will be overwritten.

Note: Only deletes completed as part of a supported event type will be captured.


Full Table Replication

Click the tabs below to view examples for how each deletion method is handled using Full Table Replication.

Soft deletes with Full Table

In this example, we’ll demonstrate what happens when two records are soft-deleted in the source. This table uses the is_deleted column as a deletion flag, which will be true if the record is deleted.

Soft delete: Initial job

During the initial job:

  • In the source: A table contains three records: id: 1, id: 2, and id: 3.

  • During the job: Stitch replicates the entire table.

  • In the destination: Stitch loads the entire table in full.

Source
id is_deleted
1 false
2 false
3 false
Destination
id is_deleted
1 false
2 false
3 false
Soft delete: Second job

During the second job:

  • In the source: Records 1 and 3 are soft deleted (is_deleted: true).

  • During the job: Stitch replicates the entire table.

  • In the destination: Stitch overwrites the table in the destination, resulting in the is_deleted values for records 1 and 3 being updated.

Source
id is_deleted
1 true
2 false
3 true
Destination
id is_deleted
1 true
2 false
3 true
Hard deletes with Full Table

In this example, we’ll demonstrates what happens when a record is hard-deleted - or removed entirely - from the source.

Hard delete: Initial job

During the initial job:

  • In the source: A table contains three records: id: 1, id: 2, and id: 3.

  • During the job: Stitch replicates the entire table.

  • In the destination: Stitch loads the table in full.

Source
id name
1 Finn
2 Jake
3 Bubblegum
Destination
id name
1 Finn
2 Jake
3 Bubblegum
Hard delete: Second job

During the second job:

  • In the source: Before the second job begins, record 2 is hard deleted and removed entirely from the source.

  • During the job: Stitch replicates the table in full.

  • In the destination: Stitch deletes and recreates the table in a single transaction, loading the replicated table in full.

Source
id name
1 Finn
3 Bubblegum
Destination
id name
1 Finn
3 Bubblegum
Hard deletes with Append-Only loading

In this example, we’ll demonstrates what happens when a record is hard-deleted from the source and loaded into a destination using Append-Only loading.

How hard deletes are handled in an Append-Only destination depends on the integration:

  • If you can define Replication Methods for an integration, Stitch will delete and re-create tables in a single transaction. This is the scenario described in the Hard deletes tab for this section.
  • Otherwise, Stitch will append the table in full to the destination table. This is the scenario we’ll outline here.
Hard delete with Append-Only loading: Initial job

During the initial job:

  • In the source: A table contains three records: id: 1, id: 2, and id: 3.

  • During the job: Stitch replicates the entire table.

  • In the destination: Stitch loads the table in full.

Source
id name
1 Finn
2 Jake
3 Bubblegum
Destination
id name
1 Finn
2 Jake
3 Bubblegum
Hard delete with Append-Only loading: Second job

During the second job:

  • In the source: Before the second job begins, record 2 is hard deleted and removed entirely from the source.

  • During the job: Stitch replicates the table in full.

  • In the destination: Stitch appends the replicated data to the end of the table in the destination. In this case, records 1 and 3 would be appended.

Source
id name
1 Finn
3 Bubblegum
Destination
id name
1 Finn
2 Jake
3 Bubblegum
1 Finn
3 Bubblegum

Key-based Incremental Replication

Click the tabs below to view examples for how each deletion method is handled using Key-based Incremental Replication.

Soft deletes with Key-based Incremental

In this example, we’ll demonstrate what happens when two records are soft-deleted in the source. This table uses the is_deleted column as a deletion flag, which will be true if the record is deleted.

Soft delete: Initial job

During the initial job:

  • In the source: A table contains three records: id: 1, id: 2, and id: 3.

  • During the job: Depending on the type of integration, the initial extraction will vary slightly:

    • Database integration: Stitch replicates the table in full.
    • SaaS integration: Stitch uses the Start Date defined in the Integration Settings as the initial Replication Key (updated_at) value. Records with an updated_at value greater than or equal to the Start Date are replicated.

    For all integrations, Stitch saves updated_at: 2018-07-27 12:07:00 as the maximum Replication Key value.

  • In the destination: Stitch loads the table.

Source
id updated_at is_deleted
1 2018-07-27 12:00:00 false
2 2018-07-27 12:05:00 false
3 2018-07-27 12:07:00 false
Destination
id updated_at is_deleted
1 2018-07-27 12:00:00 false
2 2018-07-27 12:05:00 false
3 2018-07-27 12:07:00 false
Soft delete: Second job

During the second job:

  • In the source: Records 1 and 3 are soft deleted (is_deleted: true). The Replication Key value (updated_at) is updated.

  • During the job: Stitch replicates all records with an updated_at value >= 2018-07-27 12:07:00.

  • In the destination: The is_deleted and updated_at values are updated for records 1 and 3.

Source
id updated_at is_deleted
1 2018-07-28 04:00:00 true
2 2018-07-27 12:05:00 false
3 2018-07-28 04:05:00 true
Destination
id updated_at is_deleted
1 2018-07-28 04:00:00 true
2 2018-07-27 12:05:00 false
3 2018-07-28 04:05:00 true
Hard deletes with Key-based Incremental

In this example, we’ll demonstrates what happens when a record is hard-deleted - or removed entirely - from the source.

Note: To remove hard deletes from destination tables replicated in this way, you’ll need to:

  1. Temporarily drop the table in the destination
  2. Reset the table’s Replication Keys
  3. Wait for Stitch to re-create the table during the next replication job

These steps must be done in this order, or you may cause issues with replication.

Hard delete: Initial job

During the initial job:

  • In the source: A table contains three records: id: 1, id: 2, and id: 3.

  • During the job: Depending on the type of integration, the initial extraction will vary slightly:

    • Database integration: Stitch replicates the table in full.
    • SaaS integration: Stitch uses the Start Date defined in the Integration Settings as the initial Replication Key value. Records with a Replication Key value greater than or equal to the Start Date are replicated.

    For all integrations, Stitch saves updated_at: 2018-07-27 12:07:00 as the maximum Replication Key value.

  • In the destination: Stitch loads the table.

Source
id updated_at
1 2018-07-27 12:00:00
2 2018-07-27 12:05:00
3 2018-07-27 12:07:00
Destination
id updated_at
1 2018-07-27 12:00:00
2 2018-07-27 12:05:00
3 2018-07-27 12:07:00
Hard delete: Second job

During the second job:

  • In the source: Before the second job begins, record 2 is hard deleted and removed entirely from the source. Records 1 and 3 are updated.

  • During the job: Stitch replicates all records with an updated_at value >= 2018-07-27 12:07:00.

  • In the destination: The updated_at values for records 1 and 3 are updated.

    Record 2 remains in the destination table.

Source
id updated_at
1 2018-07-28 04:15:00
3 2018-07-28 04:07:00
Destination
id updated_at
1 2018-07-28 04:15:00
2 2018-07-27 12:05:00
3 2018-07-28 04:07:00

Log-based Incremental Replication

Click the tabs below to view examples for how each deletion method is handled using Log-based Incremental Replication.

Soft deletes with Log-based Incremental

In this example, we’ll demonstrate what happens when two records are soft-deleted in a source table using Log-based Incremental.

Soft delete: Initial job

During the initial job:

  • In the source: A table contains three records: id: 1, id: 2, and id: 3.

  • During the job: Stitch uses SELECT replication to replicate the table in full.

  • In the destination: Stitch loads the table in full.

Source
id name is_deleted
1 Finn false
2 Jake false
3 Bubblegum false
Destination
id name is_deleted
1 Finn false
2 Jake false
3 Bubblegum false
Soft delete: Second job

During the second job:

  • In the source: Records 1 and 3 are soft deleted (is_deleted: true) using an UPDATE event.

    As UPDATE is a supported event type for Log-based Replication, log messages describing the modified is_deleted values for records 1 and 3 are written to the database’s binary log.

  • During the job: After the historical replication job, Stitch will read updates for the table from the database’s binary logs.

    Stitch reads the log messages for records 1 and 3 and replicates the updated records.

  • In the destination: The is_deleted and updated_at values are updated for records records 1 and 3.

Source
id name is_deleted
1 Finn true
2 Jake false
3 Bubblegum true
Destination
id name is_deleted
1 Finn true
2 Jake false
3 Bubblegum true
Hard deletes with Log-based Incremental

In this example, we’ll demonstrate what happens when a record is hard deleted - or removed entirely - from a source table using Log-based Incremental. In this situation, Stitch will use a TIMESTAMP column named _sdc_deleted_at to indicate when records have been identified as deleted in the source.

Note: Identifying hard deleted records is dependent on how the record is deleted, as not all event types are recorded by databases in binary logs. For example: If a record is deleted using TRUNCATE in a MySQL database, Stitch will not detect the deletion.

The following event types will be detected by Log-based Incremental for database integrations that support this Replication Method:

  • INSERT
  • UPDATE
  • DELETE
Hard delete: Initial job

During the initial job:

  • In the source: A table contains three records: id: 1, id: 2, and id: 3.

  • During the job: Stitch uses SELECT replication to replicate the table in full.

  • In the destination: Stitch loads the table in full.

Source
id name
1 Finn
2 Jake
3 Bubblegum
Destination
id name _sdc_deleted_at
1 Finn
2 Jake
3 Bubblegum
Hard delete: Second job

During the second job:

  • In the source: Record 2 is hard deleted using a DELETE event.

    As DELETE is a supported event type for Log-based Replication, log messages describing the delete of record 2 are written to the database’s binary log.

  • During the job: After the historical replication job, Stitch will read updates for the table from the database’s binary logs.

    Stitch reads the log message for record 2 and identifies the record as deleted.

  • In the destination: Record 2 is overwritten with the exception of its Primary Key value. Stitch will place a timestamp in the _sdc_deleted_at column to indicate when the deletion took place.

Source
id name
1 Finn
2 Jake
3 Bubblegum
Destination
id name _sdc_deleted_at
1 Finn
2 2018-07-28 04:05:00
3 Bubblegum


Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.