Stitch is releasing an updated version of our BigQuery destination that allows users to pick append-only or insert/update ("upsert") loading behavior for data flowing from Stitch integrations.
Since Stitch was founded, we've had a front-row seat to the exploding market for cloud data warehouses, a segment projected to reach nearly $35 billion in global spend by 2025. I'd even wager that the Stitch support team spends more time talking about cloud data warehousing than anyone not named AWS, Google, or Snowflake.
In the course of those conversations, we've collected a fair bit of feedback. One of the most common requests comes from Google BigQuery users, who want Stitch to update existing rows in BigQuery directly (upsert behavior) so they can avoid writing deduplication logic.
Today we're announcing a revamped BigQuery destination in open beta with upsert and append-only options, along with other improvements, including support for authentication via Google service accounts, full Google Cloud Storage region support, and programmatic management via the Stitch Connect API.
Append-only versus upsert behavior
When we released our original BigQuery destination in 2016, Stitch supported loading data in an append-only manner. Google was adding BigQuery functionality at the time, but Stitch was a young product, and we had limited data to tell us how BigQuery users wanted Stitch to behave with their warehouse.
With the addition of upsert behavior in the latest version of our BigQuery destination, we now have two options for users to specify how Stitch loads data:
- Upsert behavior updates existing rows in tables with defined primary keys. Only a single version of each row exists in the table, so the table is always a snapshot of the most recent data loaded by Stitch.
- Append-only behavior doesn't update existing rows. New versions of a row are appended to the existing table, creating a record of how a row has changed over time. While append-only behavior provides visibility into changes over time, getting to a snapshot of the most recent data requires deduplication logic that can become complicated.
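To make the difference concrete, here is a minimal in-memory sketch of the two loading behaviors. It is an illustration only, not Stitch's implementation: the function names, the `id` primary key, and the `updated_at` version column are all hypothetical, and plain Python lists stand in for warehouse tables.

```python
from typing import Any

Row = dict[str, Any]


def load_append_only(table: list[Row], batch: list[Row]) -> list[Row]:
    """Append every incoming row; earlier versions of a row are kept."""
    return table + batch


def load_upsert(table: list[Row], batch: list[Row], primary_key: str) -> list[Row]:
    """Replace rows whose primary key already exists; insert the rest."""
    latest = {row[primary_key]: row for row in table}
    for row in batch:
        latest[row[primary_key]] = row
    return list(latest.values())


def deduplicate(table: list[Row], primary_key: str, version_column: str) -> list[Row]:
    """Reduce an append-only table to the newest version of each row."""
    latest: dict[Any, Row] = {}
    for row in table:
        key = row[primary_key]
        if key not in latest or row[version_column] >= latest[key][version_column]:
            latest[key] = row
    return list(latest.values())


# Hypothetical data: two customers are loaded, then customer 1 changes.
initial = [
    {"id": 1, "status": "trial", "updated_at": 1},
    {"id": 2, "status": "trial", "updated_at": 1},
]
batch = [{"id": 1, "status": "paid", "updated_at": 2}]

appended = load_append_only(initial, batch)          # history is preserved
upserted = load_upsert(initial, batch, "id")         # snapshot of current state
snapshot = deduplicate(appended, "id", "updated_at") # extra step for append-only
```

In this sketch the append-only table ends up with three rows and the upsert table with two. Running `deduplicate` over the append-only table recovers the same snapshot that upsert produces directly, which is the extra step upsert loading lets you skip.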
We're excited to offer both options to our growing list of BigQuery users. Check out our loading behavior reference guide for more info.
Authentication, API access, and more
In addition to upsert loading behavior, we've added three other new features to improve the Stitch–BigQuery experience:
- Stitch users can now authenticate with Google service accounts. Service account authentication allows Stitch to remove individual users from the authentication process and offers permission management from the GCP console.
- Our BigQuery destination creates a Google Cloud Storage bucket that Stitch uses as part of the replication process. Stitch supports all GCS locations and fully manages the storage bucket on a user's behalf.
- With this release, Stitch customers on enterprise plans can configure their BigQuery destination programmatically using the Stitch Connect API. Read about our enterprise plans and contact our sales team to learn more.
That's all, folks! If you still don't have a Stitch account, sign up for free and get data moving to BigQuery in just a few clicks.