Need to change the location in which BigQuery stores your data? Using Stitch’s Destination Change feature, you can delete the current destination and connect a new one with the desired data storage location.
Here’s what you need to know to ensure a smooth switch:
Your integrations will be paused. After the switch is complete, you’ll need to manually unpause the integrations you’d like to resume.
Some webhook data may be lost during this process. Because webhook data arrives continuously and in real time, Stitch may miss some of it while the switch is in progress.
Historical data from webhook-based integrations must be either manually backfilled or replayed. Some webhook providers - such as Segment - allow customers on certain plans to ‘replay’ their historical data. This feature varies from provider to provider and may not always be available.
If you don’t have the ability to replay historical webhook data, then it must be manually backfilled after the switch is complete.
We won’t delete or transfer any data from your current BigQuery destination. How you bring historical data into the new instance depends on your needs. See the next section for details.
Step 1: Select a historical data setting
If you need historical data in the new destination, you’ll need to either manually transfer your historical data or queue a historical replication job.
Manually transfer historical data: To accomplish this, you can use Google’s UI to import the existing datasets into the new BigQuery instance. Ensure that all dataset names remain the same during the transfer, or loading errors may occur.
Complete the historical transfer before continuing.
Queue a historical replication job: Initializing a historical replication will re-replicate all historical data from your integrations. For SaaS integrations, Stitch will replicate data beginning with the Start Date currently in the integration’s settings.
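If you go the manual transfer route, it can help to confirm that dataset names match before you continue. The following is a minimal sketch, not a Stitch or Google tool: `compare_dataset_names` and the example dataset names are hypothetical, and it assumes you’ve already collected the dataset names from both instances yourself (for example, by copying them from Google’s UI).

```python
def compare_dataset_names(source_names, target_names):
    """Report dataset names that differ between the old and new instance."""
    source, target = set(source_names), set(target_names)
    only_in_source = source - target
    only_in_target = target - source
    return {
        # Datasets that exist in the old instance but not the new one.
        "missing_in_target": sorted(only_in_source),
        # Datasets that exist only in the new instance.
        "unexpected_in_target": sorted(only_in_target),
        # Pairs that differ only by letter case, a likely sign of a rename typo.
        "case_mismatches": sorted(
            (s, t)
            for s in only_in_source
            for t in only_in_target
            if s.lower() == t.lower()
        ),
    }


# Hypothetical dataset names, for illustration only.
old = ["salesforce", "stripe", "Hubspot"]
new = ["salesforce", "stripe", "hubspot"]
report = compare_dataset_names(old, new)
print(report)
```

An empty report (all three lists empty) suggests the names line up. Anything else is worth fixing before you move on to the next step.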
Define the setting
- From the Stitch Dashboard, click the Destination tab.
- At the bottom of the page, click the Change Destination button.
In the Historical Data section, select how you want data to be replicated to the new destination:
Replicate new data only: Select this option if you manually transferred your historical data or you don’t need to re-replicate historical data. Stitch will pick up where it left off and only replicate new data to the new BigQuery destination.
Replicate historical data: Stitch will clear all Replication Key values, queue a full re-replication of your integrations’ data, and replicate all historical data to your new destination. For SaaS integrations, Stitch will replicate data beginning with the Start Date currently listed in the integration’s settings.
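To make the difference between the two options concrete, here’s a simplified model of where replication resumes. This is an illustration under stated assumptions, not Stitch’s implementation: the `Integration` class, its `bookmark` field, and `replication_start` are hypothetical, and the model treats every Replication Key value as a date even though real keys can be other types.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Integration:
    name: str
    start_date: date          # the Start Date in the integration's settings
    bookmark: Optional[date]  # last saved Replication Key value, if any


def replication_start(integration: Integration, replicate_historical: bool) -> date:
    """Return the point from which data would be replicated after the switch."""
    if replicate_historical:
        # "Replicate historical data": the saved Replication Key value is cleared.
        integration.bookmark = None
    # With no saved value, replication begins at the Start Date;
    # otherwise it picks up where it left off.
    return integration.bookmark or integration.start_date


# Hypothetical integration, for illustration only.
crm = Integration("crm", start_date=date(2020, 1, 1), bookmark=date(2023, 6, 1))
print(replication_start(crm, replicate_historical=False))
print(replication_start(crm, replicate_historical=True))
```

In this model, choosing new data only resumes from the saved value, while choosing historical data falls back to the Start Date.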
Step 2: Delete the current destination
- Click Continue. A message will display asking you to confirm the removal of the current destination’s settings.
To complete the switch, Stitch must delete your current destination configuration. Note: This will not delete data in the destination itself - it only clears this destination’s settings from Stitch.
To continue with the switch, click OK to delete the current destination settings.
Step 3: Connect the new BigQuery destination
Important: Requirements for connecting BigQuery:
A user with full access to an existing Google Cloud Platform (GCP) project that uses BigQuery.
Admin permissions for BigQuery and Google Cloud Storage (GCS). This includes the BigQuery Admin and Storage Admin permissions. Stitch requires these permissions to create and use a GCS bucket to load replicated data into BigQuery.
Access to a project where billing is enabled and a credit card is attached. Even if you’re using BigQuery’s free trial, billing must still be enabled for Stitch to load data.
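If you’d like to check the admin permissions before you start, the sketch below inspects a project’s IAM policy. It assumes the BigQuery Admin and Storage Admin permissions correspond to the `roles/bigquery.admin` and `roles/storage.admin` role IDs, and that you’ve exported the policy as JSON yourself (for example, with `gcloud projects get-iam-policy PROJECT_ID --format=json`). The `missing_roles` helper and the sample member are hypothetical.

```python
import json

# Assumed role IDs for the BigQuery Admin and Storage Admin permissions.
REQUIRED_ROLES = {"roles/bigquery.admin", "roles/storage.admin"}


def missing_roles(policy: dict, member: str) -> list:
    """Return the required roles not directly granted to `member`."""
    granted = {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }
    return sorted(REQUIRED_ROLES - granted)


# Hypothetical policy export, for illustration only.
policy_json = """
{
  "bindings": [
    {"role": "roles/bigquery.admin", "members": ["user:analyst@example.com"]},
    {"role": "roles/viewer", "members": ["user:analyst@example.com"]}
  ]
}
"""
policy = json.loads(policy_json)
print(missing_roles(policy, "user:analyst@example.com"))
```

This only sees roles bound directly to the user. Access granted through a group or a broader role won’t show up here, so treat a non-empty result as a prompt to double-check in Google’s UI rather than a definitive answer.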
- On the next page in Stitch, click the BigQuery icon.
- Click Sign in with Google.
- If you aren’t already signed in to your Google account, you’ll be prompted for your credentials.
- After you sign in, you’ll see a list of the permissions requested by Stitch:
- Read/Write Access to Google Cloud Storage - Stitch requires Read/Write access to create and use a GCS bucket to load replicated data into BigQuery.
- Full Access to BigQuery - Stitch requires full access to be able to create datasets and load data into BigQuery.
- Read-Only Access to Projects - Stitch requires read-only access to projects to allow you to select a project to use during the BigQuery setup process.
- Basic Profile Information - Stitch uses your basic profile info to retrieve your user ID.
- Offline Access - To continuously load data, Stitch requires offline access. This allows the authorization token generated during the setup process to be used for more than an hour after the initial authentication takes place.
- To grant access, click the Authorize button.
- After you sign in to Google and grant Stitch access, you’ll be redirected back to Stitch.
Fill in the fields that display:
- Google Cloud Project: From the dropdown, select the project you want to connect to Stitch.
- Google Cloud Storage Location: From the dropdown, select the new location you want to use.
- Click Finish Setup.
Step 4: Delete the existing Google Cloud Storage bucket
To ensure Stitch can load data into the new BigQuery instance, you’ll need to delete the original Stitch Google Cloud Storage (GCS) bucket that’s attached to the project.
- In another browser tab, log in to the Google Cloud console.
- Use Google’s instructions to find your project’s buckets in Google’s UI.
- Locate the Stitch bucket. If you’re unsure of which bucket belongs to Stitch, reach out to Stitch support.
- Delete the bucket.
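If you prefer the command line to Google’s UI, the bucket can also be removed with the `gcloud` CLI. The sketch below only builds and prints the command so you can review it before running anything. The `bucket_delete_command` helper and the bucket name are hypothetical, and it assumes you’ve already confirmed which bucket belongs to Stitch.

```python
import shlex


def bucket_delete_command(bucket_name: str) -> list:
    """Build a gcloud command that deletes a bucket and everything in it."""
    # Expect a bare bucket name, not a gs:// URL or an object path.
    if not bucket_name or "/" in bucket_name or bucket_name.startswith("gs:"):
        raise ValueError(f"expected a bare bucket name, got {bucket_name!r}")
    return ["gcloud", "storage", "rm", "--recursive", f"gs://{bucket_name}"]


# Hypothetical bucket name, for illustration only.
cmd = bucket_delete_command("example-stitch-bucket")
print(shlex.join(cmd))
```

Deleting a bucket this way is permanent, so it’s worth printing and reading the command first, as the sketch does, rather than executing it directly.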
Step 5: Unpause integrations
After you’ve successfully connected the new BigQuery destination and deleted the original Stitch GCS bucket, unpause your integrations in Stitch.
Your data will begin replicating according to the historical data option selected in Step 1.