How do you begin combining data from cloud applications with your internal databases to gain insight into your business? In this tutorial, we'll show you how the Chartio business intelligence (BI) platform can help you deliver business insights.
To analyze data from diverse sources, you need a data warehouse that consolidates all of your data in a single location. Most businesses take advantage of cloud data warehouses such as Amazon Redshift, Panoply, Google BigQuery, or Snowflake.
You can extract data from SaaS applications and databases and load it into the data warehouse using an ETL (extract, transform, load) tool. Once the data has been migrated, data analysts can use it to create reports.
Getting started with Chartio's Automated Cloud Data Stack
If you don't already have a data warehouse, one of the best ways to get started is with Chartio's Automated Cloud Data Stack (ACDS), which wraps together three self-service data analytics platforms – Chartio, Stitch, and Panoply.
- Chartio is a cloud-based analytics platform that you access through a browser. It lets you explore and visualize your data through a GUI query tool. You can build more than a dozen kinds of charts and save them in shareable dashboards, or you can use SQL to create more complex visualizations, as we'll do here. Chartio offers native connectivity to more than 20 data sources.
- Stitch is a simple, powerful ETL service for businesses of all sizes, up to and including the enterprise. It can move data from dozens of data sources to several different data warehouses. If your organization already uses a data warehouse, you can sign up and be moving data to a data warehouse in five minutes.
- If you don't already have a data warehouse, ACDS makes it simple to get started with Panoply, a self-optimizing petabyte-scale cloud data warehouse.
Setting up your data warehouse
I used some of Stitch's real data to build a data visualization for this post. Specifically, I was curious to see how many support conversations came from each of our pricing tiers. To find out, I needed to use billing data stored in Salesforce and join it with support conversation data from Intercom to create a visualization of support conversations.
We already had a Google BigQuery data warehouse, so my first step was to set up BigQuery to hold the data I'd be replicating from Salesforce and Intercom. Following our documentation, I added to my Google user the permissions that Stitch would need to load data to BigQuery.
Using Stitch makes extracting data from a source and loading it into a data warehouse easy. I logged in to Stitch and added new integrations for Salesforce and Intercom, following the instructions in our documentation. From Salesforce I selected the Account table to replicate. Stitch's Intercom integration automatically extracts data from all available tables. From both, I chose the period from which to replicate data.
Once I'd set up my integrations in Stitch, I added BigQuery as my destination and specified a BigQuery project name. Within a few minutes, Stitch extracted the data I specified and loaded it into BigQuery.
Building a sample report
Next I set up Chartio to recognize my BigQuery data warehouse. I clicked on Data Sources, then Add a Data Source, and chose Google BigQuery. Chartio requires you to provide an authorization JSON file, and gives you a link to documentation on how to generate one from BigQuery. Enter an alias by which Chartio can refer to your data source, then click Submit.
Chartio then loads the schemas for your data source and lets you deselect any you don't care about. Click Apply when you're done.
Next, Chartio asks you to set up relationships between your schemas by specifying which columns are foreign keys. Once you've set them up, Chartio will be able to autogenerate join paths between tables when you build reports in Interactive Mode. You can click the Connect Tables button to see which fields Chartio thinks should be foreign keys. You can also set them yourself by clicking on a field name in schema. You'll need admin permissions to edit a data source's schema.
Clicking on a schema name brings up a list of its tables, and clicking on a table brings up a list of its columns, grouped into measures and dimensions. I didn't need all the tables in my data sources for this report, nor all the columns in each table, so I went through the schemas, deleting tables I wasn't interested in and toggling the visibility of columns, so I could work with just the data I cared about.
Now my data was in good shape for me to work with. I clicked on the Create a Chart button, which takes you into Chartio's Explore section and its Interactive Mode tab.
Chartio makes it easy to generate a simple chart based on a single schema by dragging the fields that you want to use as measures, dimensions, and filters into the workspace, then clicking Run Query. You can click a chart icon on the right side of the screen to specify the kind of visualization you want.
But I really wanted a more complex visualization – one that involved joining multiple tables, with multiple WHERE clauses. For that, Chartio offers SQL Mode. I spent time in the BigQuery query editing putting together the syntax that gave me exactly the data I wanted, which comprises columns from multiple tables in two different schemas, and a couple of defined fields. Chartio lets you add custom columns to tables, but even better, it lets you add whole custom tables. I clicked on the Add Custom Table button and defined a custom table in SQL to hold joined Salesforce and Intercom data. The SQL code selects only companies we have IDs for (thus not prospects) and that are not marked as deleted (thus current customers). It also consolidates clients on annual and monthly billing programs in each of our plans into a single number for each tier:
userco AS ( SELECT users.id AS user_id, company.id AS company_id, company.company_id AS stitch_id FROM `reporting01-216119.intercomintegration.users` AS users, UNNEST(companies.companies) AS company WHERE company.company_id IS NOT NULL), sf AS ( SELECT CAST(cid__c AS STRING) AS stitchid, case when tier__c LIKE '%Starter%' then 'Starter' when tier__c LIKE '%Basic%' then 'Basic' when tier__c LIKE '%Premier%' then 'Premier' when tier__c LIKE '%Custom%' then 'Enterprise' else tier__c end as tier FROM `reporting01-216119.salesforceintegration.Account` AS account WHERE account.isdeleted IS FALSE AND tier__c IS NOT NULL) SELECT conversations.id AS convo_id, count(conversations.id) AS convo_count, conversations.user.id AS convo_user_id, userco.user_id, userco.company_id, userco.stitch_id, tier FROM `reporting01-216119.intercomintegration.conversations` AS conversations JOIN userco ON conversations.user.id = userco.user_id JOIN sf ON stitchid = stitch_id WHERE conversations.user.type != "lead" GROUP BY tier, convo_id, convo_user_id, user_id, company_id, stitch_id ORDER BY convo_count
This created a table that looks like this:
This temporary table did all the heavy lifting for me. Now I could just drag the column representing conversation count onto the Measures bucket and the tier column onto Dimensions, and I specified the measure should be presented in descending order:
This chart showed the data I wanted to represent. Chartio lets you customize your visualization, and specify linked dashboards so you can drill down from one visualization into another. When you're satisfied, you can save a chart to a dashboard that other users can share.
Put Chartio and Stitch to work
So there you have it – a quick walk through the process of using an ETL tool like Stitch to move data from multiple sources into a data warehouse, then report on it using Chartio. Sign up for a free trial of Stitch and start creating your own BI reports.