Using SQL to uncover insights from your Salesforce data

Salesforce is one of the most popular customer relation management tools in the world, sporting a client list of over 100K. It’s listed on the NYSE under the symbol CRM. So if so many companies are using it, and its information is so core to a company’s business, why is its data so difficult to work with?

This was the topic of yesterday’s webinar, led by David Wallace, Sr. BI Analyst at RJMetrics, and Benn Stancil, Chief Analyst at Mode. David and Benn explained in detail the different methods for extracting Salesforce data, the quickest way to start running SQL queries on that raw data, and took us through examples of the rich insights you can uncover when you have the ability to join CRM data with other data sources.

You can read the recap below, or use the links at the bottom of the page to download the presentation deck, or watch the full video.

The 3 analysis options for Salesforce data

In working with Salesforce data, there are three primary methods for analysis: Salesforce reports, data export, and sending the data to a central warehouse.

1. Limitations of Salesforce Reports

The most popular of these methods is using the out-of-the-box analytics functionality of reports, done directly in Salesforce.

While the reporting functionality certainly has its uses (displaying opportunities by quarter or showing the current sales pipeline are extremely simple) when a company graduates to more complex analysis, Salesforce reports cease to be a viable option.

And the reason for this is simple: the main value prop of Salesforce isn’t to help you answer complex business questions or discover trends, it’s to act as a robust CRM tool. Salesforce itself is even extremely transparent about its analytical shortcomings, providing a document that contains all of the limitations of the current reporting functionality.

And not only are reporting possibilities limited, but when data lives exclusively in Salesforce, teams that don’t have Salesforce accounts can’t access the information, meaning you lose the ability to tie this data to the rest of your company’s data sources. Without the ability to incorporate data from different sources, you fail to obtain the bigger insight you’re looking for.

2. Exporting Salesforce data

If companies can’t do the analysis they’d like within Salesforce, they often opt to export those reports as CSV files that are opened in a spreadsheet application. While this method certainly opens up the possibilities of analysis, it doesn’t scale well. At a certain size of data or a certain amount of VLOOKUPs, Excel starts to slow down and crash.


Also, data exports take time; sometimes a whole day. Dedicating an employee to constantly running reports takes them off something else more worthwhile.

3. Sending your data to a central warehouse

The third, and most efficient way to analyze Salesforce data is by sending it to a centralized data warehouse and running queries directly on that warehouse.

Sending Salesforce data to a central warehouse

This approach solves a few problems:

  1. It opens up infinite possibilities for analysis — The limitations of both Salesforce and spreadsheets don’t exist when querying a database. Your analysis is only limited to your creativity.

  2. It takes data out of silos — This allows you to begin to tie all of your data sources together to power more complex analytics.

  3. It saves employee time — That employee who was tasked with exporting Salesforce reports and analyzing data all day in spreadsheets can spend their time on more high-value data work.

There are two ways you can send your data to a data warehouse: build your own data infrastructure, or purchase a product that will do it for you. Many companies have chosen “build” over “buy,” but more often than not, they underestimate the amount of effort and resources needed to set up a big data stack, and perhaps more importantly, maintain it.

In our work building data pipelines, and talking to other people who have done this, we’ve identified seven core challenges of building data infrastructure:

  1. Connections — Every API is unique and has its own challenges to develop for.

  2. Accuracy — Minor slipups in data processing or schema evolution can easily lead to bad data.

  3. Latency — Many data sources are optimized for receiving data in chunks rather than a stream. Solving these problems means increased build time and maintenance risk.

  4. Scalability — Your pipeline needs to be able to scale to accommodate for dramatic increases in data, otherwise you’ll quickly outgrow it and will need to rebuild.

  5. Flexibility — If you’re not prepared to accommodate different data types, and quickly address changes as they happen, your pipeline will break.

  6. Monitoring — When data starts flowing from such a large number of data sources, failures are inevitable. Teams have to dedicate engineering efforts just to build monitoring tools.

  7. Maintenance — The data pipeline project doesn’t typically have a real end. All of the other factors are often only solved through an ongoing, dedicated maintenance team.

If you want to go deeper on any of these challenges, check out this blog post. If you’d like to avoid this headache and get the benefits of having your Salesforce data in a data warehouse immediately, try Stitch.

6 examples of CRM analysis with SQL

Once a company pipes their data into a data warehouse like Amazon Redshift, they can then set up Mode on top of Redshift to run SQL queries and visualize those results.

Data warehouse to Mode

To understand some of the capabilities of SQL, it’s often easiest to simply look examples of companies running analysis and the actionable insights they find.

1. Sales cycles

Let’s say a company needs to see how much new business they close each month, but sales also wants to see how long it took to close the deals and would like to know if their sales cycles are getting shorter or longer.

The first of these analyses are easy to run in Salesforce by using filters to show only “new business” opportunities with a “closed won” stage, but once you try to generate cohorts by the amount of time it took to close each deal, the reporting tool doesn’t support that analysis. When using SQL, this analysis can be run by simply adding an additional query row.

SQL query

The second line on the right using the DATEDIFF query calculates the time between when and opportunity was created and closed, which allows you to cohort my opportunities by age. The result is a chart that shows the deals cohorted by length.

Mode chart

From this report, we can quickly see that a lot of deals closed quickly in September and October. And as before, this is all live against the database.

2. Week-over-week performance

Below are two Mode charts we use to analyze our own data. The first measures the top of the funnel sales activity. This chart allows us to quickly observe how the number of calls our account development representatives place in a given day differs week over week, allowing us to identify shortcomings as they happen and to act on them before it really becomes too late.

Mode chart

3. Measuring performance against goals

The second chart goes one step further down the sales funnel, tracking our opportunity created goal progress.

Mode chart

The additional “goal” layer we added here makes this data really valuable in accurately measuring our sales efforts.

4. Prioritizing clients by product usage

Probably the biggest benefit of using a centralized database is the ability to run analysis with data from multiple sources. The chart below looks at the number of times each client logs in, the date they last logged in, then sees whether they are being contacted enough based on their activity.

Mode report

The chart shows that overall, the sales team is doing a good job contacting the top two users. However, the most contacted user, Christopher, is far from the most active. This isn’t necessarily wrong, but it shows that in this account, we might have an opportunity to talk to other champions like James. We can also see that Matthew used to be a champion, but hasn’t logged in a month. That might be worth investigating too.

5. Funnel analysis

Below is another chart we use daily; it shows the number of clients at each stage in our signup process, and how the conversion rate differs from stage to stage.

Mode graph

This data and visualization allows our team to see which steps are acting as bottlenecks in the process, and that was only possible by combining a variety of different data sources, including Redshift via Snowplow, Pardot, and various MySQL and Postgres databases.

6. Analyzing your self-service score

This last example doesn’t actually use Salesforce data, but instead combines data from Zendesk and Snowplow to calculate the ratio of the unique people visiting our help center content and attempting to service themselves to the unique number of people filing support tickets.

Mode chart

This chart is a fantastic indicator of how effective our help center is in getting users to find solutions themselves instead of filing support tickets.

Keep learning

If you’re interested in hearing a more in-depth analysis of how Stitch and Mode use Salesforce data to uncover insights, you can download the presentation slide deck, or watch the full recording below:

Salesforce & SQL: Get More from Your CRM Data Using the Tools You Love

To sign up for a free 14-day trial of Stitch, head over to the Stitch website.