How NewsCred built its analytics stack

NewsCred, a content marketing company, has a team of 200 people in seven offices around the world. Its client base spans 70 countries, and includes Pepsi, Visa, Dell, and HP. And its platform is designed to serve the needs of a global content marketing team.

Like many startups, NewsCred takes pride in having a lean engineering team. All of its engineers in both New York City and Dhaka, Bangladesh, are 100% dedicated to building the product. That’s why, when it became apparent last year that NewsCred needed data on marketing and product performance, it was never an option for the tech team to take on a data engineering project.

At the time, Product Manager Tom Lowe had never built an analytics stack, but he recognized that the number one priority was self-service — every team at the company needed to have access to data without any barriers. Tom set out to build an internal analytics platform that would deliver these insights without substantial engineering time, and without major distraction from his core product job. Here’s what he learned.

Three approaches to building an analytics stack

1. Out-of-the-box business intelligence

The NewsCred team was already set up with a point-and-click business intelligence tool. The tool was simple to set up, and promised to aggregate and visualize all of the company’s data, but as the questions became more complex, Tom ran into limitations. “We needed to monitor median activity metrics across clients and correlate product activity with commercial information like the industry of the client, how much they’re paying us, and what products they’ve purchased.”

The questions were just too difficult for an out-of-the-box solution. Tom had to go outside the tool and spend hours downloading CSV files, managing Google spreadsheets, and running far too many VLOOKUPs. An out-of-the-box business intelligence tool just wasn’t flexible enough to do the type of analysis NewsCred needed.

2. Build a data warehouse with bespoke scripts

At this point, NewsCred’s CTO decided it was time to stop relying on third parties to store the company’s data — it was time to take ownership by building a data warehouse. Centralizing all of their data in one place was the only way to get the control they needed. After considering their options, they chose Amazon Redshift.

Redshift is perfect for NewsCred’s use case; it’s secure, stable, and optimized for analyzing massive amounts of data. “As soon as we had our data in Redshift, we were able to start writing the SQL for complex queries that we couldn’t answer with a basic business intelligence platform,” Tom said. “We were exploring product questions like how long does it take for a client to go from first login to hitting publish? Which areas of the product is a client most invested in? Then we were joining these user activity metrics to Salesforce data to analyze behavior by account.”
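The kind of query Tom describes — joining product activity events to account data to measure time from first login to first publish — can be sketched in SQL. The snippet below is a minimal illustration, not NewsCred’s actual schema: the `events` and `accounts` tables and their columns are invented, and SQLite stands in for Redshift so the example runs anywhere.

```python
import sqlite3

# Toy schema standing in for the warehouse tables Tom describes:
# per-account activity events plus a Salesforce-style accounts table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (account_id TEXT, event TEXT, occurred_at TEXT);
CREATE TABLE accounts (account_id TEXT, industry TEXT);
INSERT INTO events VALUES
  ('acme', 'login',   '2016-03-01'),
  ('acme', 'publish', '2016-03-08'),
  ('zeta', 'login',   '2016-03-02'),
  ('zeta', 'publish', '2016-03-04');
INSERT INTO accounts VALUES ('acme', 'Retail'), ('zeta', 'Finance');
""")

# Days from each account's first login to its first publish,
# joined to commercial data (here, just industry).
rows = conn.execute("""
SELECT a.account_id, a.industry,
       julianday(MIN(CASE WHEN e.event = 'publish' THEN e.occurred_at END))
     - julianday(MIN(CASE WHEN e.event = 'login'   THEN e.occurred_at END))
       AS days_to_publish
FROM events e
JOIN accounts a ON a.account_id = e.account_id
GROUP BY a.account_id, a.industry
ORDER BY a.account_id
""").fetchall()

for account, industry, days in rows:
    print(account, industry, days)
```

On Redshift the same shape of query would use its own date functions (e.g. `DATEDIFF`) instead of SQLite’s `julianday`, but the join-and-aggregate pattern is identical.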

Of course, tapping into the analytical powers of Redshift requires actually getting your data into Redshift. Tom solved this problem by writing numerous scripts to connect the data sources to the data warehouse. It didn’t take long for the cracks to start appearing. “We had automated so much, but I was still having issues with my scripts. They were taking too long, and would crash. We didn’t have the data engineering resources to build set-and-forget solutions.” On top of that, Tom was creating a bottleneck. Every time a script broke, someone was knocking on his door to fix it.
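To make the “set-and-forget” gap concrete: a production pipeline needs hardening — retries, backoff, alerting — that one-off scripts rarely get. The sketch below is hypothetical (the `extract`/`load` callables and the flaky source are invented for illustration); it shows only the retry-with-backoff wrapper that a crashing script would be missing.

```python
import time

def sync_with_retry(extract, load, max_attempts=3, base_delay=1.0):
    """Run one extract-and-load pass, retrying with exponential
    backoff. Returns the attempt number that succeeded."""
    for attempt in range(1, max_attempts + 1):
        try:
            load(extract())
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise  # out of retries: a real pipeline would alert here
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulate a flaky source that fails twice before succeeding.
calls = {"n": 0}
def flaky_extract():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("source timed out")
    return [{"id": 1}]

loaded = []
attempts = sync_with_retry(flaky_extract, loaded.extend, base_delay=0.01)
print(attempts, loaded)
```

Multiply this by every data source, add schema changes and API rate limits, and the maintenance burden Tom describes follows quickly — which is exactly what a managed pipeline absorbs.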

3. Build a data warehouse with Stitch

Tom’s third (and final) approach to solving this problem was to own data management, but to replace his custom data integration scripts with RJMetrics Pipeline, the platform that became Stitch. Stitch connected to NewsCred’s existing data sources and began streaming that data to Redshift, essentially replacing Tom’s self-described “sketchy code.”

“I set up Stitch really quickly,” Tom said. “Our devops team whitelisted an IP address, then I set up my credentials to Salesforce and Zendesk. Within an hour we were streaming all of our data into Amazon Redshift.”

The analytics layer

To visualize the data in Amazon Redshift, NewsCred uses Periscope Data. “Periscope is great because you write a SQL framework and then you’re able to use an interface on top of that.” This functionality makes it easy for an analyst like Tom to do the heavy lifting, but then hand it over to a business user to do additional filtering or data aggregation.

With this analytics stack in place, Tom was finally able to go back to his real job — building the NewsCred product. He lists three core benefits of this analytics stack:

  1. Automated reporting: Tom sends out a weekly all-company email that tracks high-level KPIs related to product usage, which he says gives his team the ability to see whether their customers are behaving as they expect.

  2. Data independence: Redshift plays beautifully with SQL, a language that can be picked up by anyone with general analytics skills. This provides Tom’s team with a high degree of what Tom calls data independence. He spends some time onboarding people to his process, but he’s not a data gatekeeper. “Anybody can get the information that they need without going through anybody else. As soon as you get between the person who needs the information and the information, then you’ve got inefficiency.”

  3. No strain on developer resources: Best of all, this requires zero dev time. Any analyst can start building reports and visualizations on top of the data without waiting for an engineer. “Since we got all of our data into Redshift, it’s been gravy.”
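The weekly KPI report in item 1 can be sketched in a few lines. The metric names and figures below are invented placeholders; in practice the numbers would come from SQL queries against the warehouse, and the body would be handed to whatever mailer the team uses.

```python
from datetime import date

# Toy weekly usage figures; in practice these would be the
# results of SQL queries against the warehouse.
kpis = {
    "active accounts": 132,
    "articles published": 418,
    "median days from login to publish": 4,
}

def weekly_kpi_email(kpis, as_of):
    """Format a plain-text KPI summary for an all-company email."""
    lines = [f"Product KPIs for week ending {as_of.isoformat()}", ""]
    lines += [f"- {name}: {value}" for name, value in kpis.items()]
    return "\n".join(lines)

body = weekly_kpi_email(kpis, date(2016, 6, 3))
print(body)
```

Because the heavy lifting happens in the warehouse, regenerating this report each week is just re-running the queries — no engineer in the loop.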

Tom set out with a simple goal: get everybody in the organization the information they need, without needing to invest time that could be better spent working directly on the product. His experience is a roadmap for anyone looking to solve a similar problem. “If I had known how easy it was going to be to get Stitch pushing my data to Redshift, I wouldn’t have spent all that time writing scripts. I would have set this up earlier.”

Sign up for Stitch for free, and you can be syncing all of your data sources to Redshift or your favorite data warehouse in minutes.

Image credit: Alan O'Rourke