Stitch provides a Calm data pipeline
Calm promotes mindfulness primarily through a mobile app that provides guided and unguided meditation on things like reducing anxiety, improving focus, and building self-esteem. The app is free, but the company offers additional features via subscription. More than 26 million people have downloaded the Calm app.
The company is always looking for ways to improve its users' experience. "Our content library is always growing, and we can only fit so much on the screen," says Mark Marcantano, a senior DevOps engineer at Calm. "We want to surface the content that's most relevant to our users. We tag all of our content into categories and topics, and track users' favorites in a PostgreSQL database."
When Marcantano joined Calm, the company's back-end engineering team was tasked with building data pipelines from that database into an Amazon Redshift data warehouse, so the team could analyze the data and make better real-time recommendations for users' meditation material. "We want all of our data sources to replicate to the data warehouse as close to real time as possible," he says.
"One of my first projects at Calm was writing a Python application to spin up an Amazon RDS instance, restore from backup, transform, and load snapshotted data into Redshift. That one project took about 30 hours of engineering time to get ready for production. We were looking to add many other data sources and services, so our small team had to find a better approach.
"We needed ETL pipelines, but we didn't have the bandwidth to dedicate someone to that full-time, as our user-facing projects are our highest priority. Our team decided to look at potential tools we could leverage to get the value we needed with less developer time.
"AWS recommended a cloud ETL platform to us, and we looked at it, but it required almost as much engineering work as writing our own custom jobs. It wasn’t a tool where we could have a non-engineer safely set up or modify a pipeline.
"Luckily I was already familiar with Stitch. I first used it when it was still called RJMetrics Pipeline, and I thought it was kind of magic how it all just worked. Back then, I was using it to replicate our main application data from MongoDB to Redshift so our analytics team could access data via SQL. We didn't have to redefine the schema to get well-structured data in our data warehouse, which was a very big win for our small team."
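For readers curious what the kind of custom job Marcantano describes (restore a snapshot to a scratch RDS instance, transform, load into Redshift) involves, here is a minimal sketch. It is not Calm's code. It assumes `rds` behaves like a `boto3.client("rds")` object, and the `extract`, `transform`, and `load` callables, along with every function and parameter name, are hypothetical placeholders for illustration.

```python
from contextlib import contextmanager


@contextmanager
def temporary_restore(rds, snapshot_id, instance_id):
    """Restore a snapshot to a scratch RDS instance and delete it afterwards.

    Assumption: `rds` behaves like boto3.client("rds").
    """
    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=instance_id,
        DBSnapshotIdentifier=snapshot_id,
    )
    try:
        # Block until the restored instance accepts connections.
        rds.get_waiter("db_instance_available").wait(
            DBInstanceIdentifier=instance_id
        )
        described = rds.describe_db_instances(DBInstanceIdentifier=instance_id)
        endpoint = described["DBInstances"][0]["Endpoint"]
        yield endpoint["Address"], endpoint["Port"]
    finally:
        # Clean up the scratch instance even if a later step fails.
        rds.delete_db_instance(
            DBInstanceIdentifier=instance_id,
            SkipFinalSnapshot=True,
        )


def run_nightly_job(rds, snapshot_id, instance_id, extract, transform, load):
    """Hypothetical orchestration: restore, extract, transform, load."""
    with temporary_restore(rds, snapshot_id, instance_id) as (host, port):
        rows = extract(host, port)              # e.g. query the restored database
        load([transform(row) for row in rows])  # e.g. write rows to Redshift
```

Even in this reduced form, the job has to handle provisioning, waiting, cleanup, and failure paths before any data moves, and each additional source would need its own extract and transform logic. That is the engineering overhead the team wanted to avoid.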
"When I got the go-ahead to trial Stitch at Calm, it took only a few days before the rest of the team saw why I had pushed to use it. We got our data loaded into Redshift and were able to build out the recommendations service more quickly than we had planned. Unlike the other ETL tool we tried, I could give Stitch credentials to our user acquisition director, and he could set up all of the integrations and fields he needed in a matter of hours. Writing custom code to do all of that would have taken up a month's worth of back-end engineering time. Using Stitch allowed us to move forward without having to hire another person."
Having a reliable cloud ETL solution will also help Calm reduce costs. "We haven't retired all of our custom jobs," Marcantano says, but if and when the team does, "infrastructure costs will go down. The extra Amazon EC2 instances that we spin up every night will go away."
Stitch is helping in other ways as well. "We're looking to optimize our support workflow," Marcantano says. "Already our support lead has been able to pull some of the ticket and response data we have in Zendesk into Mode reports to get better insights on where we can improve our support flow. We're hoping to leverage customer touchpoint data to improve user experience across the board. Stitch pushed that project forward six months compared to when our support lead expected it.
"Other teams are now able to get reports set up in Mode and other tools, creating new opportunities for data-driven decision making."
Looking ahead, Marcantano expects Calm to apply machine learning techniques to user-event data, factoring in activity not just from PostgreSQL and Zendesk, but also additional data sources such as email via the Iterable SaaS marketing platform.