Introducing Singer — simple, composable, open source ETL

I’m happy to announce the release of Singer, an open source ETL project that optimizes for simplicity and composability.

Why we built Singer

Our customers use Stitch to replicate data from more than 50 different data sources and load it into Redshift, Google BigQuery, and Postgres data warehouses. While 50 may sound like a lot, we’re still just scratching the surface of the universe of tools that companies use. This past year, Chief Martec identified more than 3,800 of these tools and put them together in an impressively crowded infographic:

This is a lot of stuff

We’re adding integrations all of the time, but we still have customers who need data from sources that we don’t yet support. In fact, many have asked us if they can build their own integrations. With Singer, anyone can build a reusable integration and run it either on their own hardware or on Stitch — taking advantage of our monitoring, alerting, credential management, and autoscaling infrastructure. We’re launching today with live open source integrations for HubSpot, Marketo, Shippo, Close, SaaSquatch, Freshdesk, Braintree, GitLab, and Outbrain.

Singer is open source ETL, built for engineers

We built Singer because we believe that ETL should be reusable, both in Stitch and in other applications. From building more than 50 integrations, we’ve learned a lot about the right ways to interact with APIs and pitfalls to avoid. Rather than a monolithic core with plugins, each Singer component is designed to be run with minimal external dependencies. We follow the Unix philosophy of simple tools that can be composed together.

Singer’s core is its specification, which contains three parts:

  • Taps, which pull data out of data sources

  • Targets, which send data to its destination

  • A JSON-based format for communication between taps and targets

You can mix and match any tap with any target. While Python is our language of choice for writing integrations, programs written in any language can conform to the Singer specification. Today, Singer has 12 taps and three targets (CSV, Google Sheets, and Stitch). All of our future integration development will take place in Singer, and I hope you join us in extending that list.

The Singer community

Singer represents the result of a lot of hard work by the Stitch team, and we have been fortunate to get advice and help from many people outside our company as well. We want to say a special thank you to Maxime Beauchemin, Umur Cubukcu, Evan Kaplan, and Todd Persen, as well as the teams from Fishtown Analytics, Facet Interactive, HubSpot, Freshdesk, Braintree, GitLab, SaaSquatch, Shippo, and Close.

Give Singer a try, and we’d love to hear your feedback.