Not long ago, setting up a data warehouse meant purchasing an expensive, specially designed hardware appliance and running it in your data center. In contrast, Snowflake is a data warehouse provided as software-as-a-service (SaaS). Former Snowflake CEO Bob Muglia wrote on the company blog that Snowflake is “disrupting the data warehouse industry on the way to enabling the data economy.”
So, what’s different about Snowflake?
What is a Snowflake data warehouse?
Snowflake is a data warehouse built on top of the Amazon Web Services or Microsoft Azure cloud infrastructure. There’s no hardware or software to select, install, configure, or manage, so it’s ideal for organizations that don’t want to dedicate resources to the setup, maintenance, and support of in-house servers. Data can be moved into Snowflake easily with an ETL solution such as Stitch.
But what sets Snowflake apart is its architecture and data sharing capabilities. The Snowflake architecture allows storage and compute to scale independently, so customers can use and pay for storage and computation separately. And the sharing functionality makes it easy for organizations to quickly share governed and secure data in real time.
Snowflake architecture: the real differentiator
Remember when purchasing a cable television service meant the infrastructure and the content were a package deal? Today, those things are distinct (but integrated) and, for the most part, people have more control over what they use and how they pay for it.
Snowflake’s architecture allows similar flexibility with big data. Snowflake decouples the storage and compute functions, so organizations that have high storage demands but less need for CPU cycles, or vice versa, don’t have to buy an integrated bundle that charges them for both. Users can scale up or down as needed and pay for only the resources they use. Storage is billed per terabyte stored per month, and computation is billed on a per-second basis.
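To make the separate billing concrete, here is a minimal sketch of the arithmetic. The rates, the `Rates` class, and the `monthly_bill` function are hypothetical and chosen only for illustration; actual Snowflake prices depend on your edition, region, and contract, and this sketch ignores any minimum charges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical prices, not Snowflake's published rates."""
    storage_per_tb_month: float  # cost per terabyte stored for one month
    compute_per_second: float    # cost per second a virtual warehouse runs


def monthly_bill(tb_stored: float, compute_seconds: int, rates: Rates) -> dict:
    """Compute storage and compute charges independently, then add them."""
    storage_cost = tb_stored * rates.storage_per_tb_month
    compute_cost = compute_seconds * rates.compute_per_second
    return {
        "storage": storage_cost,
        "compute": compute_cost,
        "total": storage_cost + compute_cost,
    }


# Assumed example rates for illustration only.
rates = Rates(storage_per_tb_month=25.0, compute_per_second=0.001)

# Lots of data, one hour of queries in the month.
archive_heavy = monthly_bill(tb_stored=100, compute_seconds=3_600, rates=rates)

# Little data, one hundred hours of queries in the month.
query_heavy = monthly_bill(tb_stored=1, compute_seconds=360_000, rates=rates)

print(archive_heavy)
print(query_heavy)
```

Because the two charges are computed separately, the storage-heavy workload pays almost nothing for compute and the query-heavy workload pays almost nothing for storage.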
In fact, the Snowflake architecture consists of three layers, each of which is independently scalable: storage, compute, and services.
The database storage layer holds all data loaded into Snowflake, including structured and semistructured data. Snowflake automatically manages all aspects of how the data is stored: organization, file size, structure, compression, metadata, and statistics. This storage layer runs independently of compute resources.
The compute layer is made up of virtual warehouses that execute the data processing tasks required for queries. Each virtual warehouse (or cluster) can access all the data in the storage layer but works independently, so the warehouses do not share, or compete for, compute resources. This enables nondisruptive, automatic scaling: compute resources can scale while queries are running, without the need to redistribute or rebalance the data in the storage layer.
The cloud services layer uses ANSI SQL and coordinates the entire system. It eliminates the need for manual data warehouse management and tuning. Services in this layer include:
- Infrastructure management
- Metadata management
- Query parsing and optimization
- Access control
5 Snowflake benefits for your business
Snowflake is built specifically for the cloud, and it’s designed to address many of the problems found in older hardware-based data warehouses, such as limited scalability, data transformation issues, and delays or failures due to high query volumes. Here are five ways a Snowflake data warehouse can benefit your business.
Performance and speed
The elastic nature of the cloud means if you want to load data faster, or run a high volume of queries, you can scale up your virtual warehouse to take advantage of extra compute resources. Afterward, you can scale down the virtual warehouse and pay for only the time you used.
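As a rough sketch of that scale-up, load, scale-down pattern, the helper below builds the SQL statements you might send to Snowflake. The warehouse, table, and stage names (`load_wh`, `orders`, `orders_stage`) and both function names are hypothetical; the code only produces strings and does not connect to Snowflake.

```python
# A subset of Snowflake warehouse size names, kept short for illustration.
SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE")


def resize_statement(warehouse: str, size: str) -> str:
    """Return an ALTER WAREHOUSE statement that changes the warehouse size."""
    size = size.upper()
    if size not in SIZES:
        raise ValueError(f"unknown warehouse size: {size}")
    # Simplified safety check: accept plain identifiers only.
    if not warehouse.isidentifier():
        raise ValueError(f"invalid warehouse name: {warehouse}")
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}';"


def bulk_load_plan(warehouse: str, busy_size: str, idle_size: str, load_sql: str) -> list:
    """Scale up, run the load, then scale back down."""
    return [
        resize_statement(warehouse, busy_size),
        load_sql,
        resize_statement(warehouse, idle_size),
    ]


plan = bulk_load_plan(
    warehouse="load_wh",                                # hypothetical warehouse
    busy_size="large",
    idle_size="xsmall",
    load_sql="COPY INTO orders FROM @orders_stage;",    # hypothetical table and stage
)

for statement in plan:
    print(statement)
```

In practice you would run these statements through whichever Snowflake client you already use, and you may prefer to suspend the warehouse entirely when it is idle.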
Storage and support for structured and semistructured data
You can combine structured and semistructured data for analysis and load it into the cloud database without the need for conversion or transformation into a fixed relational schema first. Snowflake automatically optimizes how the data is stored and queried.
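The idea of analyzing semistructured data without first forcing it into a fixed schema can be illustrated in plain Python. This is a conceptual sketch only, not how Snowflake stores or queries data; the sample rows and the `get_path` helper are made up for the example.

```python
import json
from typing import Any

# Each row has a structured column (order_id) and a JSON payload whose
# shape differs from row to row.
rows = [
    {"order_id": 1, "payload": '{"customer": {"name": "Ada", "tier": "gold"}, "items": 3}'},
    {"order_id": 2, "payload": '{"customer": {"name": "Lin"}, "coupon": "SPRING"}'},
]


def get_path(document: Any, path: str, default: Any = None) -> Any:
    """Walk a dotted path through nested dicts, returning default if a key is missing."""
    current = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Combine the structured column with fields pulled from the JSON at read time.
report = []
for row in rows:
    document = json.loads(row["payload"])
    report.append(
        {
            "order_id": row["order_id"],
            "customer": get_path(document, "customer.name"),
            "tier": get_path(document, "customer.tier", "standard"),
        }
    )

print(report)
```

The second row has no `tier` field, yet it loads and is queried alongside the first; the missing value is handled when the data is read rather than by rejecting the row at load time.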
Concurrency and accessibility
With a traditional data warehouse and a large number of users or use cases, you could experience concurrency issues (such as delays or failures) when too many queries compete for resources.
Snowflake addresses concurrency issues with its unique multicluster architecture: Queries from one virtual warehouse never affect the queries from another, and each virtual warehouse can scale up or down as required. Data analysts and data scientists can get what they need, when they need it, without waiting for other loading and processing tasks to complete.
Seamless data sharing
Snowflake’s architecture enables data sharing among Snowflake users. It also allows organizations to seamlessly share data with any data consumer — whether they are a Snowflake customer or not — through reader accounts that can be created directly from the user interface. This functionality allows the provider to create and manage a Snowflake account for a consumer.
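A provider-side sharing setup might look roughly like the statements generated below. This is a sketch under assumptions: all object and account names are hypothetical, the identifier check is deliberately simplistic, and the code only builds SQL strings. Creating a reader account for a consumer who is not a Snowflake customer is a separate step that is not shown here.

```python
def share_statements(share: str, database: str, schema: str, table: str,
                     consumer_account: str) -> list:
    """Build the statements that expose one table to one consumer account."""
    for name in (share, database, schema, table, consumer_account):
        # Simplified check: plain identifiers only, no quoting or dotted names.
        if not name.isidentifier():
            raise ValueError(f"invalid identifier: {name}")
    return [
        f"CREATE SHARE {share};",
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share};",
        f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share};",
        f"ALTER SHARE {share} ADD ACCOUNTS = {consumer_account};",
    ]


statements = share_statements(
    share="sales_share",          # hypothetical share name
    database="analytics",         # hypothetical database
    schema="public",
    table="daily_sales",          # hypothetical table
    consumer_account="partner1",  # hypothetical consumer account
)

for statement in statements:
    print(statement)
```

Note that the consumer is granted read access to the provider's table; no copy of the data is made by these statements.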
Availability and security
Snowflake is distributed across availability zones of the platform on which it runs (either AWS or Azure) and is designed to operate continuously and tolerate component and network failures with minimal impact on customers. It is SOC 2 Type II certified, and additional levels of security are available, such as support for PHI data for HIPAA customers and encryption across all network communications.
Connect your ecosystem
If you have a diverse data ecosystem or an IoT solutions database, you’ll want a cloud-based data warehouse that offers nearly infinite expansion, scalability, and ease of use. And you’ll need a data integration solution that is optimized for cloud operation. Using Stitch to extract and load data makes migration simple, and users can run transformations on data stored within Snowflake.
As a Snowflake Partner, Stitch makes it easy for you to set up your Snowflake data warehouse. The free 14-day trial for new users lets you move an unlimited amount of data from more than 90 data sources, including popular platforms such as Google Analytics and Google Ads, Shopify, Salesforce, and Stripe.