Among modern cloud data warehouse platforms, Amazon Redshift and Microsoft Azure Synapse Analytics have a lot in common, including columnar storage and massively parallel processing (MPP) architecture. But each has unique features that could make it better suited to a particular organization's data analytics infrastructure.
Considering key differentiating factors can help you determine whether Redshift or Azure Synapse Analytics is a better data warehouse for your business. Here we compare these two cloud data destinations along several dimensions: pricing, performance, administration, data protection, and security.
### Pricing

Redshift populates its clusters with nodes, which are configurations that bundle together CPU, memory, storage, and IOPS. Redshift offers three types of on-demand nodes with different levels of performance at prices that range from $0.24 to $13.04 per hour.
AWS has a couple of other pricing options for Redshift. You can take advantage of managed storage, for which it charges per terabyte per month. Rates vary depending on the AWS region in which Redshift runs, with the lowest being $24 per terabyte per month. Managed storage is available with only one node type. You can also get reserved instances, which are nodes you pay for regardless of whether you're using them, albeit at a lower rate than on-demand pricing. They're appropriate if you have predictable ongoing workloads.
In contrast to AWS, Microsoft prices compute and storage resources separately. Its equivalent of Redshift's nodes is the data warehouse unit (DWU), which comprises CPU, memory, and IOPS but not storage. It offers a wide variety of DWUs at prices that range from $1.20 to $360 per hour.
Data storage is charged at the rate of $122.88 per terabyte per month.
Neither AWS nor Microsoft charges for data scanned by queries.
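The two pricing models come down to simple arithmetic, which a short Python sketch can make concrete. The rates are the ones quoted in this article. The function names, the 730-hour month, and the sample workload are assumptions made for illustration, and the two configurations are not meant to represent equivalent capacity.

```python
HOURS_PER_MONTH = 730  # assumption: roughly 8,760 hours per year / 12 months


def redshift_monthly_cost(node_rate_per_hour, node_count, hours=HOURS_PER_MONTH):
    """On-demand Redshift nodes bundle compute and storage into one hourly rate."""
    return node_rate_per_hour * node_count * hours


def synapse_monthly_cost(dwu_rate_per_hour, storage_tb, hours=HOURS_PER_MONTH,
                         storage_rate_per_tb_month=122.88):
    """Synapse bills compute (DWU hours) and storage (per TB per month) separately."""
    compute = dwu_rate_per_hour * hours
    storage = storage_tb * storage_rate_per_tb_month
    return compute + storage


if __name__ == "__main__":
    # Hypothetical workload: two of the cheapest Redshift nodes versus the
    # cheapest DWU tier with 1 TB stored, both running for the whole month.
    print(f"Redshift: ${redshift_monthly_cost(0.24, 2):,.2f}")
    print(f"Synapse:  ${synapse_monthly_cost(1.20, 1.0):,.2f}")
```

Because Synapse separates compute from storage, pausing or scaling down compute lowers only the first term of its estimate; the storage charge continues either way.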
### Performance

Thanks to their ability to scale up and down, both Redshift and Azure Synapse Analytics perform well under various load levels. You should run benchmarks using your own data, but you'll likely find that both platforms can handle most companies' workloads with excellent performance.

### Administration, management, maintenance

Redshift and Azure Synapse Analytics both require a reasonable amount of attention on the part of administrators.

Redshift data warehouses are made up of computing resources called nodes, which are organized into a group called a cluster. Each cluster runs an Amazon Redshift engine and contains one or more databases. Administrators can run commands to modify and tune clusters. Scaling the number of nodes up or down requires administrator action.

For Microsoft's part, while other Azure services can be set up to autoscale, scaling an Azure Synapse Analytics data warehouse requires administrator attention. Administrators can also partition data structures to improve performance and do other kinds of performance optimization.

### Data protection

To save data in case of accidental deletion, Redshift automatically takes incremental snapshots that track changes to the cluster since the previous automated snapshot, and you can also take manual snapshots. By default Redshift takes a snapshot about every eight hours or following every 5 GB per node of data changes, whichever comes first, but administrators have control over scheduling. Redshift stores snapshots internally in Amazon S3 by using an encrypted Secure Sockets Layer (SSL) connection. Amazon provides free storage for snapshots in an amount equal to the storage capacity of the backed-up cluster. If you reach the free snapshot storage limit, you incur charges for additional storage at your normal rate. You can set the retention period for both automated and manual snapshots.
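The default Redshift snapshot rule described above (about every eight hours or every 5 GB per node of changes, whichever comes first) can be written as a small predicate. This is only a sketch of the rule as stated here, not Redshift's actual scheduler; the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

# Defaults as described in the article; administrators can change the schedule.
SNAPSHOT_INTERVAL = timedelta(hours=8)
CHANGE_THRESHOLD_GB_PER_NODE = 5.0


def snapshot_due(last_snapshot, now, changed_gb_per_node):
    """Return True when either default trigger has been reached.

    The two conditions are joined with `or` because the rule fires on
    whichever threshold is crossed first.
    """
    time_trigger = (now - last_snapshot) >= SNAPSHOT_INTERVAL
    change_trigger = changed_gb_per_node >= CHANGE_THRESHOLD_GB_PER_NODE
    return time_trigger or change_trigger


if __name__ == "__main__":
    last = datetime(2024, 1, 1, 0, 0)
    # Three hours after the last snapshot, but 6 GB per node has changed.
    print(snapshot_due(last, last + timedelta(hours=3), changed_gb_per_node=6.0))
```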
If you need to restore a cluster from a snapshot, Redshift creates a new cluster, then restores all the databases from the snapshot data. The process requires a fair degree of manual effort.

Azure Synapse takes automatic snapshots of the data warehouse throughout the day to create restore points that are available for seven days. You can also manually trigger as many as 42 user-defined snapshots. Snapshot storage counts toward storage allotment for billing purposes. You can restore the data warehouse from any snapshot by issuing a restore command.
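To make the Azure Synapse limits concrete, here is a minimal Python sketch of the two rules mentioned above: automatic restore points remain available for seven days, and at most 42 user-defined snapshots can exist. The helper names are hypothetical, and the code models only the limits as stated in this article, not any Azure API.

```python
from datetime import datetime, timedelta

RESTORE_POINT_RETENTION = timedelta(days=7)   # automatic restore points
MAX_USER_DEFINED_SNAPSHOTS = 42               # manually triggered snapshots


def available_automatic_restore_points(created_at, now):
    """Keep only automatic restore points still inside the seven-day window."""
    return [
        t for t in created_at
        if timedelta(0) <= (now - t) <= RESTORE_POINT_RETENTION
    ]


def can_create_user_defined_snapshot(existing_count):
    """A new user-defined snapshot is allowed only below the 42-snapshot cap."""
    return existing_count < MAX_USER_DEFINED_SNAPSHOTS


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    points = [now - timedelta(days=d) for d in (1, 6, 8)]
    print(len(available_automatic_restore_points(points, now)))
    print(can_create_user_defined_snapshot(41))
```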
### Security

Both Redshift and Azure Synapse Analytics use AES encryption on data at rest and support customer-managed keys. Neither turns on encryption by default. Both rely on roles for providing access to resources.
For authentication, AWS allows federated user access via AWS Directory Service, while Azure Synapse Analytics can integrate with Azure Active Directory. Both support multifactor authentication (MFA). Azure Synapse offers OAuth 2 for authorized account access without sharing or storing user login credentials; Redshift lacks OAuth support.
In Redshift, permissions apply to tables as a whole. Azure Synapse Analytics supports granular permissions on schemas, tables, views, individual columns, procedures, and other objects.
Both data warehouses also provide some measure of network security. AWS lets you launch a Redshift cluster in an Amazon Virtual Private Cloud (VPC). Microsoft offers a similar approach with what it calls virtual networks.
Redshift and Azure Synapse Analytics satisfy compliance requirements for HIPAA, ISO 27001, PCI DSS, SOC 1 Type II, and SOC 2 Type II, among others.
Overall, both Redshift and Azure Synapse Analytics have a lot going for them. You should test with your own data, ingesting data and running reports, to determine which cloud data warehouse better suits your organization. Opting for one over the other involves identifying which solution makes the most sense for your data strategy. Like most modern cloud data warehouse platforms, Redshift and Azure Synapse Analytics offer free trials and proof-of-concept support to help businesses get firsthand experience with the ways their solutions deliver value.
Successful businesses that depend on sound intelligence need a high-performing cloud data warehouse. On the road to better business intelligence, both Redshift and Azure Synapse Analytics are prime destinations. No matter which one you select as your data warehouse, getting all of your organization's data ingested is critical to providing the background you need for better business intelligence.
Stitch is already in the express lane with a simple, powerful approach to ETL that pulls your data from more than 100 different sources. Set up a free trial now.