PostgreSQL and MySQL are the two most popular databases that have come out of the open source community. PostgreSQL is a highly scalable open source, object-relational database management system (RDBMS) that's designed to handle a range of operations and perform well with complex datasets. MySQL, which comes in both open source and proprietary versions, excels in read-heavy operations, and provides support for multiple storage engines.
Both systems have their benefits. Which is the better choice for your organization? The answer depends on your specific requirements and goals. Let’s compare PostgreSQL and MySQL in performance, architecture, scalability, and other areas.
|Extensible object-relational database management system; multiprocess
|Relational database management system; single process
|High-performance in situations requiring high read and write speeds
|High-performance in web-based applications and situations requiring high read speeds
|Only certain storage engines are ACID-compliant
|Native SSL support
|Largely SQL-conformant; meets 160 out of 179 mandatory features
|Programming language support
|Written in C and C++; supports C/C++, R, Delphi, Java, Go, Lisp, Erlang, PHP, Perl, and Node.js
|Asynchronous, cascade, and synchronous
|Asynchronous, synchronous, and semisynchronous
|Primary and secondary configurations, synchronous replication, and clustering
|Primary and secondary configurations and clustering
|Active, open source community with access to free resources
|Vendor-backed community with access to free resources; additional support offered with paid versions
Let's look at each of these factors in detail.
The main differences between PostgreSQL and MySQL lies within each database's architecture. PostgreSQL is an object-relational database management system that emphasizes extensibility with easy integration of new data sources and extensions. MySQL is a conventional relational database that does not provide as strong of a level of support for new features, extensions, and data source integration as PostgreSQL.
Another significant difference revolves around the way each database runs on a server: PostgreSQL uses multiple processes, while MySQL runs in a single process.
PostgreSQL starts a new process with its own memory allocation for each connection it establishes, so it requires a lot of memory on systems with a high number of client connections. MySQL uses a single process and maintains one thread (or path of execution) per connection, which results in less memory pressure.
This difference in memory pressure becomes increasingly important when using PostgreSQL and MySQL at scale. Accounting for system performance with scale using PostgreSQL will require more capacity planning than with MySQL.
Databases use storage engines to read, write, and modify data. MySQL's default storage engine is InnoDB, which emphasizes ACID compliance and high performance. MySQL supports eight alternative storage engines:
MySQL developers can specify per table what storage engine to use.
PostgreSQL supports only one storage engine, but is developing a new engine called zheap that's designed to allow in-place updates and enable the reuse of space, to help control database bloat.
PostgreSQL and MySQL tend to demonstrate their best performance characteristics under different conditions.
PostgreSQL offers high speeds for both reading and writing. It provides concurrency without read/write locks, which allows PostgreSQL to process and adjust multiple read and write threads concurrently.
Overall, PostgreSQL's scalability and high performance makes it valuable in large, complicated database projects in which high read and write speeds are essential. When properly architected, PostgreSQL doesn't have many performance limitations.
MySQL excels in web-based applications because it performs well in situations that require high read speeds. Since MySQL tends to have performance issues when dealing with complicated queries, it's better suited for high-concurrency situations where the jobs are read-intensive. MySQL is also beneficial when used with business intelligence applications, which require more reading than writing. However, the database is not ideal as a platform for BI or data analytics because it lacks a columnar data store, a storage architecture that allows for efficient analytical querying.
ACID stands for atomicity, consistency, isolation, and durability, which are properties attributed to database transactions.
ACID compliance is a safeguard against losing or miscommunicating data across a system. PostgreSQL is ACID-compliant by design, whereas MySQL supports ACID operations in its default storage engine, InnoDB, as well as NDB Cluster.
Both PostgreSQL and MySQL use access control lists (ACL) for user permissions for databases, schemas, and tables. Both support full-disk, file-based, table-level, column-level, and instance-level encryption for data at rest.
For data in motion, PostgreSQL offers native SSL support for encrypting communication between servers via different connections, while MySQL uses TLS, which it claims provides stronger encryption algorithms.
The American National Standards Institute (ANSI) maintains the SQL language standard, and updates it every few years. The current standard is SQL:2016, and it sets out functions and syntax that every SQL dialect should provide. No version of SQL is 100% ANSI-standard, but some are closer than others. The closer a database follows the SQL standard, the easier it is for developers familiar with other DBMSes to become proficient in it.
PostgreSQL is largely SQL-conformant. Out of 179 mandatory features, PostgreSQL meets 160. MySQL is only partially conformant. Developers at Oracle, which owns MySQL, are working toward a goal of full SQL compliance.
Replication is the process of copying data from one computer or server to another, or writing data to multiple servers simultaneously. PostgreSQL and MySQL handle this process differently.
PostgreSQL can perform asynchronous, cascade, and synchronous replication, depending on what the use case requires. Asynchronous replication allows users to create read-only copies from a master database to establish a failsafe backup and distribute workloads. Cascade replication allows a master database to replicate to multiple slave databases. With synchronous replication, data written to a master database is copied automatically to slave databases and all instances perform transactions simultaneously, which reduces the risk of losing a commit.
PostgreSQL also offers single master to one standby, single master to multiple standbys, bidirectional replication, and logical log streaming replication.
MySQL relies on one-way asynchronous replication to copy data between databases. It also supports a single master instance and multiple slave instances, circular replication, and master-to-master replication. Replication might involve data from a single table, data from only certain databases, or data from all available databases.
Clustering is a process that uses shared storage to replicate the same set of data to all nodes in an environment. The main objective of using clustering is to ensure fault tolerance, a database’s ability to continue providing uninterrupted service in the event of a failure in the system.
Both PostgreSQL and MySQL support both replication and clustering to ensure high availability and fault tolerance. In particular, PostgreSQL supports synchronous or streaming replications as well as Postgres-XL — a database clustering environment. MySQL supports semisynchronous and synchronous replication in a clustering environment as well as MySQL Cluster — a multimaster technology that focuses on linear scaling.
Both PostgreSQL and MySQL boast thriving, active communities that provide support to users and developers. A number of third-party organizations also provide paid professional services for both databases.
PostgreSQL provides user support through technical documentation and manuals, mailing lists, a dedicated wiki page, and IRC and Slack channels. Similarly, MySQL offers community support through IRC channels, forums, mailing lists, documentation and manuals, tutorials, and a free-to-all bug database. MySQL users with a paid version of the database have access to free consultations and 24/7 support.
Organizations will want to consider all of these factors when deciding whether PostgreSQL or MySQL is better for their business and specific use case.
Whichever database you use, chances are you're going to want to replicate data from it to an enterprise data warehouse where it can be used for data analytics. That's the role of data replication tools such as Stitch.
Stitch reduces the time and complexity for replicating data from PostgreSQL and MySQL into a data warehouse, and enables businesses to kick-start their focus on insights. Set up a free trial in as little as five minutes and begin your journey to easier analytics today.