postgresql vs mysql - a comparison of two popular databases

PostgreSQL and MySQL are the two most popular databases that have come out of the open source community. PostgreSQL is a highly scalable open source, object-relational database management system (RDBMS) that's designed to handle a range of operations and perform well with complex datasets. MySQL, which comes in both open source and proprietary versions, excels in read-heavy operations, and provides support for multiple storage engines.

ETL data from 90+ sources to your data warehouse

Both systems have their benefits. Which is the better choice for your organization? The answer depends on your specific requirements and goals. Let’s compare PostgreSQL and MySQL in performance, architecture, scalability, and other areas.

PostgreSQL MySQL
Architecture Extensible object-relational database management system; multiprocess Relational database management system; single process
Performance High-performance in situations requiring high read and write speeds High-performance in web-based applications and situations requiring high read speeds
ACID compliance ACID compliant Only certain storage engines are ACID-compliant
Security Native SSL support TLS support
SQL:2016 conformance Largely SQL-conformant; meets 160 out of 179 mandatory features Partially SQL-conformant
Programming language support Written in C; supports C/C++, Python, JavaScript, R, Delphi, Java, Tcl, Go, Lisp, Erlang, and .Net Written in C and C++; supports C/C++, R, Delphi, Java, Go, Lisp, Erlang, PHP, Perl, and Node.js
Replication Asynchronous, cascade, and synchronous Asynchronous, synchronous, and semisynchronous
Fault tolerance Primary and secondary configurations, synchronous replication, and clustering Primary and secondary configurations and clustering
Community support Active, open source community with access to free resources Vendor-backed community with access to free resources; additional support offered with paid versions

 

Let's look at each of these factors in detail.

Architecture

The main differences between PostgreSQL and MySQL lies within each database's architecture. PostgreSQL is an object-relational database management system that emphasizes extensibility with easy integration of new data sources and extensions. MySQL is a conventional relational database that does not provide as strong of a level of support for new features, extensions, and data source integration as PostgreSQL.

Single process vs. multiprocess

Another significant difference revolves around the way each database runs on a server: PostgreSQL uses multiple processes, while MySQL runs in a single process.

PostgreSQL starts a new process with its own memory allocation for each connection it establishes, so it requires a lot of memory on systems with a high number of client connections. MySQL uses a single process and maintains one thread (or path of execution) per connection, which results in less memory pressure.

This difference in memory pressure becomes increasingly important when using PostgreSQL and MySQL at scale. Accounting for system performance with scale using PostgreSQL will require more capacity planning than with MySQL.

Storage engine support

Databases use storage engines to read, write, and modify data. MySQL's default storage engine is InnoDB, which emphasizes ACID compliance and high performance. MySQL supports eight alternative storage engines:

  • MyISAM: MyISAM is a fast storage engine that's optimized for use in read-heavy applications.
  • MEMORY: The MEMORY engine is used for temporary tables or staging areas that perform noncritical data operations such as caching and transient session management.
  • CSV: Because the CSV engine stores data as character- or comma-separated value text files, it's best used when data needs to be shared with other CSV-formatted applications.
  • ARCHIVE: The ARCHIVE engine’s purpose is to write and archive data.
  • BLACKHOLE: BLACKHOLE does not store any data it receives.
  • MERGE: The MERGE engine lets developers combine identical versions of MyISAM tables and treat them as a single entity, which can be useful for reporting.
  • EXAMPLE: This storage engine is a template with no built-in functionality that shows developers how to build a new customer storage engine.
  • NDB Cluster: The NDB Cluster storage engine is the best choice when working with a clustered environment for high redundancy and distributed computing.
  • FEDERATED: The FEDERATED storage engine enables databases to create one logical database from many physical servers, and execute queries on remote tables.

MySQL developers can specify per table what storage engine to use.

PostgreSQL supports only one storage engine, but is developing a new engine called zheap that's designed to allow in-place updates and enable the reuse of space, to help control database bloat.

Performance

PostgreSQL and MySQL tend to demonstrate their best performance characteristics under different conditions.

PostgreSQL offers high speeds for both reading and writing. It provides concurrency without read/write locks, which allows PostgreSQL to process and adjust multiple read and write threads concurrently.

Overall, PostgreSQL's scalability and high performance makes it valuable in large, complicated database projects in which high read and write speeds are essential. When properly architected, PostgreSQL doesn't have many performance limitations.

MySQL excels in web-based applications because it performs well in situations that require high read speeds. Since MySQL tends to have performance issues when dealing with complicated queries, it's better suited for high-concurrency situations where the jobs are read-intensive. MySQL is also beneficial when used with business intelligence applications, which require more reading than writing. However, the database is not ideal as a platform for BI or data analytics because it lacks a columnar data store, a storage architecture that allows for efficient analytical querying.

Get your data into a data warehouse today

ACID compliance

ACID stands for atomicity, consistency, isolation, and durability, which are properties attributed to database transactions.

  • Atomicity: A transaction is only completed when all of its components are successful.
  • Consistency: Data transactions are only completed when data meets specific validation rules.
  • Isolation: Multiple transactions can occur concurrently, but in isolation, to avoid negatively impacting the database.
  • Durability: Data is saved every time a transaction is complete.

ACID compliance is a safeguard against losing or miscommunicating data across a system. PostgreSQL is ACID-compliant by design, whereas MySQL supports ACID operations in its default storage engine, InnoDB, as well as NDB Cluster.

Security

Both PostgreSQL and MySQL use access control lists (ACL) for user permissions for databases, schemas, and tables. Both support full-disk, file-based, table-level, column-level, and instance-level encryption for data at rest.

For data in motion, PostgreSQL offers native SSL support for encrypting communication between servers via different connections, while MySQL uses TLS, which it claims provides stronger encryption algorithms.

SQL standard conformance

The American National Standards Institute (ANSI) maintains the SQL language standard, and updates it every few years. The current standard is SQL:2016, and it sets out functions and syntax that every SQL dialect should provide. No version of SQL is 100% ANSI-standard, but some are closer than others. The closer a database follows the SQL standard, the easier it is for developers familiar with other DBMSes to become proficient in it.

PostgreSQL is largely SQL-conformant. Out of 179 mandatory features, PostgreSQL meets 160. MySQL is only partially conformant. Developers at Oracle, which owns MySQL, are working toward a goal of full SQL compliance.

Programming language support

The number of languages each database supports affects how developers can use them. PostgreSQL is written in C and provides programming interfaces for C/C++, Python, JavaScript, R, Delphi, Java, Tcl, Go, Lisp, Erlang, and .Net. MySQL is written in C and C++, and provides APIs for C/C++, R, Delphi, Java, Go, Lisp, Erlang, PHP, Perl, and Node.js.

Replication

Replication is the process of copying data from one computer or server to another, or writing data to multiple servers simultaneously. PostgreSQL and MySQL handle this process differently.

PostgreSQL can perform asynchronous, cascade, and synchronous replication, depending on what the use case requires. Asynchronous replication allows users to create read-only copies from a master database to establish a failsafe backup and distribute workloads. Cascade replication allows a master database to replicate to multiple slave databases. With synchronous replication, data written to a master database is copied automatically to slave databases and all instances perform transactions simultaneously, which reduces the risk of losing a commit.

PostgreSQL also offers single master to one standby, single master to multiple standbys, bidirectional replication, and logical log streaming replication.

MySQL relies on one-way asynchronous replication to copy data between databases. It also supports a single master instance and multiple slave instances, circular replication, and master-to-master replication. Replication might involve data from a single table, data from only certain databases, or data from all available databases.

Let Stitch streamline your data replication process

Fault tolerance

Clustering is a process that uses shared storage to replicate the same set of data to all nodes in an environment. The main objective of using clustering is to ensure fault tolerance, a database’s ability to continue providing uninterrupted service in the event of a failure in the system.

Both PostgreSQL and MySQL support both replication and clustering to ensure high availability and fault tolerance. In particular, PostgreSQL supports synchronous or streaming replications as well as Postgres-XL — a database clustering environment. MySQL supports semisynchronous and synchronous replication in a clustering environment as well as MySQL Cluster — a multimaster technology that focuses on linear scaling.

Community support

Both PostgreSQL and MySQL boast thriving, active communities that provide support to users and developers. A number of third-party organizations also provide paid professional services for both databases.

PostgreSQL provides user support through technical documentation and manuals, mailing lists, a dedicated wiki page, and IRC and Slack channels. Similarly, MySQL offers community support through IRC channels, forums, mailing lists, documentation and manuals, tutorials, and a free-to-all bug database. MySQL users with a paid version of the database have access to free consultations and 24/7 support.

PostgreSQL vs. MySQL: Which is better for your use case?

Organizations will want to consider all of these factors when deciding whether PostgreSQL or MySQL is better for their business and specific use case.

Whichever database you use, chances are you're going to want to replicate data from it to an enterprise data warehouse where it can be used for data analytics. That's the role of data replication tools such as Stitch.

Stitch reduces the time and complexity for replicating data from PostgreSQL and MySQL into a data warehouse, and enables businesses to kick-start their focus on insights. Set up a free trial in as little as five minutes and begin your journey to easier analytics today.

Give Stitch a try, on us

Stitch streams all of your data directly to your analytics warehouse.

Set up in minutesUnlimited data volume during trial