It is that time of the year again. The first release candidate of PostgreSQL 18 is out, and things look promising. We should expect General Availability in the next 2-4 weeks.
Exciting times!
Year after year, release after release, the PostgreSQL community has done a phenomenal job of staying disciplined about the annual release process. And we have done so while averaging 150+ new features per release!
For the upcoming v18, here are the top three features I am most excited about:
#1 – Asynchronous I/O
PostgreSQL 18 introduces a significant under‑the‑hood change with its Asynchronous I/O (AIO) subsystem, which fundamentally alters how the database performs disk reads.
What asynchronous I/O does
Historically, PostgreSQL issued synchronous disk reads: each backend process would call the operating system, wait for the data to arrive, and then continue processing. This “one book at a time” model left CPUs idle whenever storage was slow (especially on network‑attached disks), limiting throughput. The new subsystem lets a backend queue multiple read requests at once, allowing other work to continue while data is being fetched. When the requested blocks are ready, PostgreSQL copies them directly into shared buffers, eliminating reliance on the kernel’s readahead heuristics.
How it works
1. io_method – A new server setting chooses the implementation:
- sync replicates the old synchronous behaviour.
- worker uses dedicated I/O worker processes. The main backend enqueues read requests and continues execution while these workers interact with the kernel. You can configure the number of workers via io_workers.
- io_uring (Linux only) leverages the kernel’s high‑performance io_uring API to submit and complete I/O requests without separate worker processes.
2. Tuning and monitoring – The io_combine_limit setting (now capped by the new io_max_combine_limit) controls how many adjacent blocks are combined into a single request. The default values for effective_io_concurrency and maintenance_io_concurrency have been raised to 16, which better reflects modern hardware and allows more read‑ahead for sequential scans. The pg_aios view exposes outstanding asynchronous I/O operations for monitoring. A configuration sketch follows this list.
3. Where it applies – In PostgreSQL 18, AIO covers sequential scans, bitmap‑heap scans, and maintenance operations like VACUUM. Writes remain synchronous (for now).
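To make this concrete, here is a minimal configuration sketch; the values are illustrative starting points, not recommendations:

    -- choose the AIO implementation (changing io_method requires a restart)
    ALTER SYSTEM SET io_method = 'worker';   -- or 'io_uring' (Linux only), or 'sync'
    ALTER SYSTEM SET io_workers = 4;         -- only used with io_method = 'worker'

    -- allow more concurrent read-ahead than the new default of 16
    ALTER SYSTEM SET effective_io_concurrency = 32;

    -- after a restart, inspect in-flight asynchronous I/O operations
    SELECT * FROM pg_aios;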
Why it matters
- Reduced latency and higher throughput – Benchmarks show that enabling AIO can double or even triple read performance in cold‑cache situations. When using worker or io_uring, a simple COUNT(*) on a 3.5 GB table dropped from ~15s to ~10s and ~5.7s, respectively (a benchmark sketch follows this list). This is particularly beneficial in cloud environments with network‑attached storage, where individual reads can take several milliseconds.
- Better resource utilization – By issuing multiple reads concurrently, PostgreSQL reduces idle CPU time and makes more efficient use of available I/O bandwidth.
- Greater control for DBAs – Administrators can fine‑tune concurrency (via effective_io_concurrency and io_combine_limit), choose the best io_method for their environment, and monitor outstanding operations using pg_aios.
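If you want to reproduce this kind of comparison yourself, a rough benchmark sketch run from psql could look like the following; the table name and sizes are illustrative:

    \timing on

    -- build a throwaway table of a few GB
    CREATE TABLE big_table AS
    SELECT g AS id, repeat('x', 100) AS payload
    FROM generate_series(1, 30000000) g;

    -- restart the server with a different io_method (and drop the OS page
    -- cache) between runs, then compare cold-cache timings
    SELECT count(*) FROM big_table;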
In summary, the new asynchronous I/O subsystem in PostgreSQL 18 gives administrators the ability to lower the impact of I/O latency and boost read throughput by allowing multiple outstanding read requests. It introduces flexible options (sync, worker, io_uring) and new tuning parameters, and early benchmarks suggest significant gains for sequential scans and maintenance tasks – especially in cloud and network‑attached storage environments.
#2 – UUID v7
PostgreSQL 18’s uuidv7() function matters in a distributed‑database context because it allows every node or client in a cluster to generate unique, ordered identifiers without a central coordinator.
Why coordination is a problem with traditional IDs
- Auto‑incrementing integers require a single source of truth to avoid clashes. In a sharded or multi‑master setup, coordinating sequence numbers across nodes introduces network latency and the risk of downtime if the sequence generator fails.
- Random UUIDs (v4) solve the coordination problem because every node can create an ID independently, but their randomness scatters inserts throughout a B‑tree index. This poor locality means the newest records are written to arbitrary positions on disk, causing more page splits and cache misses.
How UUID v7 improves distributed systems
- Local generation, global uniqueness – Like other UUIDs, version 7 values are 128‑bit identifiers that can be generated anywhere without coordination. This makes them ideal for distributed systems, microservices and multi‑master databases where each node must produce unique IDs.
- Temporal ordering – UUID v7 encodes the current Unix timestamp (to millisecond resolution) in its most significant 48 bits and uses random bits for the remainder. Because the timestamp portion increases monotonically, values generated at roughly the same time are adjacent in an index. Benchmarks show that time‑ordered UUIDs drastically reduce write‑ahead log (WAL) volume and improve index locality compared to random UUIDs. This means that inserts follow a mostly sequential pattern, improving write performance and cache hit rates (a short SQL example follows this list).
- Merge‑friendly and pre‑generatable – When data from different shards or offline services must be merged, UUIDs avoid conflicts because their probability of collision is negligible. Clients can generate IDs ahead of time (e.g., in the browser) and then commit them later; this reduces round‑trips to the database and simplifies offline workflows.
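Here is a short SQL sketch of what this looks like in practice; the table and column names are illustrative:

    -- uuidv7() ships in core PostgreSQL 18; no extension needed
    CREATE TABLE events (
        id      uuid PRIMARY KEY DEFAULT uuidv7(),
        payload text NOT NULL
    );

    INSERT INTO events (payload) VALUES ('first'), ('second');

    -- the timestamp prefix means ordering by id roughly follows insertion order
    SELECT id, payload FROM events ORDER BY id;

    -- the embedded timestamp can be extracted back out
    SELECT id, uuid_extract_timestamp(id) FROM events;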
Summary
In a distributed production environment, using uuidv7() as primary keys provides:
- Decentralized uniqueness – each shard or client can create IDs without central coordination, avoiding the complexity of distributed sequence generators.
- Ordered, index‑friendly inserts – the time‑based prefix means that new IDs cluster at the “right‑most” side of a B‑tree, preserving index locality and improving both insert and read performance.
For distributed databases that must scale horizontally while maintaining performance, uuidv7() offers a practical solution: globally unique identifiers that are naturally sortable and efficient for database indexes.
#3 – OAuth 2.0 Authentication
PostgreSQL 18 adds a native OAuth 2.0 authentication method, allowing the database to accept standards‑based bearer tokens instead of passwords. This feature consists of a server‑side authentication hook and new client‑side options in libpq.
How the OAuth method works
1. New oauth entry in pg_hba.conf – When you specify oauth as the authentication method in pg_hba.conf, PostgreSQL expects clients to present an OAuth 2.0 bearer token. Tokens conform to RFC 6750 and are validated by a pluggable library. The server must be built with --with-libcurl to enable this feature.
2. Bearer token validation – Validation is handled by an OAuth validator library. The GUC oauth_validator_libraries lists shared libraries that implement token verification; if more than one is loaded, a validator option in the pg_hba.conf entry selects which one to use. Validators typically perform token introspection or call the identity provider’s /userinfo endpoint.
3. Connection‑specific parameters – The new options are split between the libpq client and the server's pg_hba.conf entry (a configuration sketch follows this list):
- oauth_issuer, oauth_client_id, oauth_client_secret and oauth_scope (libpq) – supplied in the connection string or environment; libpq can present a bearer token provided by the application, or obtain one interactively via the OAuth 2.0 Device Authorization flow.
- issuer (pg_hba.conf) – the HTTPS URL of the identity provider/authorization server; it must match the client's oauth_issuer and the issuer in the provider's discovery document.
- scope (pg_hba.conf) – a space‑separated list of scopes the server requires the token to carry.
- validator (pg_hba.conf) – which validation library to use if multiple are configured.
- map (pg_hba.conf) – optional pg_ident.conf rules mapping OAuth identities to PostgreSQL role names.
- delegate_ident_mapping (pg_hba.conf) – lets the validator library decide user‑role mapping without reference to pg_ident.conf; this provides flexibility at the cost of trusting the validator completely.
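Putting the server‑side pieces together, a minimal sketch might look like this; the issuer URL and the validator library name are placeholders, since the validator is a module you install or build separately:

    # pg_hba.conf – require OAuth bearer tokens for this database
    host mydb all 0.0.0.0/0 oauth issuer="https://idp.example.com" scope="openid"

    # postgresql.conf – register the token validator (library name is hypothetical)
    oauth_validator_libraries = 'my_oauth_validator'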
Why this is useful
- Integration with enterprise SSO – OAuth 2.0 is the industry standard for token‑based authentication. With native support, PostgreSQL can integrate directly with identity providers like Okta or Azure AD, enabling single sign‑on and centralised user management without proxies or custom hooks.
- Stronger security and compliance – Bearer tokens can be scoped, short‑lived and revocable. Using tokens avoids storing long‑term passwords in applications, and revocation in the identity provider immediately blocks database access. Administrators can enforce fine‑grained scopes, such as limiting a token to read‑only queries, by setting the scope parameter.
- Client‑side support – libpq implements the OAuth Device Authorization flow (RFC 8628) and accepts tokens provided directly in the connection string, enabling command‑line tools like psql to authenticate via OAuth without manual token copying. When the server’s issuer parameter matches the client’s oauth_issuer, the connection proceeds; otherwise it is rejected.
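On the client side, a connection using the device flow might look like the following; the host, issuer and client ID are illustrative. With no token at hand, a libcurl‑enabled libpq prints a verification URL and code for the user to complete in a browser:

    psql "host=db.example.com dbname=mydb user=alice \
          oauth_issuer=https://idp.example.com \
          oauth_client_id=psql-cli"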
In summary
The OAuth inclusion in PostgreSQL 18 brings first‑class support for OAuth 2.0 bearer tokens. Server administrators can configure pg_hba.conf to use oauth, specify token validators via oauth_validator_libraries, and define identity‑to‑role mappings. Clients can present tokens or use the device flow to obtain them on the fly. This feature enables modern SSO integration, centralised access control and token‑based security without relying on external proxies or custom authentication mechanisms.