In the very early days of my career, an incident made me realise that performing my job irresponsibly can affect me adversely, not because it will hurt my position, but because it can affect the rest of my life as well. I was part of a team that produced software used by a financial institution where I held my account. A bug in the software caused a failure that made several accounts, including my own bank account, inaccessible! Fortunately, I wasn't the one who introduced that bug, and neither was any other software engineer working on the product. It had simply crept through the cracks that the age-old software had developed as it went through many improvements, something that happens to all architectures, software or otherwise. That was an enlightening and eye-opening experience. But professional karma is not always bad; many times it's good. When the humble work I do to earn my living also improves my living, it gives me immense satisfaction. It means that it's also improving billions of lives that way across the globe.
When I was a postgraduate student at IIT Bombay, I often travelled by train, both local and intercity. The online ticketing system for long-distance trains was still in its early stages. Local train tickets were still issued at stations, and getting one required standing in a long queue. Fast forward to today: you can buy a local train ticket on a mobile app or at a kiosk at the station by paying online through UPI. On my recent trip to IIT Bombay, I bought such a ticket using GPay in a few seconds. And you know what? UPI uses PostgreSQL as an OLTP database in its system. I didn't have to go through the same experience, thanks to that same education and the work I am doing. Students studying at my alma mater no longer have to go through the same painful experience, thanks to many PostgreSQL contributors who were once students and might have had similar painful experiences in their own lives.
At PGConf.India, Koji Annoura, who is a graph database expert, talked about o
[...]
I previously blogged about using the “ON CONFLICT” directive so that vacuum does not have to do additional work. I later demonstrated how the MERGE statement accomplishes the same thing.
You can read the original posts: “Reduce Vacuum by Using ‘ON CONFLICT’ Directive” and “Follow-Up: Reduce Vacuum by Using ‘ON CONFLICT’ Directive”.
Now, in another recent customer case, I was chasing down why the application was triggering tens of thousands of foreign key and constraint violations per day, and I began to wonder whether these kinds of errors also caused additional vacuum work, as described in those previous blogs. Sure enough, it DEPENDS.
Let’s set up a quick test to demonstrate:
/* Create related tables: */
CREATE TABLE public.uuid_product_value (
    id int PRIMARY KEY,
    pkid text,
    value numeric,
    product_id int,
    effective_date timestamp(3)
);
CREATE TABLE public.uuid_product (
    product_id int PRIMARY KEY
);
ALTER TABLE uuid_product_value
    ADD CONSTRAINT uuid_product_value_product_id_fk
    FOREIGN KEY (product_id)
    REFERENCES uuid_product (product_id) ON DELETE CASCADE;
/* Insert some mocked up data */
INSERT INTO public.uuid_product VALUES (
    generate_series(0,200));
INSERT INTO public.uuid_product_value VALUES (
    generate_series(0,10000),
    gen_random_uuid()::text,
    random()*1000,
    ROUND(random()*100),
    current_timestamp(3));
/* Vacuum Analyze Both tables */
VACUUM (VERBOSE, ANALYZE) uuid_product;
VACUUM (VERBOSE, ANALYZE) uuid_product_value;
/* Verify that there are no dead tuples: */
SELECT
    schemaname,
    relname,
    n_live_tup,
    n_dead_tup
FROM
    pg_stat_all_tables
WHERE
    relname IN ('uuid_product_value', 'uuid_product');
 schemaname |      relname       | n_live_tup | n_dead_tup
------------+--------------------+------------+------------
 public     | uuid_product_value |      10001 |          0
 public
[...]
Welcome to Part two of our series about building a High Availability Postgres cluster using Patroni! Part one focused entirely on establishing the DCS using etcd, providing the critical layer that Patroni uses to store metadata and guarantee its leadership token uniqueness across the cluster.
With this solid foundation, it's now time to build the next layer in our stack: Patroni itself. Patroni does the job of managing the Postgres service and provides a command interface for node administration and monitoring. Technically the Patroni cluster is complete at the end of this article, but stick around for part three where we add the routing layer that brings everything together.
Hopefully you still have the three VMs where you installed etcd. Those will be the same place where everything else happens, so if you haven’t already gone through the steps in part one, come back when you’re ready.
Otherwise, let’s get started!
Most PostgreSQL tuning advice that folks chase is about quick fixes, not about understanding what made the planner choose one path or join over another, more optimal one.
Tuning should not start with running ANALYZE on the tables involved in the query, but with the intent of finding out what is causing the issue and why the planner cannot choose the optimal path on its own.
Most of the fixes we reach for in SQL tuning look like this:
Add an index.
Rewrite the query.
Bump work_mem.
Done.
Except it’s not done. The same problem comes back, different query, different table, same confusion.
A slow query is a symptom. Statistics, DDL, query style, and PG version are the actual culprits.
Before you touch anything, you need to answer five questions — in order:
Most developers skip straight to question two. Many skip to indexes without asking any question at all.
I presented this framework at PGConf India yesterday to a room full of developers and DBAs, with sharp questions and a lot of “I’ve hit exactly this” moments.
The slides cover core foundations for approaching Query Tuning and production gotchas including partition pruning, SARGability, CTE fences, and correlated column statistics.
Slide – PostgreSQL Query Tuning: A Foundation Every Database Developer Should Build
This article reviews the November 2025 CommitFest.
For the highlights of the previous two CommitFests, check out our last posts: 2025-07, 2025-09.
...
There is a moment in many database reviews when the room becomes a little too quiet.
Someone asks:
“Which columns in this database are encrypted?”
At first, the answers sound reassuring.
“We use TLS.”
“The disks are encrypted.”
“The application handles sensitive fields.”
And then the real picture starts to emerge.
Some values are encrypted in one service but not another.
Some migrations remembered to apply encryption.
Some scripts did not.
Some backups are safe in theory, but no one wants to test that theory the hard way.
That is the uncomfortable truth of database security:
encryption is often present, but not always enforced where the data actually lives.
That is exactly the problem I wanted to explore with the PostgreSQL extension:
column_encrypt: https://github.com/vibhorkum/column_encrypt
This extension provides transparent column-level encryption using custom PostgreSQL datatypes so developers can read and write encrypted columns without changing their SQL queries.
And perhaps the most human part of this project is this:
the idea for this project started back in 2016.
It stayed with me for years as one of those engineering ideas that never quite leaves your mind — the thought that PostgreSQL itself could enforce encryption at the column level.
Now I’ve finally decided to release it.
This is the first public version. It’s a starting point — useful, practical, and hopefully something the PostgreSQL community can explore and build upon.
Encryption conversations often focus first on infrastructure.
All of these are important.
But once data is inside the database, a different question matters:
What happens if someone gains access to the database itself?
That access might come from:
When using AWS RDS Proxy, the goal is to achieve connection multiplexing: many client connections share a much smaller pool of backend PostgreSQL connections, giving more resources per connection and keeping query execution running smoothly.
However, if the proxy detects that a session has changed internal state in a way it cannot safely track, it pins the client connection to a specific backend connection. Once pinned, that connection can never be multiplexed again. This was the case with a recent database I worked on.
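To make the difference between multiplexing and pinning concrete, here is a minimal sketch of the idea. It is a toy model, not the actual RDS Proxy logic: the `ToyProxy` class, its methods, the `changes_session_state` flag, and the sample statements are all made up for illustration.

```python
class ToyProxy:
    """Toy connection multiplexer with pinning (not the real RDS Proxy logic)."""

    def __init__(self, backend_count):
        self.free = list(range(backend_count))  # idle backend connection ids
        self.pinned = {}                         # client -> backend it is pinned to

    def run(self, client, statement, changes_session_state=False):
        # The statement text is only carried along for readability here.
        # A pinned client always reuses its dedicated backend.
        if client in self.pinned:
            return self.pinned[client]
        if not self.free:
            raise RuntimeError("no backend connection available")
        backend = self.free.pop()
        if changes_session_state:
            # State the proxy cannot track: keep this backend for this client.
            self.pinned[client] = backend
        else:
            # Normal case: hand the backend back as soon as the statement ends.
            self.free.append(backend)
        return backend

    def close(self, client):
        # In this model, only ending the client session releases a pinned backend.
        backend = self.pinned.pop(client, None)
        if backend is not None:
            self.free.append(backend)


proxy = ToyProxy(backend_count=2)
for client in ("a", "b", "c"):
    proxy.run(client, "SELECT 1")          # three clients share two backends
shared_free = len(proxy.free)               # both backends are idle again

# Hypothetical session-level changes that the proxy cannot track.
proxy.run("a", "SET some_setting = 1", changes_session_state=True)
proxy.run("b", "SET some_setting = 1", changes_session_state=True)
pinned_free = len(proxy.free)               # nothing is left to multiplex

try:
    proxy.run("c", "SELECT 1")
    third_client_served = True
except RuntimeError:
    third_client_served = False
```

In this model, two pinned clients are enough to starve a third one, which is why a pool with many pinned sessions behaves like no pool at all.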
In this case, we observed the following:
What was strange about it all was that the queries involved were relatively simple, with at most one join.
To get to the root cause, one option was to look in pg_stat_statements. However, that approach had two problems:
pg_stat_statements normalizes queries and does not expose the values passed to parameter placeholders.
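As a rough illustration of what normalization loses, the sketch below replaces literals with numbered placeholders using a regular expression. This is only a toy: pg_stat_statements works on the parsed query rather than on text patterns, and the `normalize` helper and the sample queries are assumptions made up for this example.

```python
import re

# Matches single-quoted string literals or standalone numbers (toy rule only).
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def normalize(query):
    """Replace each literal with $1, $2, ... in order of appearance."""
    counter = 0

    def placeholder(_match):
        nonlocal counter
        counter += 1
        return f"${counter}"

    return _LITERAL.sub(placeholder, query)


q1 = normalize("SELECT * FROM orders WHERE id = 42 AND status = 'open'")
q2 = normalize("SELECT * FROM orders WHERE id = 7 AND status = 'closed'")
print(q1)
print(q1 == q2)
```

Both calls collapse into the same normalized text, so the original values cannot be recovered from it.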
Instead, to see the actual parameters, we briefly enabled log_statement = 'all'. This immediately surfaced something interesting in the logs, which I could download and review at my own pace.
What we saw were statements like SELECT set_config($2,$1,$3) with parameters related to JIT configuration – that was the first real clue.
After tracing the behavior through the stack, the root cause turned out to be surprisingly indirect. The application created new connections through SQLAlchemy’s asyncpg dialect, and we needed to drill down into that driver’s behavior.
During connection initialization, SQLAlchemy runs an on_connect hook:
def connect(conn):
For much of Postgres's history, it has lived in the shadow of other relational systems, and for a time even in the shadow of NoSQL systems. Those shadows have faded, but it is helpful to reflect on this outcome.
On the proprietary side, most database products are now in maintenance mode. The only database to be consistently compared to Postgres was Oracle. Long-term, Oracle was never going to be able to compete against an open source development team, just like Sun's Solaris wasn't able to compete against open source Linux. Few people would choose Oracle's database today, so it is effectively in legacy mode. The Oracle shadow is clearly fading. In fact, almost all enterprise infrastructure software is open source today.
The MySQL shadow is more complex. MySQL is not proprietary, since it is distributed as open source, so it had the potential to ride the open source wave into the enterprise, and it clearly did from the mid-1990s to the mid-2000s. However, something changed, and MySQL has been in steady decline for decades. Looking back, people want to ascribe a reason for the decline:
Last December, I was part of a long enterprise discussion centered on PostgreSQL.
On paper, it looked familiar: a new major release, high availability and scale, Aurora migration, monitoring, operational tooling, and the growing conversation around AI-assisted operations.
The usual ingredients were all there.
But somewhere in the middle of that day, the tone of the room changed.
It did not change when we talked about new PostgreSQL capabilities. It changed when the conversation moved to upgrades, patching, monitoring quality, and operational control.
That was the moment I realized this was not really a feature discussion.
It was a trust discussion.
Not trust in PostgreSQL as a database. That question is mostly behind us.
It was trust in something more practical: can this platform evolve without exhausting the team responsible for it? Can it scale without becoming harder to reason about? Can it be upgraded without becoming a quarterly trauma ritual? Can it be monitored without operators drowning in false signals? Can it support modernization without making every change feel dangerous?
That, to me, is where the PostgreSQL conversation has matured.
A modern PostgreSQL platform is not defined only by what it can do. It is defined by how calmly it can change.
This matters because PostgreSQL is no longer entering the enterprise through side doors. In many organizations, it is already trusted with serious workloads and is increasingly central to modernization plans.
That changes the questions.
A few years ago, teams often asked whether PostgreSQL was ready for enterprise use. Today, the better question is whether the operating model around PostgreSQL is ready for enterprise reality.
Because the database can be strong while the surrounding practice is weak.
That is where many teams struggle. They like PostgreSQL, but lag on upgrades. They have HA designs, but unclear failure playbooks. They have monitoring, but poor signal qualit
[...]
In Part 1, we explored the general concepts of MVCC and the implications of storing data snapshots either out-of-place or within heap storage; we can now map these methodologies to specific database engines.
The PostgreSQL MVCC implementation aligns with the DatabaseI model, whereas Oracle and MySQL are closely related to the DatabaseO model. Specifically, Oracle utilizes block versioning and stores older versions in a separate storage area known as UNDO, while PostgreSQL employs row versioning.
These engines further optimize their respective in-place or out-of-place MVCC strategies:
Early in my PostgreSQL journey, I often sensed that a conversation between two Postgres professionals inevitably revolves around vacuuming. That lighthearted observation still remains relevant, as my LinkedIn feeds are often filled with discussions around vacuuming and comparing PostgreSQL’s Multi-Version Concurrency Control (MVCC) implementation to other engines like Oracle or MySQL. Given that people are naturally drawn to the most complex components of a system, I will continue this journey by exploring a detailed comparison of these database architectures focused on the MVCC implementations.
Stone age databases relied on strict locking mechanisms to handle concurrency, which proved inefficient under heavy load. In these traditional models, a read operation required a shared lock that prevented other transactions from updating the record. Conversely, write operations required exclusive locks that blocked incoming reads. This resulted in significant lock contention, where readers blocked writers and writers blocked readers.
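The contention described above follows directly from the lock compatibility rules. As a minimal sketch (the `RecordLock` class and its method names are made up for illustration and are not taken from any real engine), a shared/exclusive lock on a single record might behave like this:

```python
class RecordLock:
    """Toy shared/exclusive lock for a single record (illustrative only)."""

    def __init__(self):
        self.readers = set()   # transactions holding a shared lock
        self.writer = None     # transaction holding the exclusive lock, if any

    def try_shared(self, txn):
        # A reader is blocked while another transaction holds the exclusive lock.
        if self.writer is not None and self.writer != txn:
            return False
        self.readers.add(txn)
        return True

    def try_exclusive(self, txn):
        # A writer is blocked by any other reader or writer.
        if self.writer is not None and self.writer != txn:
            return False
        if self.readers - {txn}:
            return False
        self.writer = txn
        return True

    def release(self, txn):
        self.readers.discard(txn)
        if self.writer == txn:
            self.writer = None


lock = RecordLock()
reader_ok = lock.try_shared("T1")               # T1 reads the record
writer_blocked = not lock.try_exclusive("T2")   # T2 must wait: reader blocks writer
lock.release("T1")
writer_ok = lock.try_exclusive("T2")            # now T2 can update
reader_blocked = not lock.try_shared("T3")      # T3 must wait: writer blocks reader
```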
To solve this, RDBMS implemented MVCC. The idea was very simple. Rather than overwriting data immediately, maintain multiple versions of data simultaneously. This allows transactions to view a consistent snapshot of the database as it existed at a specific point in time. For instance, if User 1 starts reading a table just before User 2 starts modifying a record, User 1 sees the original version of the data without hindering User 2’s progress. Without MVCC, the system would be forced to either serialize all access — making User 2 wait — or risk data consistency anomalies like dirty or non-repeatable reads where User 1 sees uncommitted changes that might eventually be rolled back.
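The User 1 and User 2 scenario can be sketched with a toy multi-version store. This is a simplified model under stated assumptions: `VersionedStore` and its commit counter are invented for illustration, only committed writes are modeled, and no real engine stores versions this way.

```python
import itertools


class VersionedStore:
    """Toy multi-version store: every write appends a new version (illustrative only)."""

    def __init__(self):
        self._clock = itertools.count(1)   # hypothetical commit counter
        self._versions = {}                # key -> list of (commit_ts, value)
        self.last_commit = 0

    def commit_write(self, key, value):
        # Append a new version instead of overwriting the old one.
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self.last_commit = ts
        return ts

    def snapshot(self):
        # A snapshot is just the commit timestamp the reader started at.
        return self.last_commit

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        visible = [v for ts, v in self._versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None


store = VersionedStore()
store.commit_write("balance", 100)

user1_snapshot = store.snapshot()          # User 1 starts reading
store.commit_write("balance", 250)         # User 2 modifies the record and commits

user1_view = store.read("balance", user1_snapshot)      # still the original value
latest_view = store.read("balance", store.snapshot())   # a new reader sees the update
```

Neither user waits for the other: User 1 keeps reading the version that was current when the read began, while User 2's update simply adds a newer version.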
Database engines utilize various architectures to manage this data versioning. A particularly notable point of discussion is the comparison between “in-place” and “out-of-place” data versioning techniques. Let’s examine these approaches more closely.
One of the great things about PostgreSQL's jsonb type is the flexibility it gives you — you can store whatever structure you need without defining columns up front. But that flexibility comes with a trade-off: there's nothing stopping bad data from getting in. You can slap a CHECK constraint on a jsonb column, but writing validation logic in SQL or PL/pgSQL for anything beyond the trivial gets ugly fast.
I've been working on a PostgreSQL extension called json_schema_validate that solves this problem by letting you validate JSON and JSONB data against JSON Schema specifications directly in the
This is the second in a series of three blog posts covering the new AI functionality in pgAdmin 4. In the first post, I covered LLM configuration and the AI-powered analysis reports. In this post, I'll introduce the AI Chat agent in the query tool, and in the third, I'll explore the AI Insights feature for EXPLAIN plan analysis.
If you've ever found yourself staring at a database schema you didn't design, trying to work out the right joins to answer a seemingly simple question, you'll appreciate what the AI Chat agent brings to pgAdmin's query tool. Rather than having to alt-tab to an external AI service, paste in your schema, describe what you need, and then copy the resulting SQL back into your editor, the entire conversation now happens within the query tool itself, with full awareness of your actual database structure.
The community met on Wednesday, March 4, 2026 for the 7th PostgreSQL User Group NRW MeetUp (Cologne, ORDIX AG). It was organised by Dirk Krautschick and Andreas Baier.
Speakers:
The PostgreSQL Berlin March 2026 Meetup took place on March 5, 2026, organized by Andreas Scherbaum and Sergey Dudoladov.
Speakers:
Kai Wagner wrote about his experience at the meetup: PostgreSQL Berlin Meetup - March 2026
Andreas Scherbaum wrote a blog post about the Meetup.
SCALE 23x (March 5-8, 2026) had a dedicated PostgreSQL track, filled by the following contributions:
Trainings:
Talks:
SCALE 23x PostgreSQL Booth volunteers:
This is the first in a series of three blog posts covering the new AI functionality coming in pgAdmin 4. In this post, I'll walk through how to configure the LLM integration and introduce the AI-powered analysis reports; in the second, I'll cover the AI Chat agent in the query tool; and in the third, I'll explore the AI Insights feature for EXPLAIN plan analysis.
Anyone who manages PostgreSQL databases in a professional capacity knows that keeping on top of security, performance, and schema design is an ongoing endeavour. You might have a checklist of things to review, or perhaps you rely on experience and intuition to spot potential issues, but it is all too easy for something to slip through the cracks, especially as databases grow in complexity. We've been thinking about how AI could help with this, and I'm pleased to introduce a suite of AI-powered features in pgAdmin 4 that bring large language model analysis directly into the tool you already use every day.
In the previous article we covered how the PostgreSQL planner reads pg_class and pg_statistic to estimate row counts, choose join strategies, and decide whether an index scan is worth it. The message was clear: when statistics are wrong, everything else goes with it.
PostgreSQL 18 changed that. Two new functions, pg_restore_relation_stats and pg_restore_attribute_stats, write numbers directly into the catalog tables. Combined with pg_dump --statistics-only, you can treat optimizer statistics as a deployable artifact: compact, portable, plain SQL.
The feature was driven by the upgrade use case. In the past, major version upgrades left pg_statistic empty, forcing you to run ANALYZE, which might take hours on large clusters. With PostgreSQL 18, upgrades now transfer statistics automatically. But that's just the beginning. The same logic lets you export statistics from production and inject them anywhere: a test database, local debugging, or CI pipelines.
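As a rough sketch of what such an export-and-inject step could look like in a pipeline, the helper below only builds the two command lines. The function names, database names, and file path are hypothetical; `--statistics-only` is the pg_dump option named above, and the remaining flags are the standard `--dbname` and `--file` options of pg_dump and psql.

```python
import shlex


def export_stats_command(source_db, out_file):
    # pg_dump --statistics-only is the option named in the text (PostgreSQL 18+).
    return ["pg_dump", "--statistics-only", f"--dbname={source_db}", f"--file={out_file}"]


def import_stats_command(target_db, in_file):
    # The dump is plain SQL, so psql can replay it against the target database.
    return ["psql", f"--dbname={target_db}", f"--file={in_file}"]


# Hypothetical database names and file path, for illustration only.
dump_cmd = export_stats_command("prod_db", "stats.sql")
load_cmd = import_stats_command("ci_db", "stats.sql")

print(shlex.join(dump_cmd))
print(shlex.join(load_cmd))
# In a real pipeline these lists could be passed to subprocess.run(cmd, check=True).
```

This assumes the target database already contains the same tables as the source, since the dump carries statistics only and no schema.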
Your CI database has 1,000 rows. Production has 50 million. The planner makes completely different decisions for each. Running EXPLAIN in CI tells you nothing about the production plan. This is the core premise behind RegreSQL. Catching query plan regressions in CI is far more reliable when the planner sees production-scale statistics.
The same applies to debugging. A query is slow in production and you want to reproduce the plan locally, but your database has different statistics, and the planner chooses the predictable path. Porting production statistics gives you a snapshot of the thinking the planner has to do in production, without actually going to production.
The first of functi
[...]
I just gave a new presentation at SCALE titled The Wonderful World of WAL. I am excited to have a second new talk this year. (I have one more queued up.)
I have always wanted to do a presentation about the write-ahead log (WAL) but I was worried there was not enough content for a full talk. As more features were added to Postgres that relied on the WAL, the talk became more feasible, and at 103 slides, maybe I waited too long.
I had a full hour to give the talk at SCALE, and that was helpful. I was able to answer many questions during the talk, and that was important — many of the later features rely on earlier ones, e.g., point-in-time recovery (PITR) relies heavily on crash recovery, and if you don't understand how crash recovery works, you can't understand PITR. By taking questions at the end of each section, I could be sure everyone understood. The questions showed that the audience of 46 understood the concepts because they were asking about the same issues we dealt with in designing the features:
In this article I walk you through the journey of adding the pg_crash extension to the new CloudNativePG extensions project. It explores the transition from legacy standalone repositories to a unified, Dagger-powered build system designed for PostgreSQL 18 and beyond. By focusing on the Image Volume feature and minimal operand images, the post provides a step-by-step guide for community members to contribute and maintain their own extensions within the CloudNativePG ecosystem.
The last PG Phriday article focused on the architecture of a Patroni cluster: the how and why of the design. This time around, it’s all about actually building one. I’ve often heard that operating Postgres can be intimidating, and Patroni is on a level above that. Well, I won’t argue on the second count, but I can try to at least ease some of the pain.
To avoid an overwhelming deluge consisting of twenty pages of instructions, I’ve split this article into a series of three along these lines:
Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.