PostgreSQL 12: New Features and Enhancements

Shameel

4.00/5 (2 votes)

21 Sep 2019CPOL5 min read

This article walks you through the summary of most important new features. enhancements and breaking changes in PostgreSQL 12.. The post PostgreSQL 12: New Features and Enhancements appeared first on The Developer Space..

Introduction

PostgreSQL 12 is the next major release of the world’s most popular and feature-rich open source database. The stable version of PostgreSQL 12 is scheduled to be released in late 2019. Beta 4 of version 12 was released on 12th Sep 2019.

This article walks you through the summary of most important new features. enhancements and breaking changes in PostgreSQL 12.

New Features & Enhancements

CTE Enhancements

In PostgreSQL, CTEs are optimization fences (outer query restrictions are not passed on to CTEs) and the database evaluates the query inside the CTE and caches the results (i.e., materialized results) and outer WHERE clauses are applied later when the outer query is processed, which means either a full table scan or a full index seek is performed and results in horrible performance for large tables.

WITH AllUsers AS (SELECT * FROM Users)
SELECT * FROM AllUsers WHERE Id = 100;

To overcome this, you rewrite this query in PostgreSQL as:

WITH UserRecord AS (SELECT * FROM Users WHERE Id = 100)
SELECT * FROM UserRecord;

PostgreSQL 12 addresses this problem by introducing query optimizer hints to enable us to control if the CTE should be materialized or not: MATERIALIZED, NOT MATERIALIZED. So, the query can be rewritten as follows to achieve better performance.

WITH AllUsers AS NOT MATERIALIZED (SELECT * FROM Users)
SELECT * FROM AllUsers WHERE Id = 100;

This effectively in-lines the CTE expression and results in significant performance improvement.

Generated Columns

SQL Server and Oracle have long had Computed Columns and the lack of computed columns in PostgreSQL made migrations from other Databases quite difficult. PostgreSQL 12 introduces Generated Columns to address this problem.

A generated column is to columns what a view is to a table. The value of the column is always computed from other columns in the table. A generated column can either be virtual or stored. The values for virtual columns are computed on the fly during query time and they don’t take storage space. The values for stored columns are pre-computed and stored as part of table data. PostgreSQL currently implements only stored generated columns.

The GENERATED ALWAYS AS clause is used to create Generated columns

CREATE TABLE Employee (
    ...,
    dob timestamp,
    age integer GENERATED ALWAYS AS (date_part('year', CURRENT_TIME - dob)) STORED
);

In PostgreSQL 11.x and older, the only way to do this was to declare age as a normal column and write a trigger to populate the column during inserts and updates

CREATE OR REPLACE FUNCTION calc_age() RETURNS TRIGGER AS $body$
BEGIN
  NEW.age := date_part('year', CURRENT_TIME - NEW.dob::timestamp);
  RETURN NEW;
END; 
$body$ LANGUAGE plpgsql;

The SQL Server Syntax for Computed Columns

Contrast this to SQL Server syntax. There is no special keyword in SQL Server to declare a computed column. You just specify the expression that makes us the computed column after the AS clause.

-- SQL Server Syntax
CREATE TABLE Employee (
    ...,
    dob datetime,
    age AS DATEDIFF(year,dob,GETDATE()) PERSISTED
);

Note the PERSISTED clause which is the equivalent of PostgreSQL’s STORED clause. SQL Server also supports non-persisted computed columns, you just don’t specify the PERSISTED clause.

So how is a Generated column different from a normal column with a `DEFAULT` clause?

The column default is evaluated once when the row is first inserted if no other value was provided; a generated column is updated whenever the row changes and cannot be overridden.
A column with DEFAULT constraint can be given a value in an INSERT or UPDATE statement. Generated columns cannot be given values, they’re always computed.
A column default cannot refer to other columns of the table, whereas a generated columns is specifically meant to do so.
A column default can use volatile functions, for example random() or current_time, generated columns cannot.

Re-Index Improvements

REINDEX CONCURRENTLY can rebuild an index without blocking writes to its table. The CONCURRENTLY clause was earlier available only in the CREATE INDEX statement.

CREATE TABLE test (x INTEGER);
INSERT INTO test SELECT generate_series(1, 1000000);
CREATE INDEX i_test ON test (x);
REINDEX INDEXCONCURRENTLY i_test;

In previous version of PostgreSQL, REINDEX on large tables frequently caused blocking writes and severe performance issues.

JIT Compilation enabled by default

Enable Just-in-Time (JIT) compilation by default, if the server has been built with support for it. Note that this support is not built by default, but has to be selected explicitly while configuring the build. This is especially useful for data warehouse queries.

Query Parallelism Improvements

Allow parallelized queries when in SERIALIZABLE isolation mode. Previously, parallelism was disabled when in this mode.

PL/pgSQL Enhancements

Allow SQL commands and variables with the same names as those commands to be used in the same PL/pgSQL function. For example, allow a variable called comment to exist in a function that calls the COMMENT SQL command. Previously this combination caused a parse error.

SQL/JSON Path Queries

PostgreSQL 12 now allows execution of JSON path queries per the SQL/JSON specification in the SQL:2016 standard. JSON path expressions let you evaluate a variety of arithmetic expressions and functions and compare values within JSON documents, similar to XPath expressions for XML. These expressions can be accelerated with GIN indexes which improves the execution of lookups across JSON data.

Partitioning Improvements

There is significant performance improvement in PostgreSQL 12 when processing tables with thousands of partitions for command that only need to use a subset of all the available partitions. The INSERT and COPY into a partitioned table also perform significantly faster compared to previous versions. ATTACH PARTITION can now be performed without blocking concurrent queries on the partitioned table. Partitioned tables can now be referenced as foreign keys. Partition bounds can now be expressions.

Collation Improvements

PostgreSQL 12 now supports case-insensitive and accent-insensitive comparisons for ICU provided collations, also known as “nondeterministic collations“. When used, these collations can provide convenience for comparisons and sorts, but can also lead to a performance penalty as a collation may need to make additional checks on a string.

MCV statistics now supports multiple columns

The CREATE STATISTICS command allows most-common-value statistics for multiple columns; previously only a single correlation value was recorded for multiple columns.

CREATE STATISTICS stts3 (mcv) ON state, city FROM zipcodes;
ANALYZE zipcodes;

Migrating from older versions

Some changes in PostgreSQL 12 may affect compatibility with previous releases.

recovery.conf is no longer used, and the server will not start if that file exists. recovery.signal and standby.signal files are now used to switch into non-primary mode. trigger_file has been renamed to promote_trigger_file. The standby_mode setting has been removed.
In new btree indexes, the maximum index entry length is reduced by eight bytes, to improve handling of duplicate entries. This means that a REINDEX operation on an index pg_upgrade’d from a previous release could potentially fail.
DROP IF EXISTS FUNCTION/PROCEDURE/AGGREGATE/ROUTINE to generate an error if no argument list is supplied and there are multiple matching objects

References

https://www.postgresql.org/docs/12/release-12.html

https://www.postgresql.org/developer/roadmap/

https://www.postgresql.org/docs/12/ddl-generated-columns.html

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)