Application use v5

Learn about the PGD application from a user perspective.

Application behavior

PGD supports replicating changes made on one node to other nodes.

By default, PGD replicates all changes from INSERT, UPDATE, DELETE, and TRUNCATE operations from the source node to other nodes. Only the final changes are sent, after all triggers and rules are processed. For example, INSERT ... ON CONFLICT DO UPDATE sends either an insert or an update, depending on what occurred on the origin. If an update or delete affects zero rows, then no changes are sent.

You can replicate INSERT without any preconditions.

For updates and deletes to replicate on other nodes, PGD must be able to identify the unique rows affected. PGD requires that a table have either a PRIMARY KEY defined, a UNIQUE constraint, or an explicit REPLICA IDENTITY defined on specific columns. If one of those isn't defined, a warning is generated, and later updates or deletes are explicitly blocked. If REPLICA IDENTITY FULL is defined for a table, then a unique index isn't required. In that case, updates and deletes are allowed and use the first non-unique index that's live, valid, not deferred, and doesn't have expressions or WHERE clauses. Otherwise, a sequential scan is used.
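As a minimal sketch of the options described above, the following statements show three ways to give a table a usable replica identity. The table, column, and index names are hypothetical. The statements themselves are standard PostgreSQL syntax.

```sql
-- Option 1: a primary key serves as the replica identity by default.
CREATE TABLE customer (
    customer_id bigint PRIMARY KEY,
    email       text NOT NULL,
    full_name   text
);

-- Option 2: no primary key, so name a unique index explicitly.
-- PostgreSQL requires the index to be unique, non-partial,
-- non-deferrable, and to cover only NOT NULL columns.
CREATE TABLE audit_event (
    event_uuid uuid NOT NULL,
    payload    jsonb
);
CREATE UNIQUE INDEX audit_event_uuid_idx ON audit_event (event_uuid);
ALTER TABLE audit_event REPLICA IDENTITY USING INDEX audit_event_uuid_idx;

-- Option 3: no unique index at all; the full row is the identity.
CREATE TABLE raw_log (
    logged_at timestamptz,
    message   text
);
ALTER TABLE raw_log REPLICA IDENTITY FULL;
```

Option 3 trades lookup speed for flexibility, since apply might fall back to a sequential scan to find the row.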

You can use TRUNCATE even without a defined replication identity. Replication of TRUNCATE commands is supported, but take care when truncating groups of tables connected by foreign keys. When replicating a truncate action, the subscriber truncates the same group of tables that was truncated on the origin, either explicitly specified or implicitly collected by CASCADE, except in cases where replication sets are defined. See Replication sets for further details and examples. This works correctly if all affected tables are part of the same subscription. But if some tables to truncate on the subscriber have foreign-key links to tables that aren't part of the same (or any) replication set, then applying the truncate action on the subscriber fails.
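The following sketch illustrates truncating tables linked by a foreign key, using hypothetical orders and order_line tables. For the replicated truncate to apply cleanly, it assumes both tables belong to the same replication set on every node.

```sql
-- Hypothetical parent and child tables linked by a foreign key.
CREATE TABLE orders (
    order_id bigint PRIMARY KEY
);
CREATE TABLE order_line (
    line_id  bigint PRIMARY KEY,
    order_id bigint NOT NULL REFERENCES orders (order_id)
);

-- Truncating only the parent is rejected by PostgreSQL because
-- order_line references it:
--   TRUNCATE orders;

-- Either list every linked table explicitly...
TRUNCATE orders, order_line;

-- ...or let CASCADE collect the referencing tables implicitly.
TRUNCATE orders CASCADE;
```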

Row-level locks taken implicitly by INSERT, UPDATE, and DELETE commands are replicated as the changes are made. Table-level locks taken implicitly by INSERT, UPDATE, DELETE, and TRUNCATE commands are also replicated. Explicit row-level locking (SELECT ... FOR UPDATE/FOR SHARE) by user sessions isn't replicated, nor are advisory locks. Information stored by transactions running in SERIALIZABLE mode isn't replicated to other nodes. The transaction isolation level of SERIALIZABLE is supported, but transactions aren't serialized across nodes in the presence of concurrent transactions on multiple nodes.

If DML is executed on multiple nodes concurrently under asynchronous replication, conflicts might occur. You must either handle these or avoid them. Various avoidance mechanisms are possible, discussed in Conflicts.

Sequences need special handling, described in Sequences.

Binary data in BYTEA columns is replicated normally, allowing "blobs" of data up to 1 GB. Use of the PostgreSQL "large object" facility isn't supported in PGD.

Rules execute only on the origin node so aren't executed during apply, even if they're enabled for replicas.

Replication is possible only from base tables to base tables. That is, the objects on both the source and the target (subscription) side must be tables, not views, materialized views, or foreign tables. Attempts to replicate tables other than base tables result in an error. DML changes that are made through updatable views are resolved to base tables on the origin and then applied to the same base table name on the target.

PGD supports partitioned tables transparently, meaning that you can add a partitioned table to a replication set and changes that involve any of the partitions are replicated downstream.

By default, triggers execute only on the origin node. For example, an INSERT trigger executes on the origin node and is ignored when the change is applied on the target node. You can make triggers execute both on the origin node at execution time and on the target when the change is replicated ("apply time") by using ALTER TABLE ... ENABLE ALWAYS TRIGGER. Or, use the REPLICA option to execute only at apply time: ALTER TABLE ... ENABLE REPLICA TRIGGER.
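The trigger modes can be sketched as follows. The table, function, and trigger names are hypothetical, and the trigger body is deliberately deterministic so that running it again at apply time produces the same row on every node.

```sql
-- Hypothetical table and row-level trigger that normalizes a column.
CREATE TABLE subscriber (
    subscriber_id bigint PRIMARY KEY,
    email         text NOT NULL
);

CREATE FUNCTION normalize_email() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    NEW.email := lower(NEW.email);
    RETURN NEW;
END;
$$;

CREATE TRIGGER subscriber_normalize
    BEFORE INSERT OR UPDATE ON subscriber
    FOR EACH ROW EXECUTE FUNCTION normalize_email();

-- Fire on the origin and again at apply time:
ALTER TABLE subscriber ENABLE ALWAYS TRIGGER subscriber_normalize;

-- Or fire only at apply time:
ALTER TABLE subscriber ENABLE REPLICA TRIGGER subscriber_normalize;

-- Return to the default (origin only):
ALTER TABLE subscriber ENABLE TRIGGER subscriber_normalize;
```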

Some types of trigger aren't executed on apply, even if they exist on a table and are currently enabled. Trigger types not executed are:

  • Statement-level triggers (FOR EACH STATEMENT)
  • Per-column UPDATE triggers (UPDATE OF column_name [, ...])

PGD replication apply uses the system-level default search_path. Replica triggers, stream triggers, and index expression functions can assume other search_path settings that then fail when they execute on apply. To prevent this from occurring, use any of these techniques:

  • Resolve object references clearly using only the default search_path.
  • Always use fully qualified references to objects, e.g., schema.objectname.
  • Set the search path for a function using ALTER FUNCTION ... SET search_path = ... for the functions affected.
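The last two techniques in the list above can be sketched like this. The schema, table, and function names are hypothetical.

```sql
-- Hypothetical schema, table, and trigger function.
CREATE SCHEMA app;
CREATE TABLE app.price_history (
    item_id    bigint,
    price      numeric,
    changed_at timestamptz DEFAULT now()
);

CREATE FUNCTION app.record_price() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Fully qualified reference: resolves the same way
    -- regardless of the caller's search_path.
    INSERT INTO app.price_history (item_id, price)
    VALUES (NEW.item_id, NEW.price);
    RETURN NEW;
END;
$$;

-- Alternatively, pin the search path on the function itself so
-- unqualified names inside it resolve predictably on apply.
ALTER FUNCTION app.record_price() SET search_path = app, pg_catalog;
```

Either technique alone is enough. Using both is harmless.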

PGD assumes that there are no issues related to text or other collatable datatypes, i.e., all collations in use are available on all nodes, and the default collation is the same on all nodes. Replicating changes uses equality searches to locate Replica Identity values, so this doesn't have any effect except where unique indexes are explicitly defined with nonmatching collation qualifiers. Row filters might be affected by differences in collations if collatable expressions were used.

PGD handling of very long "toasted" data in PostgreSQL is transparent to the user. The TOAST "chunkid" values likely differ between the same row on different nodes, but that doesn't cause any problems.

PGD can't work correctly if Replica Identity columns are marked as external.

PostgreSQL allows CHECK() constraints that contain volatile functions. Since PGD re-executes CHECK() constraints on apply, any subsequent re-execution that doesn't return the same result as before causes data divergence.
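As an illustration of the risk, compare the two hypothetical tables below. The first uses a volatile function in its CHECK() constraint. The second expresses a similar rule using only values stored in the row, so re-execution gives the same result.

```sql
-- Risky: clock_timestamp() is volatile, so the check can pass on the
-- origin and evaluate differently when it's re-executed on apply.
CREATE TABLE coupon_risky (
    coupon_id  bigint PRIMARY KEY,
    expires_at timestamptz NOT NULL,
    CHECK (expires_at > clock_timestamp())
);

-- Safer: compare only columns of the row itself.
CREATE TABLE coupon (
    coupon_id  bigint PRIMARY KEY,
    issued_at  timestamptz NOT NULL DEFAULT now(),
    expires_at timestamptz NOT NULL,
    CHECK (expires_at > issued_at)
);
```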

PGD doesn't restrict the use of foreign keys. Cascading FKs are allowed.

Nonreplicated statements

None of the following user commands are replicated by PGD, so their effects occur on the local/origin node only:

  • Cursor operations (DECLARE, CLOSE, FETCH)
  • Execution commands (DO, CALL, PREPARE, EXECUTE, EXPLAIN)
  • Session management (DEALLOCATE, DISCARD, LOAD)
  • Parameter commands (SET, SHOW)
  • Constraint manipulation (SET CONSTRAINTS)
  • Locking commands (LOCK)
  • Table maintenance commands (VACUUM, ANALYZE, CLUSTER, REINDEX)
  • Async operations (NOTIFY, LISTEN, UNLISTEN)

Since the NOTIFY SQL command and the pg_notify() functions aren't replicated, notifications aren't reliable in case of failover. This means that notifications can easily be lost at failover if a transaction is committed just when the server crashes. Applications running LISTEN might miss notifications in case of failover.

This is true in standard PostgreSQL replication, and PGD doesn't yet improve on this. CAMO and Eager Replication options don't allow the NOTIFY SQL command or the pg_notify() function.

DML and DDL replication

PGD doesn't replicate the DML statement. It replicates the changes caused by the DML statement. For example, an UPDATE that changed two rows replicates two changes, whereas a DELETE that didn't remove any rows doesn't replicate anything. This means that the results of executing volatile statements are replicated, ensuring there's no divergence between nodes as might occur with statement-based replication.
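A small sketch of what this means in practice, using a hypothetical table. The comments describe what the paragraph above says is replicated. They aren't output from a real cluster.

```sql
CREATE TABLE raffle_entry (
    entry_id     bigint PRIMARY KEY,
    lucky_number integer,
    drawn_at     timestamptz
);
INSERT INTO raffle_entry (entry_id) VALUES (1), (2);

-- random() and clock_timestamp() are volatile and are evaluated on the
-- origin only. The two resulting rows replicate as two row changes
-- that carry the values the origin computed.
UPDATE raffle_entry
   SET lucky_number = floor(random() * 1000)::integer,
       drawn_at     = clock_timestamp();

-- Matches no rows, so there's nothing to replicate.
DELETE FROM raffle_entry WHERE entry_id = 999;
```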

DDL replication works differently to DML. For DDL, PGD replicates the statement, which then executes on all nodes. So a DROP TABLE IF EXISTS might not replicate anything on the local node, but the statement is still sent to other nodes for execution if DDL replication is enabled. Full details are covered in DDL replication.

PGD works to ensure that intermixed DML and DDL statements work correctly, even in the same transaction.

Replicating between different release levels

PGD is designed to replicate between nodes that have different major versions of PostgreSQL. This feature is designed to allow major version upgrades without downtime.

PGD is also designed to replicate between nodes that have different versions of PGD software. This feature is designed to allow version upgrades and maintenance without downtime.

However, while it's possible to join a node with a different major version to a cluster, you can't add a node with an older minor version if the cluster uses a newer protocol version. Doing so returns an error.

Both of these features might be affected by specific restrictions. See Release notes for any known incompatibilities.

Replicating between nodes with differences

By default, DDL is automatically sent to all nodes. You can control this manually, as described in DDL replication, and you can use it to create differences between database schemas across nodes. PGD is designed to allow replication to continue even with minor differences between nodes. These features are designed to allow application schema migration without downtime or to allow logical standby nodes for reporting or testing.

Currently, replication requires the same table name on all nodes. A future feature might allow a mapping between different table names.

It's possible to replicate between tables with dissimilar partitioning definitions, such as a source that's a normal table replicating to a partitioned table, including support for updates that change partitions on the target. It can be faster if the partitioning definition is the same on the source and target since dynamic partition routing doesn't need to execute at apply time. For details, see Replication sets.

By default, all columns are replicated.

PGD replicates data columns based on the column name. If a column has the same name but a different datatype, PGD attempts to cast from the source type to the target type, if casts were defined that allow that.

PGD supports replicating between tables that have a different number of columns.

If the target is missing columns that are present in the source, then PGD raises a target_column_missing conflict, for which the default conflict resolver is ignore_if_null. This throws an error if a non-NULL value arrives. Alternatively, you can configure a node with a conflict resolver of ignore. This setting doesn't throw an error but silently ignores any additional columns.

If the target has additional columns not seen in the source record, then PGD raises a source_column_missing conflict, for which the default conflict resolver is use_default_value. Replication proceeds if the additional columns have a default, either NULL (if nullable) or a default expression. It throws an error and halts replication if not.
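As a hedged sketch, changing the resolver for one of these conflict types might look like the following. It assumes the PGD function bdr.alter_node_set_conflict_resolver(node_name, conflict_type, conflict_resolver) and a hypothetical node named node1. Confirm the function and the resolvers allowed for each conflict type against the Conflicts reference for your PGD version.

```sql
-- Assumption: 'node1' is a hypothetical node name in this cluster.
-- Switch target_column_missing from the default ignore_if_null to
-- ignore, so extra source columns are silently dropped on this node.
SELECT bdr.alter_node_set_conflict_resolver(
    'node1',
    'target_column_missing',
    'ignore'
);
```

The resolver is set per node, so run the call for each node where the target table lacks the columns.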

Transform triggers can also be used on tables to provide default values or alter the incoming data in various ways before apply.

If the source and the target have different constraints, then replication is attempted, but it might fail if the rows from source can't be applied to the target. Row filters can help here.

Replicating data from one schema to a more relaxed schema won't cause failures. Replicating data from a schema to a more restrictive schema can be a source of potential failures. The right way to solve this is to place a constraint on the more relaxed side, so bad data can't be entered. That way, no bad data ever arrives by replication, so it never fails the transform into the more restrictive schema. For example, if one schema has a column of type TEXT and another schema defines the same column as XML, add a CHECK constraint onto the TEXT column to enforce that the text is XML.
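The TEXT and XML example can be sketched as follows. The table and constraint names are hypothetical, and xml_is_well_formed_document() is a standard PostgreSQL function that's available when the server is built with libxml support.

```sql
-- Relaxed side: the column is TEXT, but a CHECK constraint rejects
-- anything that isn't a well-formed XML document.
CREATE TABLE invoice (
    invoice_id bigint PRIMARY KEY,
    body       text,
    CONSTRAINT invoice_body_is_xml
        CHECK (xml_is_well_formed_document(body))
);

-- Restrictive side (on the other node) defines the same column as XML:
--   CREATE TABLE invoice (
--       invoice_id bigint PRIMARY KEY,
--       body       xml
--   );
```

With the constraint in place, every row accepted on the relaxed side can also be cast to XML when it arrives on the restrictive side.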

You can define a table with different indexes on each node. By default, the index definitions are replicated. See DDL replication to specify how to create an index on only a subset of nodes or just locally.

Storage parameters, such as fillfactor and toast_tuple_target, can differ between nodes for a table without problems. An exception to that is that the value of a table's storage parameter user_catalog_table must be identical on all nodes.

A table being replicated must be owned by the same user/role on each node. See Security and roles for further discussion.

Roles can have different passwords for connection on each node, although by default changes to roles are replicated to each node. See DDL replication to specify how to alter a role password on only a subset of nodes or locally.

Comparison between nodes with differences

LiveCompare is a tool for data comparison on a database, against PGD and non-PGD nodes. It needs a minimum of two connections to compare against and reach a final result.

Since LiveCompare 1.3, you can configure it with all_bdr_nodes set. This setting saves you from specifying the DSN for each separate node in the cluster. An EDB Postgres Distributed cluster has N nodes, each with its own connection information, but LiveCompare 1.3+ needs only the initial and output connections to complete its job. Setting logical_replication_mode states how all the nodes are communicating.

All the configuration is done in a .ini file named bdrLC.ini, for example. Find templates for this configuration file in /etc/2ndq-livecompare/.

While LiveCompare executes, you see N+1 progress bars, N being the number of processes. Once all the tables are sourced, the elapsed time displays along with the measured transactions per second (tps). The time continues to count, giving you an estimate and then a total execution time at the end.

This tool offers a lot of customization and filters, such as tables, schemas, and replication_sets. LiveCompare can be stopped and restarted without losing context information, so it can run at convenient times. After the comparison, a summary and a DML script are generated so you can review them. Apply the DML to fix any differences found.

General rules for applications

PGD uses replica identity values to identify the rows to change. Applications can cause difficulties if they insert, delete, and then later reuse the same unique identifiers. This is known as the ABA problem. PGD can't know whether the rows are the current row, the last row, or much older rows.

Similarly, since PGD uses table names to identify the table against which changes are replayed, a similar ABA problem exists with applications that create, drop, and then later reuse the same object names.

These issues give rise to some simple rules for applications to follow:

  • Use unique identifiers for rows (INSERT).
  • Avoid modifying unique identifiers (UPDATE).
  • Avoid reusing deleted unique identifiers.
  • Avoid reusing dropped object names.

In the general case, breaking those rules can lead to data anomalies and divergence. Applications can break those rules as long as certain conditions are met, but use caution: while anomalies are unlikely, they aren't impossible. For example, you can reuse a row value as long as the DELETE was replayed on all nodes, including down nodes. This might normally occur in less than a second but can take days if a severe issue occurred on one node that prevented it from restarting correctly.
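The identifier-reuse rule can be illustrated with a hypothetical table. The risky statement is left commented out.

```sql
CREATE TABLE device (
    device_id uuid PRIMARY KEY,
    label     text NOT NULL
);

INSERT INTO device
VALUES ('11111111-1111-1111-1111-111111111111', 'sensor A');
DELETE FROM device
WHERE device_id = '11111111-1111-1111-1111-111111111111';

-- Risky: reusing the deleted identifier. A node that hasn't yet
-- replayed the DELETE can't tell the old row from the new one.
--   INSERT INTO device
--   VALUES ('11111111-1111-1111-1111-111111111111', 'sensor B');

-- Safer: give the replacement row an identifier that was never used.
INSERT INTO device
VALUES ('22222222-2222-2222-2222-222222222222', 'sensor B');
```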

Timing considerations and synchronous replication

Because PGD replication is asynchronous by default, peer nodes might lag behind, making it possible for a client connected to multiple PGD nodes or switching between them to read stale data.

A queue wait function is provided for clients or proxies to prevent such stale reads.
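A hedged sketch of how a client might use such a function before reading from a second node follows. It assumes the function meant here is bdr.wait_for_apply_queue(peer_node_name, target_lsn), that node1 is a hypothetical node where the session wrote earlier, and that a hypothetical customer table exists. Check the function's reference page for its exact semantics, in particular how a NULL target LSN is treated.

```sql
-- Run on the node you're about to read from.
-- Assumption: with a NULL target LSN, the call waits until the changes
-- already received from 'node1' have been applied locally.
SELECT bdr.wait_for_apply_queue('node1', NULL);

-- The read now runs after the wait has returned.
SELECT count(*) FROM customer;
```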

The synchronous replication features of Postgres are available to PGD as well. In addition, PGD provides multiple variants for more synchronous replication. See Durability and performance options for an overview and comparison of all variants available and their different modes.

Use of table access methods (TAMs) in PGD

PGD 5.0 supports two table access methods released with EDB Postgres 15.0. These two table access methods have been certified and allowed in PGD 5.0:

  • Auto cluster
  • Ref data

Any other TAM is restricted until certified by EDB. If you're planning to use either of these table access methods on a table, you need to configure that TAM on each participating node in the PGD cluster. To configure the auto cluster or ref data TAM, follow these steps on each node:

  1. Update postgresql.conf to specify TAMs autocluster or refdata for the shared_preload_libraries parameter.
  2. Restart the server and execute CREATE EXTENSION autocluster; or CREATE EXTENSION refdata;.

After you create the extension, you can use the TAM to create a table using CREATE TABLE test (...) USING autocluster; or CREATE TABLE test (...) USING refdata;. This replicates to all the PGD nodes. For more information on these table access methods, see CREATE TABLE.
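Put together, the steps might look like this on each node. This is a sketch using autocluster. The column list is hypothetical, and any TAM-specific setup beyond what's described above isn't shown.

```sql
-- Step 1 happens in postgresql.conf, followed by a restart: add the
-- TAM library to the existing shared_preload_libraries list rather
-- than replacing the list. After the restart, confirm it's loaded:
SHOW shared_preload_libraries;

-- Step 2: create the extension on this node.
CREATE EXTENSION autocluster;

-- Create a table that uses the access method. PostgreSQL requires a
-- column list (it can be empty) before the USING clause.
CREATE TABLE test (
    id   bigint PRIMARY KEY,
    note text
) USING autocluster;
```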