Setting up PGCluster

From Alpine Linux
Revision as of 05:52, 26 August 2023 by Zcrayfish (talk | contribs) (→‎References: Rescued a dead link and removed some others.)

PGCluster is a fork of PostgreSQL that allows multi-master replication. This document explains how to set it up.

Notes

  • All hosts in conf files are referenced by name - these must resolve properly to thier IPs.
  • PGCluster xml conf files are case sensitive
  • PGCluster 1.9.0 is based off PostgreSQL 8.3

Installation

First install the packages

apk add pgcluster pgcluster-client

For recovery, you'll need ssh and rsync

apk add openssh rsync

Configuration

Edit /etc/hosts file if necessary so all names can be resolved.

Check the data path in /etc/conf.d/pgcluster. Also check here to see which processes the init script will start on this host.

Create the database files (same as for postgresql)

/etc/init.d/pgcluster setup

Edit the config files in the data path found above (usually /var/lib/posgresql/pgcluster)

postgres (postmaster)

postgres has 3 configuration files of interest here: postgresql.conf, pg_hba.conf, cluster.conf.

To start it manually:

su postgres cd /var/lib/postgresql/pgcluster pg_ctl -D . start

or

postgres -D .

postgresql.conf

This is the standard postgresql config file. Just make sure of this:

listen_addresses='*'

You might also want to changed the logging settings.

pg_hba.conf

This is also a standard postgresql config file. Make sure all the hosts involved trust each other. Here's an example line you can use to get a test cluster running (it's a bit permissive for production use):

host    all         all         192.168.0.0/16        trust

cluster.conf

This file is new with PGCLuster - it configures the patched database instance.

Add a block like this for each replication server:

<Replicate_Server_Info>
	<Host_Name>     pgrep1  </Host_Name>
	<Port>          8001    </Port>
	<Recovery_Port> 8101    </Recovery_Port>
</Replicate_Server_Info>

Also add the following:

<Host_Name>		pgdb1				</Host_Name>
<Recovery_Port>	7001				</Recovery_Port>
<Rsync_Path>		/usr/bin/rsync			</Rsync_Path>
<Rsync_Option>	        ssh -1				</Rsync_Option>
<Rsync_Compress>	yes				</Rsync_Compress>
<Rsync_Timeout>	10min				</Rsync_Timeout>
<Rsync_Bwlimit>	0KB				</Rsync_Bwlimit>
<Pg_Dump_Path>	        /usr/bin/pg_dump	        </Pg_Dump_Path>
<Ping_Path>		/bin/ping			</Ping_Path>
<When_Stand_Alone>	read_only			</When_Stand_Alone>
<Replication_Timeout>	1min				</Replication_Timeout>
<LifeCheck_Timeout>	3s				</LifeCheck_Timeout>
<LifeCheck_Interval>	11s				</LifeCheck_Interval>

And this bit is optional:

<Not_Replicate_Info>
	<DB_Name>		test_db		</DB_Name>
#	<Table_Name>		log_table	</Table_Name>
</Not_Replicate_Info>

pgreplicate

This process replicates to the other servers.

To start it manually:

su postgres cd /var/lib/postgresql/pgcluster pgreplicate -D .

If you want debug output change the last line to:

pgreplicate -D . -n -v

pgreplicate.conf

Add a block like this for each database (also called cluster) instance:

<Cluster_Server_Info>
   <Host_Name>pgdb1</Host_Name>
   <Port>5432</Port>
   <Recovery_Port>7001</Recovery_Port>
</Cluster_Server_Info>

Add this block for your load balancer instance:

<LoadBalance_Server_Info>
   <Host_Name>pglb1</Host_Name>
   <Recovery_Port>6001</Recovery_Port>
</LoadBalance_Server_Info>

And add this to describe the replicator instance you're configuring:

<Host_Name>pgrep1</Host_Name>
<Replication_Port>8002</Replication_Port>
<Recovery_Port>8101</Recovery_Port>
<RLog_Port>8301</RLog_Port>
<Response_Mode>normal</Response_Mode>
<Use_Replication_Log>no</Use_Replication_Log>

Optional Logging info:

<Log_File_Info>
        <File_Name>pg_log/pgreplicate.log</File_Name>
</Log_File_Info>

pglb

This process provides load balancing.

To start it manually:

su postgres cd /var/lib/postgresql/pgcluster pglb -D .

If you want debug output change the last line to:

pglb -D . -n -v

pglb.conf

Similar to pgreplicate, add the following block for each database instance:

<Cluster_Server_Info>
   <Host_name>pgdb1</Host_Name>
   <Port>5432</Port>
   <Max_Connect>30</Max_Connect>
</Cluster_Server_Info>

Add the following for the load balancer:

<Host_Name>pglb1</Host_Name>
<Backend_Socket_Dir>/tmp</backend_socket_dir>
<Receive_Port>5433</Receive_Port>
<Recovery_Port>6001</Recovery_Port>
<Max_Cluster_Num>128</Max_Cluster_Num>
<Use_Connection_Pooling>no</Use_Connection_Pooling>

Optional Logging info:

<Log_File_Info>
        <File_Name>pg_log/pgreplicate.log</File_Name>
</Log_File_Info>

Recovery

...

References