Maintain PostgreSQL¶
Troubleshoot performance¶
Check the log file – for example, /var/log/postgresql/postgresql-15-main.log
– if debugging an unscheduled restart of the postgres
service, for example.
Find slow queries¶
See also
Use the pg_stat_statements extension. For example:
SELECT
usename,
substring(query, 1, 80) AS short_query,
round(total_time::numeric, 2) AS total_time,
calls,
round(mean_time::numeric, 2) AS mean,
round((100 * total_time /
sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
FROM pg_stat_statements s
INNER JOIN pg_user u ON s.userid = u.usesysid
ORDER BY total_time DESC
LIMIT 20;
To display the full query, you might prefer to switch to unaligned output mode (\a
):
SELECT
usename,
query,
round(total_time::numeric, 2) AS total_time,
calls,
round(mean_time::numeric, 2) AS mean,
round((100 * total_time /
sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
FROM pg_stat_statements s
INNER JOIN pg_user u ON s.userid = u.usesysid
ORDER BY total_time DESC
LIMIT 20;
To reset the statistics:
SELECT pg_stat_statements_reset();
Tip
pgBadger is also available.
Reference: Tracking down slow queries in PostgreSQL
Tune server settings¶
SSH into a PostgreSQL server as the
postgres
user.Run the
postgresqltuner.pl
file:/var/lib/postgresql/postgresqltuner.pl --ssd
Under Configuration advice, address HIGH and MEDIUM recommendations.
Reference: Tuning Your PostgreSQL Server, Slow Query Questions
Check autovacuum statistics¶
SELECT
nspname,
s.relname,
reltuples,
n_live_tup::real,
n_dead_tup::real,
TRUNC(n_dead_tup / GREATEST(reltuples::numeric, 1) * 100, 2) AS percent,
last_autovacuum,
last_autoanalyze
FROM pg_stat_all_tables s
JOIN pg_class c ON relid = c.oid
JOIN pg_namespace ON relnamespace = pg_namespace.oid
ORDER BY percent DESC, last_autovacuum;
See the pg_stat_all_tables table’s documentation.
To get the table related to a TOAST table, take the number after pg_toast_
, and run, for example:
SELECT '16712'::regclass;
Troubleshoot backups¶
If cron
sends an email like:
ERROR: [082]: WAL segment 00000001000009FD0000000D was not archived before the 60000ms timeout
HINT: check the archive_command to ensure that all options are correct (especially --stanza).
HINT: check the PostgreSQL server log for errors.
HINT: run the 'start' command if the stanza was previously stopped.
(The same message will appear in a log file like /var/log/pgbackrest/STANZA-backup.log
.)
This error occurs occasionally (for example, due to a network issue), and is not a concern. To confirm, after 24 hours, view recent backups with:
sudo -u postgres pgbackrest info --stanza=kingfisher-2023
You should see a “full backup” within one week of the error, an “incr backup” the day before the error, and another the day after. (In other words, an incremental backup failed.)
If a full backup failed, re-run the weekly full backup command listed in the /etc/cron.d/postgres_backups
file.
Check usage¶
Explore databases¶
List databases:
\l
List schemas:
\dn
List tables, views and sequences in the public
schema:
\d
List tables, indexes, views and sequences in the public
schema:
\dtivs
To list tables, views and/or sequences in a specific schema, append, for example, reference.*
– or *.*
for all schema.
You can use the psql
command’s -E
(--echo-hidden
) flag to echo the queries generated by the backslash commands.
Check drive usage¶
Get all database sizes:
\l+
Get all schema sizes:
SELECT
schema_name,
schema_size,
pg_size_pretty(schema_size),
TRUNC(schema_size::numeric / pg_database_size(current_database()) * 100, 2) AS percent
FROM (
SELECT
nspname AS schema_name,
SUM(pg_relation_size(c.oid))::bigint AS schema_size
FROM pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
GROUP BY schema_name
) t
ORDER BY schema_size DESC;
Get relation sizes in the public
schema:
\dtis+
To get relation sizes in a specific schema, append, for example, reference.*
– or *.*
for all schema.
See the Database Object Size Functions documentation.
Show active connections and running queries¶
Show running queries:
SELECT pid, client_addr, usename, application_name, state, wait_event_type, NOW() - query_start AS time, query
FROM pg_stat_activity
WHERE query <> ''
ORDER BY time DESC;
See the pg_stat_activity table’s documentation.
Note
A client_addr
in the range 172.16.0.0 to 172.31.255.255 (private address space) is Docker.
application_name
can suggest an administrator’s manual connection. For example, “psql” or “DataGrip”.
Tip
To see all connections, remove the WHERE
clause and add backend_type
to the SELECT
clause. Typically, the backend_type
with no query
are:
autovacuum launcher
logical replication launcher
background writer
archiver
checkpointer
walwriter
Stop a query, replacing PID
with the query’s pid
:
SELECT pg_cancel_backend(PID)
Find unexpected schema or tables¶
Find unexpected schema:
SELECT nspname
FROM pg_namespace
WHERE
nspname NOT LIKE 'pg_temp_%' AND
nspname NOT LIKE 'pg_toast_temp_%' AND
nspname NOT LIKE 'summary_%' AND
nspname NOT IN (
'information_schema',
'pg_catalog',
'pg_toast',
'public',
'reference',
'summaries'
);
Find unexpected tables in the public schema:
SELECT relname
FROM pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE
nspname = 'public' AND
-- Ignore sequences and indexes
relkind NOT IN ('S', 'i') AND
relname NOT IN (
-- Kingfisher Process tables
'collection',
'collection_file',
'collection_note',
'compiled_release',
'data',
'package_data',
'record',
'record_check',
'release',
'release_check',
-- To be removed in future versions
'alembic_version',
'record_check_error',
'release_check_error',
'transform_upgrade_1_0_to_1_1_status_record',
'transform_upgrade_1_0_to_1_1_status_release',
-- https://www.postgresql.org/docs/current/pgstatstatements.html
'pg_stat_statements',
-- https://www.postgresql.org/docs/current/tablefunc.html
'tablefunc_crosstab_2',
'tablefunc_crosstab_3',
'tablefunc_crosstab_4'
);
Control access¶
Each individual should have a personal account.
Add a user¶
Add, in a private Pillar file, replacing
PASSWORD
with a strong password andUSERNAME
with a recognizable username (for people, use the lowercase first initial and the family name, likejdoe
):postgres: users: # me@example.com USERNAME: password: "PASSWORD"
Assign the user to groups. For example, the
kingfisher-main
target includes the groups:- kingfisher_process_read
SELECT
on all tables in schemapublic
- kingfisher_summarize_read
SELECT
on all tables in schema created by Kingfisher Summarize
postgres: users: # me@example.com USERNAME: password: "PASSWORD" groups: - kingfisher_process_read - kingfisher_summarize_read
Update a password¶
Update the private Pillar file, for example:
postgres: users: # me@example.com USERNAME: password: "PASSWORD"
Notify the contact at the email address in the comment
Delete a user¶
Delete the user from the private Pillar file.
SSH into the main server as the
root
user.Attempt to drop the user as the
postgres
user, for example:su - postgres -c 'psql kingfisher_process -c "DROP ROLE role2drop;"'
If you see a message like:
ERROR: role "role2drop" cannot be dropped because some objects depend on it DETAIL: privileges for table … … and 1234 other objects (see server log for list)
Open the server log, and search for the relevant
DROP ROLE
statement (after running the command below, press/
, typeDROP ROLE
, press Enter, and pressn
until you find it). For example:less /var/log/postgresql/postgresql-15-main.log
If all the objects listed after
DETAIL:
in the server log can be dropped (press Space to scroll forward), then pressq
to quitless
and open a SQL terminal as thepostgres
user:su - postgres -c 'psql kingfisher_process'
Finally, drop the user:
REASSIGN OWNED BY role2drop TO anotheruser; DROP OWNED BY role2drop; DROP ROLE role2drop;
Check privileges¶
List users and groups:
\du
Find unexpected database CREATE
privileges:
SELECT usename, string_agg(datname, ', ' ORDER BY datname)
FROM pg_user
CROSS JOIN pg_database
WHERE
usename NOT IN ('postgres') AND
has_database_privilege(usename, datname, 'CREATE') AND
NOT (usename = 'kingfisher_summarize' AND datname = 'kingfisher_process')
GROUP BY usename
ORDER BY usename;
Find unexpected schema CREATE
privileges:
SELECT usename, string_agg(nspname, ', ' ORDER BY nspname)
FROM pg_user
CROSS JOIN pg_namespace
WHERE
usename NOT IN ('postgres') AND
has_schema_privilege(usename, nspname, 'CREATE') AND
NOT (usename = 'kingfisher_process' AND nspname = 'public') AND
NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'summary_%')
GROUP BY usename
ORDER BY usename;
Find unexpected schema USAGE
privileges:
SELECT usename, string_agg(nspname, ', ' ORDER BY nspname)
FROM pg_user
CROSS JOIN pg_namespace
WHERE
usename NOT IN ('postgres') AND
nspname NOT IN ('information_schema', 'pg_catalog', 'reference', 'summaries') AND
has_schema_privilege(usename, nspname, 'USAGE') AND
NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'summary_%') AND
NOT (pg_has_role(usename, 'kingfisher_process_read', 'MEMBER') AND nspname = 'public') AND
NOT (pg_has_role(usename, 'kingfisher_summarize_read', 'MEMBER') AND nspname LIKE 'summary_%')
GROUP BY usename
ORDER BY usename;
Find unexpected table non SELECT
privileges:
SELECT usename, nspname, string_agg(relname, ', ' ORDER BY relname)
FROM pg_user
CROSS JOIN pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE
usename NOT IN ('postgres') AND
nspname NOT IN ('pg_toast') AND
relname NOT IN ('pg_settings') AND
has_table_privilege(usename, c.oid, 'INSERT,UPDATE,DELETE,TRUNCATE,REFERENCES,TRIGGER') AND
NOT (usename = 'kingfisher_process' AND nspname = 'public') AND
NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'summary_%')
GROUP BY usename, nspname
ORDER BY usename, nspname;
Reference: System Information Functions for functions like has_schema_privilege
Restore from backup¶
pgBackRest¶
See also
Get the stanza name from the postgres:backup:stanza
key in the Pillar data, or from the /etc/pgbackrest/pgbackrest.conf
file on the server.
View current backups:
pgbackrest info --stanza=example
Stop PostgreSQL:
systemctl stop postgres.service
In most cases, restore using the
--delta
option, as it saves time by using hashes to restore only files that differ:sudo -u postgres pgbackrest restore --stanza=example --delta
To restore from a backup other than the latest, for example:
sudo -u postgres pgbackrest restore --stanza=example --delta --set=20210315-145357F_20210315-145459I
To restore every file, for example:
rm -rf /var/lib/postgresql/15/main mkdir /var/lib/postgresql/15/main sudo -u postgres pgbackrest restore --stanza=example
Start PostgreSQL:
systemctl start postgres.service
See also
pg_dump and pg_restore¶
If pgBackRest isn’t an option, use pg_dump
and pg_restore
.
Connect to the old server, and export the existing database. For example, to export
spoonbill_web
:su - postgres -c "/usr/bin/pg_dump -Ft 'spoonbill_web' | gzip > ~/spoonbill_web.tar.gz"
Copy the database dump to your local machine. For example:
rsync -avz root@ocp13.open-contracting.org:/var/lib/postgresql/spoonbill_web.tar.gz .
Copy the database dump to the new server. For example:
rsync -avz spoonbill_web.tar.gz root@ocp27.open-contracting.org:~/
Restore the database. For example:
gunzip ~/spoonbill_web.tar.gz sudo -u postgres pg_restore -cC --if-exists -v -U postgres -d spoonbill_web ~/spoonbill_web.tar
Recover the replica¶
If replication breaks or the replica server goes offline, you must recover the replica, in two stages: mitigate the downtime, and fix the replication.
Mitigate downtime¶
Enable public access to the PostgreSQL service on the main server, by modifying its Pillar file:
postgres: public_access: True
For example, for the
kingfisher-main
target, modify thepillar/kingfisher_main.sls
file.Update DNS records:
Login to GoDaddy
If access was delegated, open Delegate Access and click the Access Now button
Open DNS Management for open-contracting.org
Update the replica’s CNAME record to point to the main server’s A record
Wait for the changes to propagate, which depends on the original TTL value
Fix replication¶
SSH into the replica server as the
root
userStop PostgreSQL:
systemctl stop postgres.service
Download the latest database or a backup from a specific point in time, replacing
kingfisher
with the value set in pillarpostgres:backup:stanza
. pgBackRest is configured to restore the replication configuration (see, for example,/var/lib/postgresql/15/main/postgresql.conf
).pgbackrest --stanza=kingfisher --type=standby --delta restore
Note
See Restore from backup for more information on the pgBackRest restore function.
Start PostgreSQL and monitor:
systemctl start postgres.service tail -f /var/log/postgresql/postgresql-15-main.log
You should see messages about recovering from WAL files in the logs.
If all else fails, rebuild the replica.