Maintain PostgreSQL#
Troubleshoot#
Check the log file, /var/log/postgresql/postgresql-11-main.log
, if debugging an unscheduled restart of the postgres
service, for example.
Find slow queries#
Use the pg_stat_statements extension. For example:
SELECT
usename,
substring(query, 1, 80) AS short_query,
round(total_time::numeric, 2) AS total_time,
calls,
round(mean_time::numeric, 2) AS mean,
round((100 * total_time /
sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
FROM pg_stat_statements s
INNER JOIN pg_user u ON s.userid = u.usesysid
ORDER BY total_time DESC
LIMIT 20;
To display the full query, you might prefer to switch to unaligned output mode (\a
):
SELECT
usename,
query,
round(total_time::numeric, 2) AS total_time,
calls,
round(mean_time::numeric, 2) AS mean,
round((100 * total_time /
sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
FROM pg_stat_statements s
INNER JOIN pg_user u ON s.userid = u.usesysid
ORDER BY total_time DESC
LIMIT 20;
To reset the statistics:
SELECT pg_stat_statements_reset();
Reference: Tracking down slow queries in PostgreSQL
Control access#
Each individual should have a personal account, and each service should have a service account.
Add a user#
Add, in a private Pillar file, replacing
PASSWORD
with a strong password andUSERNAME
with a recognizable username (for example, the lowercase first initial and family name of the person, likejdoe
):postgres: users: # me@example.com USERNAME: password: "PASSWORD"
Assign the user to groups. For example, the
kingfisher-main
target has the groups:- kingfisher_process_read
SELECT
on all tables in schemapublic
- kingfisher_summarize_read
SELECT
on all tables in schema created by Kingfisher Summarize
postgres: users: # me@example.com USERNAME: password: "PASSWORD" groups: - kingfisher_process_read - kingfisher_summarize_read
Update a password#
Update the private Pillar file, for example:
postgres: users: # me@example.com USERNAME: password: "PASSWORD"
Notify the contact at the email address in the comment
Delete a user#
Delete the user from the private Pillar file.
SSH into the main server as the
root
user.Attempt to drop the user as the
postgres
user, for example:su - postgres -c 'psql kingfisher_process -c "DROP ROLE role2drop;"'
If you see a message like:
ERROR: role "role2drop" cannot be dropped because some objects depend on it DETAIL: privileges for table … … and 1234 other objects (see server log for list)
Open the server log, and search for the relevant
DROP ROLE
statement (after running the command below, press/
, typeDROP ROLE
, press Enter, and pressn
until you match the relevant statement):less /var/log/postgresql/postgresql-11-main.log
If all the objects listed after
DETAIL:
in the server log can be dropped (press Space to scroll forward), then pressq
to quitless
and open a SQL terminal as thepostgres
user:su - postgres -c 'psql kingfisher_process'
Finally, drop the user:
REASSIGN OWNED BY role2drop TO anotheruser; DROP OWNED BY role2drop; DROP ROLE role2drop;
Check privileges#
List users and groups:
\du
Find unexpected database CREATE
privileges:
SELECT usename, string_agg(datname, ', ' ORDER BY datname)
FROM pg_user
CROSS JOIN pg_database
WHERE
usename NOT IN ('postgres') AND
has_database_privilege(usename, datname, 'CREATE') AND
NOT (usename = 'kingfisher_summarize' AND datname = 'kingfisher_process')
GROUP BY usename
ORDER BY usename;
Find unexpected schema CREATE
privileges:
SELECT usename, string_agg(nspname, ', ' ORDER BY nspname)
FROM pg_user
CROSS JOIN pg_namespace
WHERE
usename NOT IN ('postgres') AND
has_schema_privilege(usename, nspname, 'CREATE') AND
NOT (usename = 'kingfisher_process' AND nspname = 'public') AND
NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'summary_%')
GROUP BY usename
ORDER BY usename;
Find unexpected schema USAGE
privileges:
SELECT usename, string_agg(nspname, ', ' ORDER BY nspname)
FROM pg_user
CROSS JOIN pg_namespace
WHERE
usename NOT IN ('postgres') AND
nspname NOT IN ('information_schema', 'pg_catalog', 'reference', 'summaries') AND
has_schema_privilege(usename, nspname, 'USAGE') AND
NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'summary_%') AND
NOT (pg_has_role(usename, 'kingfisher_process_read', 'MEMBER') AND nspname = 'public') AND
NOT (pg_has_role(usename, 'kingfisher_summarize_read', 'MEMBER') AND nspname LIKE 'summary_%')
GROUP BY usename
ORDER BY usename;
Find unexpected table non SELECT
privileges:
SELECT usename, nspname, string_agg(relname, ', ' ORDER BY relname)
FROM pg_user
CROSS JOIN pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE
usename NOT IN ('postgres') AND
nspname NOT IN ('pg_toast') AND
relname NOT IN ('pg_settings') AND
has_table_privilege(usename, c.oid, 'INSERT,UPDATE,DELETE,TRUNCATE,REFERENCES,TRIGGER') AND
NOT (usename = 'kingfisher_process' AND nspname = 'public') AND
NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'summary_%')
GROUP BY usename, nspname
ORDER BY usename, nspname;
Reference: System Information Functions for functions like has_schema_privilege
Improve performance#
Tune settings#
#. SSH into a PostgreSQL server as the postgres
user.
- Run the postgresqltuner.pl
file:
/var/lib/postgresql/postgresqltuner.pl --ssd
Under “Configuration advice”, address “HIGH” and “MEDIUM” recommendations.
Note
pgBadger is also available.
Reference: Tuning Your PostgreSQL Server
Reference: Slow Query Questions
Check autovacuum statistics#
SELECT
nspname,
s.relname,
reltuples,
n_live_tup::real,
n_dead_tup::real,
TRUNC(n_dead_tup / GREATEST(reltuples::numeric, 1) * 100, 2) AS percent,
last_autovacuum,
last_autoanalyze
FROM pg_stat_all_tables s
JOIN pg_class c ON relid = c.oid
JOIN pg_namespace ON relnamespace = pg_namespace.oid
ORDER BY percent DESC, last_autovacuum;
See the pg_stat_all_tables table’s documentation.
To get the table related to a pg_toast_*
table, take the number after pg_toast_
, and run, for example:
SELECT '16712'::regclass;
Check usage#
Explore database#
List databases:
\l
List schemas:
\dn
List tables, views and sequences in the public
schema:
\d
List tables, indexes, views and sequences in the public
schema:
\dtivs
To list tables, views and/or sequences in a specific schema, append, for example, reference.*
– or append *.*
for all schema.
You can use the psql
command’s -E
(--echo-hidden
) flag to echo the queries generated by the backslash commands.
Check drive usage#
Get all database sizes:
\l+
Get all schema sizes:
SELECT
schema_name,
schema_size,
pg_size_pretty(schema_size),
TRUNC(schema_size::numeric / pg_database_size(current_database()) * 100, 2) AS percent
FROM (
SELECT
nspname AS schema_name,
SUM(pg_relation_size(c.oid))::bigint AS schema_size
FROM pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
GROUP BY schema_name
) t
ORDER BY schema_size DESC;
Get relation sizes in the public
schema:
\dtis+
To get relation sizes in a specific schema, append, for example, reference.*
– or append *.*
for all schema.
See the Database Object Size Functions documentation.
Show running queries#
Show running queries:
SELECT pid, client_addr, usename, state, wait_event_type, NOW() - query_start AS time, query
FROM pg_stat_activity
WHERE query <> ''
ORDER BY time DESC;
Stop a query, replacing PID
with the query’s pid
:
SELECT pg_cancel_backend(PID)
See the pg_stat_activity table’s documentation.
Find unexpected schema:
SELECT nspname
FROM pg_namespace
WHERE
nspname NOT LIKE 'pg_temp_%' AND
nspname NOT LIKE 'pg_toast_temp_%' AND
nspname NOT LIKE 'summary_%' AND
nspname NOT IN (
'information_schema',
'pg_catalog',
'pg_toast',
'public',
'reference',
'summaries'
);
Find unexpected tables in the public schema:
SELECT relname
FROM pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE
nspname = 'public' AND
-- Ignore sequences and indices
relkind NOT IN ('S', 'i') AND
relname NOT IN (
-- Kingfisher Process tables
'collection',
'collection_file',
'collection_file_item',
'collection_note',
'compiled_release',
'data',
'package_data',
'record',
'record_check',
'release',
'release_check',
-- To be removed in future versions
'alembic_version',
'record_check_error',
'release_check_error',
'transform_upgrade_1_0_to_1_1_status_record',
'transform_upgrade_1_0_to_1_1_status_release',
-- https://www.postgresql.org/docs/current/pgstatstatements.html
'pg_stat_statements',
-- https://www.postgresql.org/docs/current/tablefunc.html
'tablefunc_crosstab_2',
'tablefunc_crosstab_3',
'tablefunc_crosstab_4'
);
Restore from backup#
PostgreSQL databases are backed up offsite. Backup and restoration are managed by pgBackRest. These are the main commands for working with pgbackrest.
Note
For more information on setting up backups, see Set up backups.
The stanza name is defined in pillar postgres:backup:stanza
.
You can also find it in the pgbackrest config /etc/pgbackrest/pgbackrest.conf
.
View current backups:
pgbackrest info --stanza=example
Restore from backup:
pgbackrest restore --stanza=example --delta
Restore specific backup by timestamp:
pgbackrest restore --stanza=example --set=20210315-145357F_20210315-145459I --delta
The --delta
flag saves time when restoring by checking file hashes and only restoring the files it needs to.
If you want to restore every file from the backup, for example if you are restoring to a new server, it may be quicker to not use deltas.
You can run a full restore following this process:
rm -rf /var/lib/postgresql/11/main
mkdir /var/lib/postgresql/11/main
pgbackrest restore --stanza=example
Recover the replica#
If replication breaks or the replica server goes offline, you must recover the replica, in two stages: mitigate the downtime, and fix the replication.
Mitigate downtime#
Enable public access to the PostgreSQL service on the main server, by modifying its Pillar file:
postgres: public_access: True
For example, for the
kingfisher-main
target, modify thepillar/kingfisher_main.sls
file.Update DNS records:
Login to GoDaddy
If access was delegated, open Delegate Access and click the Access Now button
Open DNS Management for open-contracting.org
Update the replica’s CNAME record to point to the main server’s A record: for example, point
postgres-readonly
toprocess1.kingfisher
Wait for the changes to propagate, which depends on the original TTL value
Fix replication#
SSH into the replica server as the
root
user.Stop PostgreSQL if it is running:
systemctl stop postgres.service
Download the latest database or a backup from a specific point in time
In this example I’m restoring
kingfisher
, to restore a different instance, replacekingfisher
with the value set in pillarpostgres:backup:stanza
. pgbackrest is pre-configured to restore the replication configuration (/var/lib/postgresql/11/main/postgresql.conf
).pgbackrest --stanza=kingfisher --type=standby --delta restore
Note
See Restore from backup for more information on the pgbackrest restore function.
Start PostgreSQL and monitor:
systemctl start postgres.service tail -f /var/log/postgresql/postgresql-11-main.log
You should see messages about recovering from WAL files in the logs.
If all else fails, you can fallback to rebuilding the replica. See Set up replication.