Maintain PostgreSQL

Troubleshoot

Check the log file, /var/log/postgresql/postgresql-11-main.log, if debugging an unscheduled restart of the postgres service, for example.

Find slow queries

Use the pg_stat_statements extension. For example:

SELECT
    usename,
    substring(query, 1, 80) AS short_query,
    round(total_time::numeric, 2) AS total_time,
    calls,
    round(mean_time::numeric, 2) AS mean,
    round((100 * total_time /
    sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
FROM pg_stat_statements s
INNER JOIN pg_user u ON s.userid = u.usesysid
ORDER BY total_time DESC
LIMIT 20;

To display the full query, you might prefer to switch to unaligned output mode (\a):

SELECT
    usename,
    query,
    round(total_time::numeric, 2) AS total_time,
    calls,
    round(mean_time::numeric, 2) AS mean,
    round((100 * total_time /
    sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
FROM pg_stat_statements s
INNER JOIN pg_user u ON s.userid = u.usesysid
ORDER BY total_time DESC
LIMIT 20;

To reset the statistics:

SELECT pg_stat_statements_reset();

Reference: Tracking down slow queries in PostgreSQL

Control access

Each individual should have a personal account, and each service should have a service account.

Add a user

  1. Add, in a private Pillar file, replacing PASSWORD with a strong password and USERNAME with a recognizable username (for example, the lowercase first initial and family name of the person, like jdoe):

    postgres:
      users:
        # me@example.com
        USERNAME:
          password: "PASSWORD"
    
  2. Assign the user to groups. For example, the kingfisher-main target has the groups:

    kingfisher_process_read

    SELECT on all tables in schema public

    kingfisher_summarize_read

    SELECT on all tables in schema created by Kingfisher Summarize

    postgres:
      users:
        # me@example.com
        USERNAME:
          password: "PASSWORD"
          groups:
            - kingfisher_process_read
            - kingfisher_summarize_read
    
  3. Deploy the service

Update a password

  1. Update the private Pillar file, for example:

    postgres:
      users:
        # me@example.com
        USERNAME:
          password: "PASSWORD"
    
  2. Deploy the service

  3. Notify the contact at the email address in the comment

Delete a user

  1. Delete the user from the private Pillar file.

  2. SSH into the main server as the root user.

  3. Attempt to drop the user as the postgres user, for example:

    su - postgres -c 'psql kingfisher_process -c "DROP ROLE role2drop;"'
    
  4. If you see a message like:

    ERROR:  role "role2drop" cannot be dropped because some objects depend on it
    DETAIL:  privileges for table …
    …
    and 1234 other objects (see server log for list)
    
  5. Open the server log, and search for the relevant DROP ROLE statement (after running the command below, press /, type DROP ROLE, press Enter, and press n until you match the relevant statement):

    less /var/log/postgresql/postgresql-11-main.log
    
  6. If all the objects listed after DETAIL: in the server log can be dropped (press Space to scroll forward), then press q to quit less and open a SQL terminal as the postgres user:

    su - postgres -c 'psql kingfisher_process'
    
  7. Finally, drop the user:

    REASSIGN OWNED BY role2drop TO anotheruser;
    DROP OWNED BY role2drop;
    DROP ROLE role2drop;
    

Check privileges

List users and groups:

\du

Find unexpected database CREATE privileges:

SELECT usename, string_agg(datname, ', ' ORDER BY datname)
FROM pg_user
CROSS JOIN pg_database
WHERE
    usename NOT IN ('postgres') AND
    has_database_privilege(usename, datname, 'CREATE') AND
    NOT (usename = 'kingfisher_summarize' AND datname = 'kingfisher_process')
GROUP BY usename
ORDER BY usename;

Find unexpected schema CREATE privileges:

SELECT usename, string_agg(nspname, ', ' ORDER BY nspname)
FROM pg_user
CROSS JOIN pg_namespace
WHERE
    usename NOT IN ('postgres') AND
    has_schema_privilege(usename, nspname, 'CREATE') AND
    NOT (usename = 'kingfisher_process' AND nspname = 'public') AND
    NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'summary_%')
GROUP BY usename
ORDER BY usename;

Find unexpected schema USAGE privileges:

SELECT usename, string_agg(nspname, ', ' ORDER BY nspname)
FROM pg_user
CROSS JOIN pg_namespace
WHERE
    usename NOT IN ('postgres') AND
    nspname NOT IN ('information_schema', 'pg_catalog', 'reference', 'summaries') AND
    has_schema_privilege(usename, nspname, 'USAGE') AND
    NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'summary_%') AND
    NOT (pg_has_role(usename, 'kingfisher_process_read', 'MEMBER') AND nspname = 'public') AND
    NOT (pg_has_role(usename, 'kingfisher_summarize_read', 'MEMBER') AND nspname LIKE 'summary_%')
GROUP BY usename
ORDER BY usename;

Find unexpected table non SELECT privileges:

SELECT usename, nspname, string_agg(relname, ', ' ORDER BY relname)
FROM pg_user
CROSS JOIN pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE
    usename NOT IN ('postgres') AND
    nspname NOT IN ('pg_toast') AND
    relname NOT IN ('pg_settings') AND
    has_table_privilege(usename, c.oid, 'INSERT,UPDATE,DELETE,TRUNCATE,REFERENCES,TRIGGER') AND
    NOT (usename = 'kingfisher_process' AND nspname = 'public') AND
    NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'summary_%')
GROUP BY usename, nspname
ORDER BY usename, nspname;

Reference: System Information Functions for functions like has_schema_privilege

Improve performance

Tune settings

#. SSH into a PostgreSQL server as the postgres user. - Run the postgresqltuner.pl file:

/var/lib/postgresql/postgresqltuner.pl --ssd

Under “Configuration advice”, address “HIGH” and “MEDIUM” recommendations.

Note

pgBadger is also available.

Reference: Tuning Your PostgreSQL Server

Reference: Slow Query Questions

Check autovacuum statistics

SELECT
    nspname,
    s.relname,
    reltuples,
    n_live_tup::real,
    n_dead_tup::real,
    TRUNC(n_dead_tup / GREATEST(reltuples::numeric, 1) * 100, 2) AS percent,
    last_autovacuum,
    last_autoanalyze
FROM pg_stat_all_tables s
JOIN pg_class c ON relid = c.oid
JOIN pg_namespace ON relnamespace = pg_namespace.oid
ORDER BY percent DESC, last_autovacuum;

See the pg_stat_all_tables table’s documentation.

To get the table related to a pg_toast_* table, take the number after pg_toast_, and run, for example:

SELECT '16712'::regclass;

Check usage

Explore database

List databases:

\l

List schemas:

\dn

List tables, views and sequences in the public schema:

\d

List tables, indexes, views and sequences in the public schema:

\dtivs

To list tables, views and/or sequences in a specific schema, append, for example, reference.* – or append *.* for all schema.

You can use the psql command’s -E (--echo-hidden) flag to echo the queries generated by the backslash commands.

Check drive usage

Get all database sizes:

\l+

Get all schema sizes:

SELECT
    schema_name,
    schema_size,
    pg_size_pretty(schema_size),
    TRUNC(schema_size::numeric / pg_database_size(current_database()) * 100, 2) AS percent
FROM (
    SELECT
        nspname AS schema_name,
        SUM(pg_relation_size(c.oid))::bigint AS schema_size
    FROM pg_class c
    JOIN pg_namespace n ON c.relnamespace = n.oid
    GROUP BY schema_name
) t
ORDER BY schema_size DESC;

Get relation sizes in the public schema:

\dtis+

To get relation sizes in a specific schema, append, for example, reference.* – or append *.* for all schema.

See the Database Object Size Functions documentation.

Show running queries

Show running queries:

SELECT pid, client_addr, usename, state, wait_event_type, NOW() - query_start AS time, query
FROM pg_stat_activity
WHERE query <> ''
ORDER BY time DESC;

Stop a query, replacing PID with the query’s pid:

SELECT pg_cancel_backend(PID)

See the pg_stat_activity table’s documentation.

Find unexpected schema:

SELECT nspname
FROM pg_namespace
WHERE
    nspname NOT LIKE 'pg_temp_%' AND
    nspname NOT LIKE 'pg_toast_temp_%' AND
    nspname NOT LIKE 'summary_%' AND
    nspname NOT IN (
        'information_schema',
        'pg_catalog',
        'pg_toast',
        'public',
        'reference',
        'summaries'
    );

Find unexpected tables in the public schema:

SELECT relname
FROM pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE
    nspname = 'public' AND
    -- Ignore sequences and indices
    relkind NOT IN ('S', 'i') AND
    relname NOT IN (
        -- Kingfisher Process tables
        'collection',
        'collection_file',
        'collection_file_item',
        'collection_note',
        'compiled_release',
        'data',
        'package_data',
        'record',
        'record_check',
        'release',
        'release_check',
        -- To be removed in future versions
        'alembic_version',
        'record_check_error',
        'release_check_error',
        'transform_upgrade_1_0_to_1_1_status_record',
        'transform_upgrade_1_0_to_1_1_status_release',
        -- https://www.postgresql.org/docs/current/pgstatstatements.html
        'pg_stat_statements',
        -- https://www.postgresql.org/docs/current/tablefunc.html
        'tablefunc_crosstab_2',
        'tablefunc_crosstab_3',
        'tablefunc_crosstab_4'
    );

Restore from backup

PostgreSQL databases are backed up offsite. Backup and restoration are managed by pgBackRest. These are the main commands for working with pgbackrest.

Note

For more information on setting up backups, see Set up backups.

The stanza name is defined in pillar postgres:backup:stanza. You can also find it in the pgbackrest config /etc/pgbackrest/pgbackrest.conf.

View current backups:

pgbackrest info --stanza=example

Restore from backup:

pgbackrest restore --stanza=example --delta

Restore specific backup by timestamp:

pgbackrest restore --stanza=example --set=20210315-145357F_20210315-145459I --delta

The --delta flag saves time when restoring by checking file hashes and only restoring the files it needs to. If you want to restore every file from the backup, for example if you are restoring to a new server, it may be quicker to not use deltas.

You can run a full restore following this process:

rm -rf /var/lib/postgresql/11/main
mkdir /var/lib/postgresql/11/main
pgbackrest restore --stanza=example

Recover the replica

If replication breaks or the replica server goes offline, you must recover the replica, in two stages: mitigate the downtime, and fix the replication.

Mitigate downtime

  1. Enable public access to the PostgreSQL service on the main server, by modifying its Pillar file:

    postgres:
      public_access: True
    

    For example, for the kingfisher-main target, modify the pillar/kingfisher_main.sls file.

  2. Deploy the main server

  3. Update DNS records:

    1. Login to GoDaddy

    2. If access was delegated, open Delegate Access and click the Access Now button

    3. Open DNS Management for open-contracting.org

    4. Update the replica’s CNAME record to point to the main server’s A record: for example, point postgres-readonly to process1.kingfisher

    5. Wait for the changes to propagate, which depends on the original TTL value

Fix replication

  1. SSH into the replica server as the root user.

  2. Stop PostgreSQL if it is running:

    systemctl stop postgres.service
    
  3. Download the latest database or a backup from a specific point in time

    In this example I’m restoring kingfisher, to restore a different instance, replace kingfisher with the value set in pillar postgres:backup:stanza. pgbackrest is pre-configured to restore the replication configuration (/var/lib/postgresql/11/main/postgresql.conf).

    pgbackrest --stanza=kingfisher --type=standby --delta restore
    

    Note

    See Restore from backup for more information on the pgbackrest restore function.

  4. Start PostgreSQL and monitor:

    systemctl start postgres.service
    tail -f /var/log/postgresql/postgresql-11-main.log
    

    You should see messages about recovering from WAL files in the logs.

If all else fails, you can fallback to rebuilding the replica. See Set up replication.