Maintain PostgreSQL

Troubleshoot

Check the log file, /var/log/postgresql/postgresql-11-main.log, if debugging an unscheduled restart of the postgres service, for example.

Control access

Each individual should have a personal account, and each service should have a service account.

Add a user

  1. Add, in a private Pillar file, replacing PASSWORD with a strong password and USERNAME with a recognizable username (for example, the lowercase first initial and family name of the person, like jdoe):

    postgres:
      users:
        # me@example.com
        USERNAME:
          password: "PASSWORD"
    
  2. Assign the user to groups. For example, the kingfisher-process target has the groups:

    kingfisher_process_read

    SELECT on all tables in schema public

    kingfisher_summarize_read

    SELECT on all tables in schema created by Kingfisher Summarize

    postgres:
      users:
        # me@example.com
        USERNAME:
          password: "PASSWORD"
          groups:
            - kingfisher_process_read
            - kingfisher_summarize_read
    
  3. Deploy the service

Update password

  1. Update the private Pillar file, for example:

    postgres:
      users:
        # me@example.com
        USERNAME:
          password: "PASSWORD"
    
  2. Deploy the service

  3. Notify the contact at the email address in the comment

Delete a user

  1. Delete the user from the private Pillar file

  2. Connect to the server as the root user, for example:

    curl --silent --connect-timeout 1 process.kingfisher.open-contracting.org:8255 || true
    ssh root@process.kingfisher.open-contracting.org
    
  3. Attempt to drop the user as the postgres user, for example:

    su - postgres -c 'psql ocdskingfisherprocess -c "DROP ROLE ocdskfpguest;"'
    
  4. If you see a message like:

    ERROR:  role "ocdskfpguest" cannot be dropped because some objects depend on it
    DETAIL:  privileges for table …
    …
    and 1234 other objects (see server log for list)
    
  5. Open the server log, and search for the relevant DROP ROLE statement (after running the command below, press /, type DROP ROLE, press Enter, and press n until you match the relevant statement):

    less /var/log/postgresql/postgresql-11-main.log
    
  6. If all the objects listed after DETAIL: in the server log can be dropped (press Space to scroll forward), then press q to quit less and open a SQL terminal as the postgres user:

    su - postgres -c 'psql ocdskingfisherprocess'
    
  7. Finally, drop the user:

    REASSIGN OWNED BY ocdskfpguest TO anotheruser;
    DROP OWNED BY ocdskfpguest;
    DROP ROLE ocdskfpguest;
    

Check privileges

List users and groups:

\du

Find unexpected database CREATE privileges:

SELECT usename, string_agg(datname, ', ' ORDER BY datname)
FROM pg_user
CROSS JOIN pg_database
WHERE
    usename NOT IN ('postgres') AND
    has_database_privilege(usename, datname, 'CREATE') AND
    NOT (usename = 'kingfisher_summarize' AND datname = 'ocdskingfisherprocess')
GROUP BY usename
ORDER BY usename;

Find unexpected schema CREATE privileges:

SELECT usename, string_agg(nspname, ', ' ORDER BY nspname)
FROM pg_user
CROSS JOIN pg_namespace
WHERE
    usename NOT IN ('postgres') AND
    has_schema_privilege(usename, nspname, 'CREATE') AND
    NOT (usename = 'kingfisher_process' AND nspname = 'public') AND
    NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'view_data_%')
GROUP BY usename
ORDER BY usename;

Find unexpected schema USAGE privileges:

SELECT usename, string_agg(nspname, ', ' ORDER BY nspname)
FROM pg_user
CROSS JOIN pg_namespace
WHERE
    usename NOT IN ('postgres') AND
    nspname NOT IN ('information_schema', 'pg_catalog', 'reference') AND
    has_schema_privilege(usename, nspname, 'USAGE') AND
    NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'view_data_%') AND
    NOT (pg_has_role(usename, 'kingfisher_process_read', 'MEMBER') AND nspname = 'public') AND
    NOT (pg_has_role(usename, 'kingfisher_summarize_read', 'MEMBER') AND nspname LIKE 'view_data_%')
GROUP BY usename
ORDER BY usename;

Find unexpected table non SELECT privileges:

SELECT usename, nspname, string_agg(relname, ', ' ORDER BY relname)
FROM pg_user
CROSS JOIN pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE
    usename NOT IN ('postgres') AND
    nspname NOT IN ('pg_toast') AND
    relname NOT IN ('pg_settings') AND
    has_table_privilege(usename, c.oid, 'INSERT,UPDATE,DELETE,TRUNCATE,REFERENCES,TRIGGER') AND
    NOT (usename = 'kingfisher_process' AND nspname = 'public') AND
    NOT (usename = 'kingfisher_summarize' AND nspname LIKE 'view_data_%')
GROUP BY usename, nspname
ORDER BY usename, nspname;

Reference: System Information Functions for functions like has_schema_privilege

Improve performance

Tune settings

  • Connect to the server

  • Change to the postgres user:

    su - postgres
    
  • Download the postgresqltuner.sql file (if not available):

    curl -O https://raw.githubusercontent.com/jfcoz/postgresqltuner/master/postgresqltuner.pl
    
  • Make the postgresqltuner.sql file executable:

    chmod ug+x postgresqltuner.pl
    
  • Run the postgresqltuner.sql file:

    ./postgresqltuner.sql --ssd
    

Under “Configuration advice”, address “HIGH” and “MEDIUM” recommendations.

Reference: Tuning Your PostgreSQL Server

Reference: Slow Query Questions

Check autovacuum statistics

SELECT
    nspname,
    s.relname,
    reltuples,
    n_live_tup::real,
    n_dead_tup::real,
    TRUNC(n_dead_tup / GREATEST(reltuples::numeric, 1) * 100, 2) AS percent,
    last_autovacuum,
    last_autoanalyze
FROM pg_stat_all_tables s
JOIN pg_class c ON relid = c.oid
JOIN pg_namespace ON relnamespace = pg_namespace.oid
ORDER BY percent DESC, last_autovacuum;

See the pg_stat_all_tables table’s documentation.

To get the table related to a pg_toast_* table, take the number after pg_toast_, and run, for example:

SELECT '16712'::regclass;

Check usage

Explore database

List databases:

\l

List schemas:

\dn

List tables, views and sequences in the public schema:

\d

List tables, indexes, views and sequences in the public schema:

\dtivs

To list tables, views and/or sequences in a specific schema, append, for example, views.* – or append *.* for all schema.

You can use the psql command’s -E (--echo-hidden) flag to echo the queries generated by the backslash commands.

Check disk usage

Get all database sizes:

\l+

Get all schema sizes:

SELECT
    schema_name,
    schema_size,
    pg_size_pretty(schema_size),
    TRUNC(schema_size::numeric / pg_database_size(current_database()) * 100, 2) AS percent
FROM (
    SELECT
        nspname AS schema_name,
        SUM(pg_relation_size(c.oid))::bigint AS schema_size
    FROM pg_class c
    JOIN pg_namespace n ON c.relnamespace = n.oid
    GROUP BY schema_name
) t
ORDER BY schema_size DESC;

Get relation sizes in the public schema:

\dtis+

To get relation sizes in a specific schema, append, for example, views.* – or append *.* for all schema.

See the Database Object Size Functions documentation.

Show running queries

Show running queries:

SELECT pid, client_addr, usename, state, wait_event_type, NOW() - query_start AS time, query
FROM pg_stat_activity
WHERE query <> ''
ORDER BY time DESC;

See the pg_stat_activity table’s documentation.

Recover the replica

If replication breaks or the replica server goes offline, you must recover the replica, in two stages: mitigate the downtime, and fix the replication.

Mitigate downtime

  1. Enable public access to the PostgreSQL service on the main server, by modifying its Pillar file:

    postgres:
      public_access: True
    

    For example, for the kingfisher-process target, modify the pillar/kingfisher.sls file.

  2. Deploy the main server

  3. Update DNS records:

    1. Login to GoDaddy
    2. If access was delegated, open Delegate Access and click the Access Now button
    3. Open DNS Management for open-contracting.org
    4. Update the replica’s CNAME record to point to the main server’s A record: for example, point postgres-readonly to kingfisher-process1
    5. Wait for the changes to propagate, which depends on the original TTL value

Fix replication

  1. Copy WAL archives from the main server to the replica server, replacing example.open-contracting.org below with the main server’s hostname:

    Note

    The postgres user on the replica server must have an SSH key pair, and its public key must be an authorized key of the postgres user on the main server. See Create SSH keys for replica recovery.

    service postgres stop
    sudo su - postgres
    timeout 1 ssh postgres@example.open-contracting.org -p 8255
    rsync -azv postgres@example.open-contracting.org:/var/lib/postgresql/11/main/archive/ /var/lib/postgresql/11/main/archive/
    exit
    service postgres start
    
  2. Monitor the replica logs. You should see messages about recovery from WAL files.

    tail -f /var/log/postgresql/postgresql-11-main.log
    

If all else fails, you can fallback to rebuilding the replica. See Set up replication.