Reference

Monitoring

See also

Google Workspace

Prometheus

Servers are monitored by Prometheus. Salt is used to:

  • Install a Node Exporter service on each server, to export hardware and OS metrics such as disk space used and memory used

  • Set up a Prometheus server to collect metrics from all servers, and to email alerts if metrics are out of bounds

Read the user guide to learn how to use Prometheus.
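As an illustration of the kind of metric Node Exporter exports, the following is a minimal sketch that parses a snippet of the Prometheus text exposition format and computes the used fraction of a filesystem. The metric names node_filesystem_size_bytes and node_filesystem_avail_bytes are standard Node Exporter metrics; the function names and the sample values are hypothetical, and this is not how the Prometheus server itself evaluates alerts.

```python
import re

# One sample line of the text exposition format, e.g.
#   node_filesystem_size_bytes{mountpoint="/"} 1000
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Hypothetical sample values, for illustration only.
EXAMPLE_METRICS = """\
# TYPE node_filesystem_size_bytes gauge
node_filesystem_size_bytes{device="/dev/sda1",fstype="ext4",mountpoint="/"} 1000
node_filesystem_avail_bytes{device="/dev/sda1",fstype="ext4",mountpoint="/"} 250
node_filesystem_size_bytes{device="/dev/sdb1",fstype="ext4",mountpoint="/data"} 2e+03
node_filesystem_avail_bytes{device="/dev/sdb1",fstype="ext4",mountpoint="/data"} 2e+03
"""


def parse_samples(text):
    """Yield (metric name, labels dict, float value) for each sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_LINE.match(line)
        if match:
            labels = dict(LABEL.findall(match.group("labels") or ""))
            yield match.group("name"), labels, float(match.group("value"))


def disk_used_fraction(text, mountpoint="/"):
    """Return the used fraction of a filesystem, or None if metrics are missing."""
    values = {}
    for name, labels, value in parse_samples(text):
        if labels.get("mountpoint") == mountpoint:
            values[name] = value
    size = values.get("node_filesystem_size_bytes")
    avail = values.get("node_filesystem_avail_bytes")
    if not size or avail is None:
        return None
    return 1 - avail / size


print(disk_used_fraction(EXAMPLE_METRICS))  # 0.75
```

In practice, the Prometheus server scrapes these metrics and evaluates alert rules itself; a script like this is only useful for spot-checking a single server's output.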

Sentry

Application errors are reported to Sentry, which notifies individual email addresses. All Salt-managed, OCP-authored services report errors to Sentry. See the Software Development Handbook for access to Sentry.

Tip

To find infrequent events, filter out the frequent ones from the All Events tab of an issue:

  1. Click the … button in the TITLE column

  2. Click the Exclude from filter menu item

  3. If needed, replace the end of the title with the wildcard character *

You can also type a negated key like !message:, and Sentry will display autocomplete options.
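The negated search terms described above can also be composed by hand. The following is a minimal sketch that builds such terms as strings to paste into the search box. The helper name exclude_term is hypothetical, and the sketch assumes that Sentry accepts a quoted value with a trailing wildcard after a negated key.

```python
def exclude_term(key, value, wildcard=False):
    """Build a negated search term such as !message:"Connection reset*".

    If wildcard is true, value is treated as a prefix and * is appended,
    mirroring step 3 above (replacing the end of the title with *).
    """
    if '"' in value:
        # Escaping rules for embedded quotes are not covered by this sketch.
        raise ValueError("values containing double quotes are not supported")
    if wildcard:
        value += "*"
    return f'!{key}:"{value}"'


def exclude_query(key, prefixes):
    """Join several negated prefix terms into one search query."""
    return " ".join(exclude_term(key, prefix, wildcard=True) for prefix in prefixes)


print(exclude_query("message", ["TimeoutError: job ", "Connection reset"]))
```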

SecurityScorecard

Cybersecurity issues are monitored by SecurityScorecard. Patching cadence issues are mostly false positives. To dismiss such issues:

  1. Check the checkboxes in the table

  2. Click the Other resolutions dropdown

  3. Click the I cannot reproduce this issue and I think it’s incorrect item

  4. Add the comment: The software is patched/backported.

  5. Click the Submit button

Hosting

All servers (not services) are managed by Dogsbody Technology (sysadmin@dogsbody.com). Servers are hosted by:

  • Hetzner for hardware servers (Network status)

  • Linode for VPS servers

    • Network status: The relevant systems are EU-West (London) under Regions, and EU-West (London) Backups under Backups.

    • Access: The ‘opencontractingpartnership’ and ‘opencontracting-dogsbody’ users have full access.

    • Backups: Linode is configured to keep one daily backup and two weekly backups.

Unmanaged services are:

Administrative access

See also

Software Development Handbook, for access to third-party services

The staff of the following organizations have had administrative roles:

Root access

Server owners (OCP) and server managers (Dogsbody for Linux, RBC for Windows) should have root access. Otherwise, only developers who are reasonably expected to deploy to a development server should have root access to that server; anyone with root access can grant that developer root access.

Root access should be routinely reviewed. If a developer did not deploy (and was not granted root access) to a server within the last six months, their root access to that server should be revoked.
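The six-month rule can be expressed as a small check. The following is a minimal sketch, assuming the reviewer knows each developer's last deploy date and the date on which root access was last granted. The function names are hypothetical, "six months" is interpreted as six calendar months, and a developer with no recorded activity is treated as due for revocation.

```python
import calendar
from datetime import date


def months_before(day, months):
    """Return the date `months` calendar months before `day`.

    The day of month is clamped, so 31 August minus 6 months is the
    last day of February.
    """
    index = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def should_revoke(last_deploy, access_granted, today, months=6):
    """Return True if neither a deploy nor a grant occurred since the cutoff.

    last_deploy and access_granted are dates, or None if there is no record.
    """
    cutoff = months_before(today, months)
    events = [d for d in (last_deploy, access_granted) if d is not None]
    # Assumption: no recorded activity at all also means revoke.
    return all(d < cutoff for d in events)


today = date(2024, 7, 15)
print(should_revoke(date(2023, 11, 2), None, today))               # True
print(should_revoke(date(2023, 11, 2), date(2024, 3, 1), today))   # False
```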

The ssh.root lists in Pillar files and the ssh.admin list in the pillar/common.sls file give people access to servers. All people should belong to the above organizations.

Redash

There should be at least two admin members, and all admin members should be from OCP. Each user should belong to a single group. Non-admin staff of OCP should belong to the unrestricted group.
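These membership rules can be checked mechanically. The following is a minimal sketch, assuming a list of user records that has already been exported from Redash. The record fields (name, organization, groups), the function name, and the group name admin are assumptions made for illustration; only the unrestricted group name comes from the text above.

```python
def redash_policy_violations(users):
    """Return a list of human-readable violations of the rules above.

    Each user is a dict with hypothetical keys:
    "name" (str), "organization" (str) and "groups" (list of str).
    """
    violations = []

    admins = [user for user in users if "admin" in user["groups"]]
    if len(admins) < 2:
        violations.append("fewer than two admin members")
    for user in admins:
        if user["organization"] != "OCP":
            violations.append(f"{user['name']}: admin is not from OCP")

    for user in users:
        groups = user["groups"]
        if len(groups) != 1:
            violations.append(f"{user['name']}: belongs to {len(groups)} groups")
        if (
            user["organization"] == "OCP"
            and "admin" not in groups
            and "unrestricted" not in groups
        ):
            violations.append(f"{user['name']}: OCP staff not in unrestricted group")

    return violations


# Hypothetical records, for illustration only.
example_users = [
    {"name": "A", "organization": "OCP", "groups": ["admin"]},
    {"name": "B", "organization": "OCP", "groups": ["admin"]},
    {"name": "C", "organization": "OCP", "groups": ["unrestricted"]},
]
print(redash_policy_violations(example_users))  # []
```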