Monitoring

Delivery Checks

The Remote Settings ecosystem can be monitored from the Delivery Checks dashboard.

Each environment has its own set of checks. Generally speaking, if the checks pass, the service is operating without issues.

Note

This is an instance of Telescope, a generic health check service that you can use for your services!

Server Metrics

Servers send live metrics which are visible in Grafana.

We have a remote-settings folder with the main dashboards.

Server Logs

Server logs are available in the Google Cloud Console Logs Explorer.

Writer Instances

The following filter shows Nginx logs combined with application logs:

resource.type="k8s_container"
labels."k8s-pod/app_kubernetes_io/component"="writer"

To filter out request summaries, and see application logs only:

jsonPayload.Type!="request.summary"

To select specific status codes, for example errors (4xx and 5xx):

jsonPayload.Fields.code=~"^(4|5)\d{2,2}$"
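
These filters can be combined: the Logging query language joins expressions placed on separate lines with an implicit AND. As a sketch that reuses only the filters above, the following should narrow the results to writer log entries whose status code is 4xx or 5xx:

resource.type="k8s_container"
labels."k8s-pod/app_kubernetes_io/component"="writer"
jsonPayload.Fields.code=~"^(4|5)\d{2,2}$"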

Reader Instances

labels."k8s-pod/app_kubernetes_io/component"="reader"

Cronjobs / Lambdas

Filter on labels."k8s-pod/app_kubernetes_io/component" with one of the following values:

  • cron-backport-records

  • cron-backport-records-normandy

  • cron-cookie-banner-rules-list

  • cron-refresh-signature

  • cron-remote-settings-mdn-browser-compat-data

  • cron-sync-megaphone
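
For example, assuming the cronjobs use the same resource.type as the writer and reader instances (this is an assumption, not confirmed above), the logs of the signature refresh job could be selected with:

resource.type="k8s_container"
labels."k8s-pod/app_kubernetes_io/component"="cron-refresh-signature"

Several jobs can be matched at once by grouping values with OR, for instance both backport jobs:

labels."k8s-pod/app_kubernetes_io/component"=("cron-backport-records" OR "cron-backport-records-normandy")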

Attachments CDN Logs

httpRequest.requestUrl =~ "attachments"

Clients Telemetry

Clients send us uptake statuses, which we can query and graph over time in Redash.

Redash Queries

Note

Most queries filter on the last X hours with WHERE timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {{X}} HOUR), but it is also possible to query a specific time window with:

WHERE timestamp > timestamp '2023-10-24 06:00:00'
  AND timestamp < timestamp '2023-10-24 22:00:00'

Note

These queries may require permissions. Don’t hesitate to request access on Slack in #delivery.

Telescope Check Queries

These queries can be used as models when troubleshooting with Redash: