Monitoring¶
Delivery Checks¶
The Remote Settings ecosystem can be monitored from the Delivery Checks dashboard.
Each environment has its own set of checks. Generally speaking, if they pass, the service is operating without issues.
Note
This is an instance of Telescope, a generic health check service that you can use for your services!
Server Metrics¶
Servers send live metrics which are visible in Grafana.
We have a remote-settings folder with the main dashboards.
Server Logs¶
Server logs are available in the Google Cloud Console Logs Explorer.
Writer Instances¶
The following filter shows Nginx logs combined with application logs:
resource.type="k8s_container"
labels."k8s-pod/app_kubernetes_io/component"="writer"
To filter out request summaries, and see application logs only:
jsonPayload.Type!="request.summary"
Specific status codes, for example errors:
jsonPayload.Fields.code=~"^(4|5)\d{2,2}$"
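The filters above can be combined into a single query. The following is a minimal sketch, not an official tool: `build_log_filter` and its parameters are hypothetical names introduced for illustration, and it assumes that comparisons placed on separate lines are implicitly ANDed by the Logs Explorer query language.

```python
# Hypothetical helper: composes the Logs Explorer filters shown above.
COMPONENT_LABEL = 'labels."k8s-pod/app_kubernetes_io/component"'


def build_log_filter(component, app_logs_only=False, status_regex=None):
    """Return a Logs Explorer filter for one component (e.g. "writer")."""
    clauses = [
        'resource.type="k8s_container"',
        f'{COMPONENT_LABEL}="{component}"',
    ]
    if app_logs_only:
        # Drop request summaries to keep application logs only.
        clauses.append('jsonPayload.Type!="request.summary"')
    if status_regex is not None:
        # Keep only entries whose status code matches the regular expression.
        clauses.append(f'jsonPayload.Fields.code=~"{status_regex}"')
    # One comparison per line; assumed to be ANDed implicitly.
    return "\n".join(clauses)


if __name__ == "__main__":
    # Writer errors (4xx and 5xx), ready to paste into the Logs Explorer.
    print(build_log_filter("writer", status_regex=r"^(4|5)\d{2,2}$"))
```

The same helper can be called with `"reader"` to build the equivalent filter for reader instances.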
Reader Instances¶
labels."k8s-pod/app_kubernetes_io/component"="reader"
Cronjobs / Lambdas¶
Filter labels."k8s-pod/app_kubernetes_io/component" with one of the following values:
cron-backport-records
cron-backport-records-normandy
cron-cookie-banner-rules-list
cron-refresh-signature
cron-remote-settings-mdn-browser-compat-data
cron-sync-megaphone
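To look at several cronjobs at once, the per-job comparisons can be joined with OR. This is a sketch under assumptions: `cron_filter` is a hypothetical helper, and it relies on the query language accepting a parenthesised group of comparisons separated by `OR`.

```python
# Hypothetical helper: builds one filter matching any of the given cronjobs.
COMPONENT_LABEL = 'labels."k8s-pod/app_kubernetes_io/component"'


def cron_filter(names):
    """Return a filter matching logs from any of the named cronjobs."""
    names = list(names)
    if not names:
        raise ValueError("at least one cronjob name is required")
    # One comparison per cronjob, grouped so the OR does not leak
    # into other clauses added around it.
    alternatives = " OR ".join(f'{COMPONENT_LABEL}="{name}"' for name in names)
    return f"({alternatives})"


if __name__ == "__main__":
    print(cron_filter(["cron-refresh-signature", "cron-sync-megaphone"]))
```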
Attachments CDN Logs¶
httpRequest.requestUrl =~ "attachments"
Clients Telemetry¶
Clients send us uptake statuses, which we can query and graph over time in Redash.
Redash Queries¶
Note
Most queries filter on the last X hours with WHERE timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {{X}} HOUR)
but it’s possible to query a specific time window with:
WHERE timestamp > timestamp '2023-10-24 06:00:00'
AND timestamp < timestamp '2023-10-24 22:00:00'
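When investigating an incident, it can help to generate the time-window clause from known start and end times instead of typing it by hand. A minimal sketch, assuming naive `datetime` values already expressed in the timezone the query expects; `time_window_clause` is a hypothetical helper, not part of any existing tooling.

```python
from datetime import datetime


def time_window_clause(start, end):
    """Return a WHERE clause restricting `timestamp` to (start, end)."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        f"WHERE timestamp > timestamp '{start.strftime(fmt)}'\n"
        f"  AND timestamp < timestamp '{end.strftime(fmt)}'"
    )


if __name__ == "__main__":
    # Reproduces the window used in the example above.
    print(time_window_clause(datetime(2023, 10, 24, 6), datetime(2023, 10, 24, 22)))
```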
Note
These queries may require permissions; don’t hesitate to request access on Slack in #delivery.
Telescope Check Queries¶
These queries can be used as models when troubleshooting with Redash: