From zero to a working panel in under 10 minutes.
A single-page quickstart. Connect a metrics source, write a chartlessops.yml, and push it to your repo. The panel updates within a minute of the YAML landing.
Your first panel
Sign up and connect a metrics source; we need a read-only API token or a Prometheus endpoint. The first signal lights up about 90 seconds later.
From there you write your chartlessops.yml — one entry per service — and commit it to your infra repo. We watch the file via a GitHub / GitLab webhook and apply changes within a minute.
chartlessops.yml
The whole panel is configured by one YAML file. Here’s a minimal version:
# chartlessops.yml
workspace: acme
sources:
  - name: prod-prometheus
    type: prometheus
    url: https://prom.acme.dev
    auth: ${PROM_TOKEN}
services:
  - name: api-gateway
    signal:
      type: p99_latency_ms
      query: histogram_quantile(0.99, rate(http_duration_seconds_bucket{svc="api"}[5m]))
    slo:
      target: 99.95%
      window: 30d
    alert:
      routes:
        - pagerduty:p-acme-platform
        - slack:#oncall-platform
The signal becomes one row on the panel. The SLO budget shows under the row. The alert routes fire when the signal crosses a threshold or the SLO budget burns past a configured rate.
Defining a signal
A signal is the single measurement that answers “is this service OK?” for a given service. Built-in signal types:
- p50_latency_ms / p95_latency_ms / p99_latency_ms: latency percentiles
- error_rate: ratio of bad responses
- availability: ratio of good responses
- queue_depth: backlog size
- throughput: rate per second
- custom: you provide a query and a threshold
One signal per service. Always. If you want a second, that’s a second service.
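The one-signal-per-service rule can be checked locally before a commit. The following is a minimal sketch, not the ChartlessOps validator: it assumes the YAML has already been loaded into plain Python dicts shaped like the minimal example above, and validate_services is a hypothetical helper name.

```python
# Built-in signal types, as listed above.
BUILTIN_SIGNAL_TYPES = {
    "p50_latency_ms", "p95_latency_ms", "p99_latency_ms",
    "error_rate", "availability", "queue_depth", "throughput", "custom",
}


def validate_services(services):
    """Return a list of human-readable problems; an empty list means no issues found.

    Hypothetical helper: `services` is the already-parsed `services:` list.
    """
    problems = []
    for index, service in enumerate(services):
        name = service.get("name", f"<service #{index}>")
        signal = service.get("signal")
        # Exactly one signal per service: a single mapping, never a list.
        if not isinstance(signal, dict):
            problems.append(f"{name}: expected exactly one 'signal' mapping")
            continue
        signal_type = signal.get("type")
        if signal_type not in BUILTIN_SIGNAL_TYPES:
            problems.append(f"{name}: unknown signal type {signal_type!r}")
        # Assumption: the key names 'query' and 'threshold' for custom signals
        # are guessed from the list above, not taken from a documented schema.
        if signal_type == "custom" and not ("query" in signal and "threshold" in signal):
            problems.append(f"{name}: custom signal needs 'query' and 'threshold'")
    return problems
```

A service that needs a second signal would appear as a second entry in the list with its own name, which this check accepts.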
SLO budgets
SLO budgets are calculated over a rolling window (typically 30 days). When the budget is burning faster than the window allows, the row shows “burning fast” with an estimated time to budget exhaustion.
slo:
  target: 99.95%              # availability target
  window: 30d                 # rolling window
  burn_alert_threshold: 14d   # alert if budget will exhaust in <14d at current rate
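To make the numbers concrete: a 99.95% target over 30 days leaves 0.05% of 43,200 minutes, or 21.6 minutes, of error budget. The sketch below uses standard error-budget arithmetic; how ChartlessOps computes burn internally is not described here, so the function names and the burn_rate convention (1.0 means the budget lasts exactly one window) are assumptions for illustration.

```python
def parse_days(text):
    """Parse a duration such as '30d' into days. Only the 'd' suffix is handled."""
    if not text.endswith("d"):
        raise ValueError(f"unsupported duration: {text!r}")
    return float(text[:-1])


def parse_target(text):
    """Parse a target such as '99.95%' into a fraction (0.9995)."""
    return float(text.rstrip("%")) / 100.0


def error_budget_minutes(target, window):
    """Total allowed 'bad' minutes in the window for the given target."""
    window_minutes = parse_days(window) * 24 * 60
    return window_minutes * (1.0 - parse_target(target))


def days_to_exhaustion(window, budget_used_fraction, burn_rate):
    """Estimate days until the remaining budget is gone at the current rate.

    burn_rate is a multiple of the sustainable rate: 1.0 spends the whole
    budget over exactly one window, 2.0 spends it twice as fast.
    """
    if burn_rate <= 0:
        return float("inf")
    remaining = 1.0 - budget_used_fraction
    return remaining * parse_days(window) / burn_rate


def should_alert(days_left, burn_alert_threshold):
    """True when the estimated exhaustion is sooner than the threshold."""
    return days_left < parse_days(burn_alert_threshold)
```

With half the budget already spent and a burn rate of 3.0, days_to_exhaustion("30d", 0.5, 3.0) gives 5 days, which is under the 14d threshold and would trigger the burn alert.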
Prometheus
Provide a base URL and an optional bearer token. We send GET /api/v1/query requests at your configured interval. Multi-cluster federation is supported via a list of URLs that are queried in parallel and rolled up.
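For orientation, this is roughly what one polling cycle against the Prometheus HTTP API looks like. It is a standard-library sketch, not ChartlessOps code: the helper names are hypothetical, and using max as the rollup across clusters is an assumption, since the rollup rule is not specified here.

```python
import json
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def build_query_request(base_url, promql, token=None):
    """Build a GET request for Prometheus's instant-query endpoint."""
    params = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/api/v1/query?{params}"
    request = urllib.request.Request(url, method="GET")
    if token:  # the bearer token is optional
        request.add_header("Authorization", f"Bearer {token}")
    return request


def first_sample_value(payload):
    """Extract the first sample value from an instant-query JSON response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    results = payload["data"]["result"]
    if not results:
        return None
    # Each vector sample looks like {"metric": {...}, "value": [timestamp, "string"]}.
    return float(results[0]["value"][1])


def query_one(base_url, promql, token=None, timeout=10):
    """Run one instant query against one Prometheus server."""
    request = build_query_request(base_url, promql, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return first_sample_value(json.load(response))


def query_all(base_urls, promql, token=None, rollup=max):
    """Query several clusters in parallel and roll the values up.

    Assumption: `rollup` defaults to max (highest value wins); pass min
    for signals where a lower value is worse.
    """
    with ThreadPoolExecutor(max_workers=len(base_urls)) as pool:
        values = list(pool.map(lambda url: query_one(url, promql, token), base_urls))
    present = [value for value in values if value is not None]
    return rollup(present) if present else None
```

The read-only token only ever travels in the Authorization header, so a query-only credential is sufficient.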
Datadog
Provide an API key and an application key from the Datadog UI. We query the metrics API. Existing Datadog SLO definitions are supported: if you have one, you can reference it directly with datadog_slo:<id>.
CloudWatch
Provide an IAM role that allows cloudwatch:GetMetricData. We assume the role using STS. For multi-account setups, provide one IAM role per account and prefix service names with a namespace to disambiguate.
OpenTelemetry
Push or pull. Push: point your OTel collector's OTLP exporter at our ingest endpoint. Pull: we periodically poll a Prometheus-compatible endpoint exposed by your collector.
Alert routing
Routes are listed in YAML per service. A signal crossing a threshold fires every listed route. Recovery fires the same routes with the resolved state.
alert:
  on_amber:
    - slack:#oncall-platform
  on_red:
    - pagerduty:p-acme-platform
    - slack:#oncall-platform
    - slack:#leadership-incidents
  on_slo_budget_burn:
    - email:platform-leads@acme.dev
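Each route in the alert block above is a kind:target string, and each event key owns its own list. The sketch below shows one way such a block could be resolved into notifications. It assumes the block has been parsed into a dict; the helper names and the notification shape are hypothetical, not the ChartlessOps dispatcher.

```python
def parse_route(route):
    """Split 'kind:target' at the first colon, e.g. 'slack:#oncall-platform'."""
    kind, separator, target = route.partition(":")
    if not separator or not kind or not target:
        raise ValueError(f"malformed route: {route!r}")
    return kind, target


def routes_for(alert_config, event):
    """Return parsed routes for an event key such as 'on_red'.

    Events with no configured routes resolve to an empty list.
    """
    return [parse_route(route) for route in alert_config.get(event, [])]


def notifications(alert_config, event, resolved=False):
    """Build one notification per route; recovery reuses the same routes."""
    state = "resolved" if resolved else "firing"
    return [
        {"kind": kind, "target": target, "event": event, "state": state}
        for kind, target in routes_for(alert_config, event)
    ]
```

Calling notifications(config, "on_red", resolved=True) yields the same three destinations as the firing case, each marked resolved, which mirrors the recovery behaviour described above.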
Multi-region rollups
If a service runs in multiple regions, the panel shows one row with the worst region driving the status. Click into the row to see per-region detail.
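The "worst region drives the status" rule amounts to taking a maximum over a severity ordering. A minimal sketch, assuming three states: amber and red appear in the alert routing keys, while green as the healthy state and the rollup_status name are assumptions made for illustration.

```python
# Severity ordering; a higher number is worse.
SEVERITY = {"green": 0, "amber": 1, "red": 2}


def rollup_status(region_statuses):
    """Return (status, region) for the worst region.

    `region_statuses` maps region name to status. On a tie, the region
    listed first wins.
    """
    if not region_statuses:
        raise ValueError("at least one region is required")
    region, status = max(
        region_statuses.items(),
        key=lambda item: SEVERITY[item[1]],
    )
    return status, region
```

Returning the region alongside the status keeps enough information to say which region is responsible when the row is expanded.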
On-prem deploy
Enterprise plans can run ChartlessOps in your own VPC. Containerised, deployed via Helm or Docker Compose, configured by environment variables + the same chartlessops.yml. Data never leaves your network.
Getting help
Email support@chartlessops.com. For incident-time issues, incident@chartlessops.com routes to oncall.