Metrics

Monitor your database performance with real-time metrics.

Get Metrics

GET /v1/databases/{id}/metrics

curl https://api.cloudheed.com/v1/databases/db-abc123/metrics \
  -H "Authorization: Bearer YOUR_TOKEN"

Query Parameters

Parameter    Type     Default   Description
period       string   1h        Time period (5m, 1h, 24h, 7d, 30d)
resolution   string   auto      Data resolution (1m, 5m, 1h)
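
As a minimal sketch, the query parameters can be appended to the endpoint URL as shown here. The helper name `build_metrics_url` and the client-side validation are assumptions for illustration; the accepted values are copied from the parameter list above. Leaving `resolution` unset lets the server apply its default (`auto`).

```python
from urllib.parse import urlencode

BASE_URL = "https://api.cloudheed.com/v1"

# Values listed under Query Parameters.
PERIODS = {"5m", "1h", "24h", "7d", "30d"}
RESOLUTIONS = {"1m", "5m", "1h"}


def build_metrics_url(database_id, period="1h", resolution=None):
    """Return the metrics URL for a database (hypothetical helper)."""
    if period not in PERIODS:
        raise ValueError(f"unsupported period: {period!r}")
    params = {"period": period}
    if resolution is not None:
        if resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {resolution!r}")
        params["resolution"] = resolution
    return f"{BASE_URL}/databases/{database_id}/metrics?{urlencode(params)}"


print(build_metrics_url("db-abc123", period="24h", resolution="5m"))
```

Validating before the request is a design choice, not an API requirement: it surfaces typos such as `period="1d"` without a network round trip.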

Response

{
  "database_id": "db-abc123",
  "period": "1h",
  "metrics": {
    "cpu_percent": {
      "current": 12.5,
      "avg": 10.2,
      "max": 45.0,
      "data": [
        {"timestamp": "2026-03-17T10:00:00Z", "value": 12.5},
        {"timestamp": "2026-03-17T10:05:00Z", "value": 11.8}
      ]
    },
    "memory_percent": {
      "current": 68.3,
      "avg": 65.1,
      "max": 72.0
    },
    "disk_percent": {
      "current": 45.2,
      "avg": 44.8,
      "max": 45.2
    },
    "connections_active": {
      "current": 15,
      "avg": 12,
      "max": 25
    },
    "connections_idle": {
      "current": 5,
      "avg": 8,
      "max": 12
    },
    "cache_hit_ratio": {
      "current": 99.2,
      "avg": 98.8,
      "max": 99.5
    },
    "transactions_per_second": {
      "current": 150,
      "avg": 120,
      "max": 450
    },
    "rows_fetched_per_second": {
      "current": 1200,
      "avg": 980,
      "max": 3500
    },
    "rows_inserted_per_second": {
      "current": 50,
      "avg": 45,
      "max": 200
    }
  }
}
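
One way to read this response in client code is sketched below. The function names `summarize` and `latest_point` are hypothetical, and the sample is a trimmed copy of the response above. The sketch assumes that a metric may omit its `data` series, as most metrics in the example do.

```python
import json

# Trimmed copy of the example response.
SAMPLE = json.loads("""
{
  "database_id": "db-abc123",
  "period": "1h",
  "metrics": {
    "cpu_percent": {
      "current": 12.5, "avg": 10.2, "max": 45.0,
      "data": [
        {"timestamp": "2026-03-17T10:00:00Z", "value": 12.5},
        {"timestamp": "2026-03-17T10:05:00Z", "value": 11.8}
      ]
    },
    "memory_percent": {"current": 68.3, "avg": 65.1, "max": 72.0}
  }
}
""")


def summarize(response):
    """Map each metric name to its (current, avg, max) values."""
    return {
        name: (m["current"], m["avg"], m["max"])
        for name, m in response["metrics"].items()
    }


def latest_point(response, metric):
    """Return the newest data point for a metric, or None if it has no series."""
    points = response["metrics"][metric].get("data", [])
    # UTC timestamps in this fixed format sort correctly as plain strings.
    return max(points, key=lambda p: p["timestamp"]) if points else None


for name, (current, avg, peak) in summarize(SAMPLE).items():
    print(f"{name}: current={current} avg={avg} max={peak}")
```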

Available Metrics

System Metrics

Metric            Description             Unit
cpu_percent       CPU utilization         %
memory_percent    Memory utilization      %
disk_percent      Disk space used         %
disk_read_iops    Disk read operations    ops/s
disk_write_iops   Disk write operations   ops/s

Connection Metrics

Metric                Description           Unit
connections_active    Active connections    count
connections_idle      Idle connections      count
connections_waiting   Waiting connections   count
connections_total     Total connections     count

Database Metrics

Metric                     Description              Unit
cache_hit_ratio            Buffer cache hit ratio   %
transactions_per_second    Transaction throughput   tps
rows_fetched_per_second    Rows read                rows/s
rows_inserted_per_second   Rows inserted            rows/s
rows_updated_per_second    Rows updated             rows/s
rows_deleted_per_second    Rows deleted             rows/s

Replication Metrics

Metric                    Description         Unit
replication_lag_bytes     Replication lag     bytes
replication_lag_seconds   Replication delay   seconds

Health Indicators

Monitor these thresholds for optimal performance.

Metric            Warning       Critical
CPU               70%+          90%+
Memory            80%+          95%+
Disk              80%+          90%+
Cache hit ratio   below 95%     below 90%
Connections       80%+ of max   95%+ of max
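
The thresholds above can be applied to metric values with a small classifier, sketched here. The function names and the `"ok"`/`"warning"`/`"critical"` labels are assumptions for illustration. The connection limit is passed in by the caller, since the example response does not include it.

```python
# (warning, critical) thresholds in percent, taken from the table above.
HIGH_IS_BAD = {
    "cpu_percent": (70, 90),
    "memory_percent": (80, 95),
    "disk_percent": (80, 90),
}
CACHE_WARNING, CACHE_CRITICAL = 95, 90  # cache hit ratio degrades downward


def classify(metric, value):
    """Return "ok", "warning", or "critical" for a percentage metric."""
    if metric == "cache_hit_ratio":
        if value < CACHE_CRITICAL:
            return "critical"
        return "warning" if value < CACHE_WARNING else "ok"
    warning, critical = HIGH_IS_BAD[metric]
    if value >= critical:
        return "critical"
    return "warning" if value >= warning else "ok"


def classify_connections(used, max_connections):
    """Classify connection usage as a share of the configured maximum."""
    percent = 100 * used / max_connections
    if percent >= 95:
        return "critical"
    return "warning" if percent >= 80 else "ok"


print(classify("cpu_percent", 12.5))
print(classify("cache_hit_ratio", 94.0))
print(classify_connections(85, 100))
```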

Alerting

Set up alerts for critical metrics:

POST /v1/databases/{id}/alerts

curl -X POST https://api.cloudheed.com/v1/databases/db-abc123/alerts \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "metric": "cpu_percent",
    "condition": "greater_than",
    "threshold": 80,
    "duration": "5m",
    "channels": ["email", "slack"]
  }'
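
The same request can be prepared from Python using only the standard library. This is a sketch: `build_alert_request` is a hypothetical helper, and it uses only the fields and the `greater_than` condition shown in the curl example, since no other conditions are documented here.

```python
import json
import urllib.request

BASE_URL = "https://api.cloudheed.com/v1"


def build_alert_request(database_id, token, metric, threshold,
                        duration="5m", channels=("email",)):
    """Build (but do not send) a POST request that creates an alert."""
    body = {
        "metric": metric,
        "condition": "greater_than",  # the only condition shown in the docs
        "threshold": threshold,
        "duration": duration,
        "channels": list(channels),
    }
    return urllib.request.Request(
        f"{BASE_URL}/databases/{database_id}/alerts",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_alert_request(
    "db-abc123", "YOUR_TOKEN", "cpu_percent", 80, channels=("email", "slack")
)
print(request.get_method(), request.full_url)
```

To send it, pass the request to `urllib.request.urlopen(request)`. Building the request separately keeps the payload easy to inspect before anything reaches the API.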

Best Practices

  1. Monitor cache hit ratio - Keep it above 95%; below 90% is critical
  2. Watch connection counts - Add connection pooling if connections approach 80% of the maximum
  3. Set up alerts - Get notified before issues become critical
  4. Track disk growth - Plan capacity before disk usage reaches the 80% warning level
  5. Review trends - Look at 7-day trends (period=7d), not just current values
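
For disk growth, a rough projection can be made from a `disk_percent` data series. The sketch below assumes linear growth between the first and last points, which is a simplification. `days_until_limit` is a hypothetical helper, the sample points are invented for illustration, and the default limit of 90% matches the critical disk threshold.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # format used in the example response


def days_until_limit(points, limit=90.0):
    """Estimate days until the series reaches `limit` percent.

    Extrapolates linearly from the first and last points. Returns None
    when there are fewer than two points or usage is not growing.
    """
    if len(points) < 2:
        return None
    first, last = points[0], points[-1]
    start = datetime.strptime(first["timestamp"], TIMESTAMP_FORMAT)
    end = datetime.strptime(last["timestamp"], TIMESTAMP_FORMAT)
    elapsed_days = (end - start).total_seconds() / 86400
    growth = last["value"] - first["value"]
    if elapsed_days <= 0 or growth <= 0:
        return None
    return (limit - last["value"]) / (growth / elapsed_days)


# Hypothetical 7-day series: disk usage grew from 38% to 45%.
points = [
    {"timestamp": "2026-03-10T10:00:00Z", "value": 38.0},
    {"timestamp": "2026-03-17T10:00:00Z", "value": 45.0},
]
print(days_until_limit(points))  # 45.0
```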