Live. This area is documented as current, user-reliable behavior.

Goal

Diagnose managed database issues faster by separating provisioning, connectivity, and recovery failures.

Prerequisites

  • An existing database

Workflow

1. Confirm whether the issue is provisioning, runtime health, credentials, or restore-related.
2. Check recovery and error messaging before retrying.
3. Validate S3 backup configuration when backup or restore is involved.

Classify the failure first

  • Stuck in provisioning or failed: a creation/runtime problem — check the database status and logs.
  • Running but unreachable: a connectivity or credentials problem, not a database-down problem.
  • Restoring or cloning that never completes: a recovery problem — check backup storage and recovery messaging.
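
The classification above can be sketched as a small decision function. This is only an illustration: the status strings and the `classify` helper are hypothetical names, not part of the product, so substitute whatever the database surface actually reports.

```python
from enum import Enum


class FailureClass(Enum):
    PROVISIONING = "provisioning or runtime"
    CONNECTIVITY = "connectivity or credentials"
    RECOVERY = "recovery"
    UNKNOWN = "unknown"


# Hypothetical status strings, assumed for this sketch only.
PROVISIONING_STATES = {"provisioning", "failed"}
RECOVERY_STATES = {"restoring", "cloning"}


def classify(status: str, reachable: bool) -> FailureClass:
    """Map a reported status plus a reachability observation to a failure class."""
    s = status.strip().lower()
    if s in PROVISIONING_STATES:
        # Creation/runtime problem: look at database status and logs.
        return FailureClass.PROVISIONING
    if s in RECOVERY_STATES:
        # Recovery problem: look at backup storage and recovery messaging.
        return FailureClass.RECOVERY
    if s == "running" and not reachable:
        # The database is up; the problem is the path or the credentials.
        return FailureClass.CONNECTIVITY
    return FailureClass.UNKNOWN
```

Classifying before acting keeps a "running but unreachable" case from being handled as if the database were down.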

Connectivity

  • Re-open the credentials from the database surface rather than reusing a stale connection string.
  • Confirm whether the app should use the direct connection or the pooler host/port — pointing at the wrong one looks like an outage.
  • Check the metrics and logs to see whether connections are being refused or the database is saturated.
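
To tell a refused connection from a timeout before digging into metrics, a plain TCP probe against both endpoints can help. The sketch below uses only the Python standard library. The `probe` and `compare_endpoints` helpers are illustrative names, and the hosts and ports in the usage comment are placeholders. A successful TCP connection says nothing about credentials, only that the host and port accept connections.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and report a coarse outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Something answered and actively rejected the connection.
        return "refused"
    except socket.timeout:
        # No answer within the timeout: often a wrong host, a firewall, or saturation.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors.
        return f"error: {exc}"


def compare_endpoints(endpoints: dict[str, tuple[str, int]],
                      timeout: float = 3.0) -> dict[str, str]:
    """Probe each named endpoint, for example the direct and pooler host/port."""
    return {name: probe(host, port, timeout) for name, (host, port) in endpoints.items()}


# Usage with placeholder values (replace with the values shown on the database surface):
# compare_endpoints({"direct": ("db.example.internal", 5432),
#                    "pooler": ("pooler.example.internal", 6432)})
```

If one endpoint is open and the other is refused or times out, the application is probably pointed at the wrong one rather than facing an outage.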

Backup and restore issues

When backup or restore is involved, confirm that S3 backup storage is configured on the control plane and that you are restoring from a completed backup. Use backup testing to validate that a backup is restorable before relying on it.
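
These checks can be written down as a pre-restore gate. The sketch below is a minimal illustration, not the product's API: the `Backup` record, its field names, and the `"completed"` status value are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    # Hypothetical fields, assumed for this sketch only.
    backup_id: str
    status: str           # e.g. "completed", "running", "failed"
    tested: bool = False  # True once backup testing has confirmed it is restorable


def restore_checks(s3_configured: bool, backup: Backup) -> tuple[list[str], list[str]]:
    """Return (blockers, warnings) to review before starting a restore."""
    blockers: list[str] = []
    warnings: list[str] = []
    if not s3_configured:
        blockers.append("S3 backup storage is not configured on the control plane")
    if backup.status.strip().lower() != "completed":
        blockers.append(
            f"backup {backup.backup_id} is not completed (status: {backup.status})"
        )
    if not backup.tested:
        warnings.append(f"backup {backup.backup_id} has not been validated by backup testing")
    return blockers, warnings
```

An untested backup is treated as a warning rather than a blocker here; that split is a design choice for the example, so tighten it if a restore must never start from an unvalidated backup.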

Expected result

You can classify a database failure as a provisioning, connectivity, or recovery problem and escalate it accordingly.

Common failures

  • App points at the direct connection when the pooler is expected (or vice versa).
  • Reusing stale credentials after a restore or change.
  • Backup or restore attempted without S3 storage configured.

Back up and restore a database

Use the current S3-backed database backup and restore model with the correct operational expectations.

Credentials, pooling, and usage expectations

Understand how to consume database connection details and what to assume about pooling and access patterns.

Recovery states, logs, and troubleshooting

Read the operation state on a resource — its status, current step, attempt count, retryable flag, and last error — together with logs, instead of treating a single “error” badge as the whole story.
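
As a rough illustration of reading the whole operation state rather than a single badge, the sketch below models the five fields named above. The `OperationState` class, its field names, and the `next_action` rule are assumptions for the example; the real resource may expose these values under different names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OperationState:
    # Field names are assumed; they mirror the fields listed in the text.
    status: str
    current_step: str
    attempt_count: int
    retryable: bool
    last_error: Optional[str] = None


def summarize(op: OperationState) -> str:
    """Render every field on one line so nothing hides behind an 'error' badge."""
    error = op.last_error or "none"
    return (
        f"status={op.status} step={op.current_step} "
        f"attempts={op.attempt_count} retryable={op.retryable} last_error={error}"
    )


def next_action(op: OperationState) -> str:
    """A simple, assumed rule: retry only when an error is flagged retryable."""
    if op.last_error is None:
        return "wait"
    if op.retryable:
        return "retry"
    return "escalate"
```

Whatever the rule returns, read the logs alongside the summary before acting on it.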