> ## Documentation Index
> Fetch the complete documentation index at: https://docs.stackshift.cloud/llms.txt
> Use this file to discover all available pages before exploring further.

# Stack troubleshooting

> Common stack-side failures around placement, logs, health, template drift, and restore behavior.

**Live.** This area is documented as current, user-reliable behavior.

## Goal

Troubleshoot stacks as service systems, not just as single containers.

## Prerequisites

* An existing stack

## Workflow

1. Start with the stack detail page, health state, and recent logs.
2. Check placement and node health when the stack never stabilizes.
3. Use recovery-state messaging for backup, restore, or template-upgrade issues.

## Runtime vs placement

* A stack that crashes or restarts repeatedly is usually a runtime problem: read the logs and per-service health first.
* A stack that never stabilizes is often placement: pinned to an unhealthy node, or least\_loaded with no node matching its selector tags.

## Template drift and upgrades

Template-backed stacks report an upgrade status: up\_to\_date, update\_available, upgrade\_blocked, or unknown. An upgrade\_blocked status means the upgrade cannot apply safely as-is. Preview the upgrade before applying, and remember that a template upgrade can be rolled back.

## Backup and restore failures

* A failed volume archive leaves the backup incomplete; only restore from a completed backup.
* An agent that is too old for the volume endpoints will fail backup or restore; upgrade the node agent.
* Watch recovery-state messaging through a restore instead of assuming the status badge alone means success.

## Expected result

You can tell whether the problem is runtime, placement, recovery, or template-related.

## Common failures

* No healthy node matches the stack selector tags, so placement never lands.
* A template upgrade reports upgrade\_blocked and cannot apply without changes.
* A volume archive failed, so the backup is not safe to restore from.
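The runtime, placement, template, and recovery checks described on this page can be read as a small decision procedure. The sketch below is illustrative only: `StackSnapshot`, its field names, and `triage` are hypothetical and do not correspond to any documented StackShift API. Only the upgrade status values and the `least_loaded` placement mode come from this page; the `"pinned"` mode string is an assumed label for a stack pinned to a node.

```python
from dataclasses import dataclass
from typing import List, Optional

# Upgrade statuses named on this page.
UPGRADE_STATUSES = {"up_to_date", "update_available", "upgrade_blocked", "unknown"}


@dataclass
class StackSnapshot:
    """Hypothetical summary of what you read off the stack detail page."""
    restart_count: int = 0              # assumed: recent crashes or restarts
    stabilized: bool = True             # assumed: stack reached a steady state
    placement_mode: str = "pinned"      # assumed labels: "pinned" or "least_loaded"
    pinned_node_healthy: bool = True    # only meaningful when pinned
    matching_healthy_nodes: int = 1     # healthy nodes matching the selector tags
    upgrade_status: str = "up_to_date"
    backup_completed: Optional[bool] = None  # None when no backup is involved


def triage(s: StackSnapshot) -> List[str]:
    """Return the problem areas suggested by the snapshot, in page order."""
    if s.upgrade_status not in UPGRADE_STATUSES:
        raise ValueError(f"unexpected upgrade status: {s.upgrade_status}")

    findings: List[str] = []
    # Repeated crashes or restarts point at runtime: logs and per-service health.
    if s.restart_count > 0:
        findings.append("runtime: read logs and per-service health")
    # A stack that never stabilizes is often a placement problem.
    if not s.stabilized:
        if s.placement_mode == "pinned" and not s.pinned_node_healthy:
            findings.append("placement: pinned to an unhealthy node")
        elif s.placement_mode == "least_loaded" and s.matching_healthy_nodes == 0:
            findings.append("placement: no healthy node matches the selector tags")
    # A blocked upgrade cannot apply safely as-is.
    if s.upgrade_status == "upgrade_blocked":
        findings.append("template: preview the upgrade before applying")
    # Only a completed backup is safe to restore from.
    if s.backup_completed is False:
        findings.append("recovery: backup incomplete, do not restore from it")
    return findings


if __name__ == "__main__":
    stuck = StackSnapshot(stabilized=False, placement_mode="least_loaded",
                          matching_healthy_nodes=0)
    print(triage(stuck))
```

A snapshot can yield several findings at once, for example a stack that restarts repeatedly and also has a blocked template upgrade, which is why the function returns a list instead of a single label.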
## Related guides

* Use the stack detail, logs, and placement information to understand how the stack is actually running.
* Use S3-backed named-volume archives to protect and recover stateful stack data.
* Read the operation state on a resource (its status, current step, attempt count, retryable flag, and last error) together with logs, instead of treating a single “error” badge as the whole story.
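To illustrate reading the full operation state instead of a single badge, here is a minimal sketch that folds the five fields named above into one line. The `OperationState` class, its field names, and the example values (`"error"`, `"archive_volume"`, the error text) are assumptions made for illustration, not a documented response shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OperationState:
    """Hypothetical container for the operation fields named on this page."""
    status: str
    current_step: Optional[str]
    attempt_count: int
    retryable: bool
    last_error: Optional[str]


def summarize(op: OperationState) -> str:
    """Build a one-line summary that shows more than the status alone."""
    parts = [f"status={op.status}"]
    if op.current_step:
        parts.append(f"step={op.current_step}")
    parts.append(f"attempts={op.attempt_count}")
    if op.last_error:
        parts.append(f"last_error={op.last_error!r}")
        # The retryable flag only matters once something has failed.
        parts.append("retryable" if op.retryable else "not retryable")
    return ", ".join(parts)


if __name__ == "__main__":
    op = OperationState(
        status="error",
        current_step="archive_volume",   # assumed step name
        attempt_count=2,
        retryable=True,
        last_error="upload timed out",   # assumed error text
    )
    print(summarize(op))
```

Read next to the logs, a summary like this separates a retryable failure on a known step from a terminal one, which a bare “error” badge cannot do.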