Context
We have an existing Hetzner Ubuntu server running Claude Code for AI-assisted development work. Currently it's manually managed. We want the ops repo to handle provisioning, configuration, and ongoing management — similar to how we manage our DigitalOcean K8s cluster.
The server's role is to:
- Run Claude Code sessions (using a Claude subscription, not API calls)
- Poll GitHub for issues/PRs labeled groundskeeper-autofix and work on them autonomously
- Run the Discord bot with full code review/fix capabilities (see below)
- Be SSH-accessible for manual use alongside automation
What needs to happen
1. Terraform stack (terraform/stacks/hetzner/)
Import the existing server into Terraform state and manage:
- Server resource (hcloud_server) with lifecycle { ignore_changes = [user_data, image] } to prevent accidental replacement
- Firewall rules (hcloud_firewall): SSH access, any other needed ports
- SSH keys (hcloud_ssh_key)
- DNS records if needed
- Hetzner API token stored in 1Password, pulled via the 1Password Terraform provider (consistent with existing pattern)
Gotchas to watch for:
- Changing user_data or image on an imported server forces replacement, so both must be listed in ignore_changes
- Use delete_protection = true on the server resource (the hcloud provider requires rebuild_protection to be set to the same value)
- Pin the hetznercloud/hcloud provider version
2. Server configuration management
The server needs git, the gh CLI, Node, pnpm, and the Claude Code CLI installed; the longterm-wiki and ops repos cloned; and git auth configured for the QURI Bot account.
Tool options (in order of recommendation)
Option A: Ansible (recommended)
- Industry-standard pairing with Terraform: "Terraform provisions, Ansible configures"
- Native 1Password integration via the community.general.onepassword lookup plugin, the community.general.onepassword_info module, or op run
- Idempotent re-runs — safe to run repeatedly to update configuration
- Playbook would be ~100-200 lines for our use case
- Supports ongoing config management, not just one-time setup
- Moderate learning curve but extremely well-documented
- An ansible/ directory in the ops repo with a playbook and inventory generated from Terraform outputs
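As a rough illustration of "inventory generated from Terraform outputs", the sketch below turns the JSON printed by terraform output -json into a one-host INI inventory. The output name server_ipv4, the group name hetzner, and the SSH user bot are assumptions made for the example; none of them are defined in the ops repo yet.

```python
import json


def inventory_from_terraform(outputs_json: str,
                             output_name: str = "server_ipv4",  # hypothetical output name
                             group: str = "hetzner",            # hypothetical inventory group
                             user: str = "bot") -> str:          # hypothetical SSH user
    """Render a one-host INI inventory from `terraform output -json` text."""
    outputs = json.loads(outputs_json)
    try:
        # Each Terraform output is an object with "value", "type" and "sensitive".
        ip = outputs[output_name]["value"]
    except KeyError as exc:
        raise KeyError(f"Terraform output {output_name!r} not found") from exc
    return f"[{group}]\n{ip} ansible_user={user}\n"


# Sample input in the shape Terraform emits; 203.0.113.10 is a documentation address.
sample = json.dumps(
    {"server_ipv4": {"value": "203.0.113.10", "type": "string", "sensitive": False}}
)
print(inventory_from_terraform(sample), end="")
```

In practice the JSON would come from running terraform output -json in terraform/stacks/hetzner/, with the result written to a file under ansible/.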
Option B: Shell script (scripts/setup-hetzner.sh)
- Simplest approach, no new tools
- Run via op run -- ssh user@server 'bash -s' < scripts/setup-hetzner.sh
- Must manually write idempotency guards (check if things exist before installing)
- Good enough for initial setup, fragile for ongoing management
- Risk: becomes unmaintainable as requirements grow
Option C: Pyinfra
- Python-based alternative to Ansible — faster, real Python instead of YAML
- Built-in idempotent operations
- Smaller community and ecosystem than Ansible
- No native 1Password module (would use op CLI)
- Good middle ground if Ansible feels heavyweight
Not recommended:
- Terraform provisioners (remote-exec) — HashiCorp themselves say to avoid these; not tracked in state, no re-run capability
- Packer — designed for immutable images, poor fit for a mutable dev server people SSH into
- NixOS — powerful but very steep learning curve, requires OS change from Ubuntu
- Cloud-init alone — first-boot only, no ongoing management capability
3. Secrets management
Secrets needed on the server:
- GitHub PAT (QURI Bot account) for gh auth and git operations
- Claude Code authentication (subscription login or API key)
- SSH keys for repo access
- Any wiki-server API keys needed by the polling daemon
- Discord bot token (for running the Discord bot)
- Claude Code OAuth token (for /ask command via Agent SDK)
Approach: Store secrets in 1Password (consistent with existing infra). Options:
- Install op CLI on the server so it can pull secrets at runtime
- Inject secrets via Ansible from 1Password during playbook runs
- Use op run when running setup scripts to pass secrets as env vars
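Whichever option is chosen, the services end up reading secrets from their environment (op run resolves secret references into environment variables before starting the child process). A minimal sketch of a fail-fast loader follows; the variable names GITHUB_TOKEN, DISCORD_BOT_TOKEN, and CLAUDE_CODE_OAUTH_TOKEN are assumptions for illustration, not names the bot or daemon is known to use.

```python
import os
from typing import Mapping

# Hypothetical variable names; adjust to whatever the services actually read.
REQUIRED_SECRETS = ("GITHUB_TOKEN", "DISCORD_BOT_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN")


def load_secrets(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required secrets, or raise naming every missing one."""
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        # Report names only; never echo secret values into logs.
        raise RuntimeError("missing secrets: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SECRETS}
```

Failing at startup with the full list of missing names makes a misconfigured systemd unit or playbook obvious immediately, rather than surfacing later as an authentication error.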
4. Polling daemon
A service on the Hetzner server that replaces the groundskeeper's issue-responder:
- Polls GitHub for issues/PRs with groundskeeper-autofix label (or /groundskeeper comments)
- Spawns Claude Code sessions with full repo access and shell
- Reports results back to GitHub (comments, commits, PR updates)
- Runs as a systemd service for reliability
- The daemon code itself should live in the longterm-wiki repo; ops manages its deployment and configuration
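The loop described above could look roughly like the following sketch. It is an illustration under assumptions, not the daemon's design: the function names are hypothetical, the real daemon may well be written in another language, and invoking Claude Code as claude -p with a prompt is assumed to be the right non-interactive entry point. It only covers the label trigger, not /groundskeeper comments, and it leaves reporting back to GitHub to the spawned session.

```python
import json
import subprocess
import time
import urllib.request

LABEL = "groundskeeper-autofix"


def fetch_labeled(repo: str, token: str) -> list[dict]:
    """Open issues and PRs carrying LABEL (GitHub's issues endpoint returns both)."""
    url = f"https://api.github.com/repos/{repo}/issues?labels={LABEL}&state=open"
    req = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def select_new(items: list[dict], seen: set[int]) -> list[dict]:
    """Return items not handled yet, lowest number first, and record them in `seen`."""
    fresh = sorted((i for i in items if i["number"] not in seen),
                   key=lambda i: i["number"])
    seen.update(i["number"] for i in fresh)
    return fresh


def build_command(item: dict) -> list[str]:
    """Command line for one session; pull requests carry a "pull_request" key."""
    kind = "pull request" if "pull_request" in item else "issue"
    prompt = f"Work on {kind} #{item['number']}: {item['title']}"
    return ["claude", "-p", prompt]  # assumption: non-interactive invocation


def poll_forever(repo: str, token: str, repo_path: str, interval: int = 60) -> None:
    """Poll, run one session per new item inside the checkout, then sleep."""
    seen: set[int] = set()  # in-memory only; a real daemon would persist this
    while True:
        for item in select_new(fetch_labeled(repo, token), seen):
            subprocess.run(build_command(item), cwd=repo_path, check=False)
        time.sleep(interval)
```

Keeping select_new and build_command free of network and process calls makes the decision logic testable without GitHub access. Because seen lives in memory, a systemd restart would reprocess every labeled item, so persisting it (or removing the label when done) is a design point to settle.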
5. Discord bot on Hetzner
Migrate the Discord bot from K8s to Hetzner to unlock full code capabilities:
Currently the Discord bot runs in K8s with limited capabilities — it can answer wiki Q&A questions (@mention) and do read-only research (/ask), but can't review or fix code because the K8s pod lacks terminal access, git, and full repo checkouts.
Running on Hetzner would enable:
- Code review: Bot can read full source, run linters/tests, provide substantive PR reviews
- Code fixes: Bot can edit files, create branches, open PRs in response to Discord requests
- Full Claude Code access: Agent SDK with Bash, Edit, Write tools — not just Read/Glob/Grep
- Persistent repo state: Git repos stay cloned and up-to-date, no init containers needed
- Terminal access: Can run builds, tests, type-checks as part of answering questions
Implementation:
- Run the Discord bot as a systemd service alongside the polling daemon
- Full repo checkouts at a known path (e.g., /home/bot/repos/longterm-wiki)
- WIKI_REPO_PATH points to the actual repo instead of a stripped-down content copy
- Bot has access to gh CLI for creating PRs, commenting on issues
- Can share the Claude Code OAuth token with the polling daemon
This effectively consolidates the groundskeeper's issue-responder role and the Discord bot onto one well-provisioned server.
6. Monitoring
- Groundskeeper health check or simple uptime monitor that pings the Hetzner server
- Alert via Discord webhook if the server is unreachable
- Could be as simple as adding the server to the groundskeeper's health-check task
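A minimal sketch of such a check, assuming "reachable" means the SSH port accepts TCP connections and that alerts go to a Discord webhook URL supplied by the caller. The function names and the message text are illustrative only.

```python
import json
import socket
import urllib.request


def is_reachable(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def alert_payload(host: str) -> bytes:
    """JSON body for a Discord webhook message."""
    message = f"Hetzner server {host} is unreachable on SSH"
    return json.dumps({"content": message}).encode()


def check_and_alert(host: str, webhook_url: str) -> bool:
    """Probe the server and post to the webhook only when the probe fails."""
    ok = is_reachable(host)
    if not ok:
        req = urllib.request.Request(
            webhook_url,
            data=alert_payload(host),
            headers={
                "Content-Type": "application/json",
                # Set explicitly in case the default urllib agent is rejected.
                "User-Agent": "hetzner-healthcheck/0.1",
            },
        )
        urllib.request.urlopen(req, timeout=10).close()
    return ok
```

The check has to run somewhere other than the Hetzner server itself (for example inside the groundskeeper's health-check task), since a down server cannot report its own outage.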
Architecture diagram
┌─────────────┐     ┌──────────────────┐     ┌───────────────────┐
│   GitHub    │◄────│  Hetzner         │────►│  Wiki Server      │
│  (issues,   │     │  Server          │     │  (K8s pod)        │
│   PRs,      │     │                  │     │                   │
│   labels)   │     │  - Claude Code   │     │  - Report results │
│             │     │  - Poll daemon   │     │  - Agent sessions │
│             │     │  - Discord bot   │     │                   │
└─────────────┘     │  - Full repos    │     └───────────────────┘
                    │  - Shell/git/gh  │
  ┌─────────┐       └──────────────────┘
  │ Discord │          ▲         ▲
  │ (users) │──────────┘         │ SSH
  └─────────┘                    │
                           ┌─────┴─────┐
                           │ Operator  │
                           │ (manual)  │
                           └───────────┘
Open questions
- Should the groundskeeper's issue-responder be disabled once the Hetzner polling daemon is running? Or should they coexist (groundskeeper for simple tasks, Hetzner for complex ones)?
- Should the Hetzner server also run the groundskeeper itself (replacing the K8s pod)?
- Do we want multiple repos cloned, or just longterm-wiki?
- What's the budget/size constraint for the Hetzner server? (affects server_type choice)
- Should the Discord bot be fully migrated to Hetzner, or should we keep a lightweight K8s version for basic Q&A and only delegate code tasks to Hetzner?