Add Hetzner server provisioning and management to ops repo #51

## Context

We have an existing Hetzner Ubuntu server running Claude Code for AI-assisted development work. Currently it's manually managed. We want the ops repo to handle provisioning, configuration, and ongoing management — similar to how we manage our DigitalOcean K8s cluster.

The server's role is to:
- Run Claude Code sessions (using a Claude subscription, not API calls)
- Poll GitHub for issues/PRs labeled `groundskeeper-autofix` and work on them autonomously
- **Run the Discord bot with full code review/fix capabilities** (see below)
- Be SSH-accessible for manual use alongside automation

## What needs to happen

### 1. Terraform stack (`terraform/stacks/hetzner/`)

Import the existing server into Terraform state and manage:
- Server resource (`hcloud_server`) with `lifecycle { ignore_changes = [user_data, image] }` to prevent accidental replacement
- Firewall rules (`hcloud_firewall`) — SSH access, any other needed ports
- SSH keys (`hcloud_ssh_key`)
- DNS records if needed
- Hetzner API token stored in 1Password, pulled via the 1Password Terraform provider (consistent with existing pattern)

**Gotchas to watch for:**
- Changing `user_data` or `image` on an imported server forces replacement — must use `ignore_changes`
- Set `delete_protection = true` on the server resource; the hcloud provider requires `rebuild_protection` to be set to the same value
- Pin the `hetznercloud/hcloud` provider version

### 2. Server configuration management

The server needs git, the gh CLI, Node.js, pnpm, and the Claude Code CLI installed; the longterm-wiki and ops repos cloned; and git auth configured for the QURI Bot account.

#### Tool options (in order of recommendation)

**Option A: Ansible (recommended)**
- Industry-standard pairing with Terraform: "Terraform provisions, Ansible configures"
- Native 1Password integration via the `community.general.onepassword` lookup plugin (or the `community.general.onepassword_info` module), or by wrapping playbook runs in `op run`
- Idempotent re-runs — safe to run repeatedly to update configuration
- Playbook would be ~100-200 lines for our use case
- Supports ongoing config management, not just one-time setup
- Moderate learning curve but extremely well-documented
- Would live in an `ansible/` directory in the ops repo, with a playbook and an inventory generated from Terraform outputs

**Option B: Shell script (`scripts/setup-hetzner.sh`)**
- Simplest approach, no new tools
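- Run via `op run -- ssh user@server 'bash -s' < scripts/setup-hetzner.sh`; note that `op run` injects secrets into the local `ssh` process only, so the script must receive them some other way (for example SSH `SendEnv`/`AcceptEnv`) before it can use them on the server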
- Must manually write idempotency guards (check if things exist before installing)
- Good enough for initial setup, fragile for ongoing management
- Risk: becomes unmaintainable as requirements grow

**Option C: Pyinfra**
- Python-based alternative to Ansible — faster, real Python instead of YAML
- Built-in idempotent operations
- Smaller community and ecosystem than Ansible
- No native 1Password module (would use `op` CLI)
- Good middle ground if Ansible feels heavyweight

**Not recommended:**
- *Terraform provisioners (remote-exec)* — HashiCorp themselves say to avoid these; not tracked in state, no re-run capability
- *Packer* — designed for immutable images, poor fit for a mutable dev server people SSH into
- *NixOS* — powerful but very steep learning curve, requires OS change from Ubuntu
- *Cloud-init alone* — first-boot only, no ongoing management capability

### 3. Secrets management

Secrets needed on the server:
- GitHub PAT (QURI Bot account) for `gh auth` and git operations
- Claude Code authentication (subscription login or API key)
- SSH keys for repo access
- Any wiki-server API keys needed by the polling daemon
- Discord bot token (for running the Discord bot)
- Claude Code OAuth token (for /ask command via Agent SDK)

**Approach:** Store secrets in 1Password (consistent with existing infra), then deliver them to the server in one of three ways:
- Install `op` CLI on the server so it can pull secrets at runtime
- Inject secrets via Ansible from 1Password during playbook runs
- Use `op run` when running setup scripts to pass secrets as env vars
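Whichever delivery option is chosen, the services can fail fast when a secret is absent. The following is a minimal sketch, assuming secrets arrive as environment variables; the variable names `GITHUB_TOKEN`, `DISCORD_BOT_TOKEN`, and `CLAUDE_CODE_OAUTH_TOKEN` are hypothetical placeholders, not names the ops repo already uses.

```python
import os

# Hypothetical variable names; adjust to whatever the 1Password items map to.
REQUIRED = ("GITHUB_TOKEN", "DISCORD_BOT_TOKEN", "CLAUDE_CODE_OAUTH_TOKEN")


class MissingSecrets(RuntimeError):
    """Raised when one or more required secrets are absent or empty."""


def load_secrets(env=None, required=REQUIRED):
    """Return the required secrets as a dict, or raise listing every missing one."""
    env = os.environ if env is None else env
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise MissingSecrets("missing secrets: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

With the `op` CLI installed on the server, a service could then be started as `op run --env-file=secrets.env -- python daemon.py`, where both file names are placeholders.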

### 4. Polling daemon

A service on the Hetzner server that takes over the role of the groundskeeper's issue-responder (whether the two should coexist is an open question below):
- Polls GitHub for issues/PRs with `groundskeeper-autofix` label (or `/groundskeeper` comments)
- Spawns Claude Code sessions with full repo access and shell
- Reports results back to GitHub (comments, commits, PR updates)
- Runs as a systemd service for reliability
- The daemon code itself should live in the longterm-wiki repo; ops manages its deployment and configuration
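The selection step of the daemon can be sketched as below. This is an illustration only, not the daemon's actual design: it assumes issues are fetched with `gh issue list --json`, it covers issues but not PRs, and the function names and the in-memory `seen` set are hypothetical. The work itself (spawning a Claude Code session and reporting back) is left to a caller-supplied `handle` callable.

```python
import json
import subprocess

LABEL = "groundskeeper-autofix"
TRIGGER = "/groundskeeper"


def wants_autofix(item):
    """True if the issue carries the label or any comment contains the trigger."""
    labels = {label["name"] for label in item.get("labels", [])}
    bodies = [comment.get("body", "") for comment in item.get("comments", [])]
    return LABEL in labels or any(TRIGGER in body for body in bodies)


def fetch_open_issues(repo):
    """List open issues as dicts via the gh CLI (requires gh to be authenticated)."""
    result = subprocess.run(
        ["gh", "issue", "list", "--repo", repo, "--state", "open",
         "--json", "number,title,labels,comments"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def poll_once(items, handle, seen):
    """Call handle() on each qualifying item not handled before; return their numbers."""
    handled = []
    for item in items:
        number = item["number"]
        if number in seen or not wants_autofix(item):
            continue
        handle(item)
        seen.add(number)
        handled.append(number)
    return handled
```

A systemd-managed loop would call `poll_once(fetch_open_issues(repo), handle, seen)` on an interval. Keeping `seen` only in memory means a restart re-processes open items, so a real daemon would likely persist that state or mark handled issues on GitHub.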

### 5. Discord bot on Hetzner

**Migrate the Discord bot from K8s to Hetzner** to unlock full code capabilities:

Currently the Discord bot runs in K8s with limited capabilities — it can answer wiki Q&A questions (@mention) and do read-only research (/ask), but can't review or fix code because the K8s pod lacks terminal access, git, and full repo checkouts.

Running on Hetzner would enable:
- **Code review**: Bot can read full source, run linters/tests, provide substantive PR reviews
- **Code fixes**: Bot can edit files, create branches, open PRs in response to Discord requests
- **Full Claude Code access**: Agent SDK with Bash, Edit, Write tools — not just Read/Glob/Grep
- **Persistent repo state**: Git repos stay cloned and up-to-date, no init containers needed
- **Terminal access**: Can run builds, tests, type-checks as part of answering questions

Implementation:
- Run the Discord bot as a systemd service alongside the polling daemon
- Full repo checkouts at a known path (e.g., `/home/bot/repos/longterm-wiki`)
- `WIKI_REPO_PATH` points to the actual repo instead of a stripped-down content copy
- Bot has access to `gh` CLI for creating PRs, commenting on issues
- Can share the Claude Code OAuth token with the polling daemon

This effectively consolidates the groundskeeper, issue-responder, and Discord bot into one well-provisioned server.

### 6. Monitoring

- Groundskeeper health check or simple uptime monitor that pings the Hetzner server
- Alert via Discord webhook if the server is unreachable
- Could be as simple as adding the server to the groundskeeper's health-check task
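A reachability check of this kind can be sketched as follows, assuming "reachable" means a TCP connection to the SSH port succeeds and that alerts go to a Discord webhook as a JSON body with a `content` field. The function names, the `User-Agent` value, and the default port are illustrative assumptions.

```python
import json
import socket
import urllib.request


def is_reachable(host, port=22, timeout=5.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def build_alert(host, port):
    """Build the JSON body for a Discord webhook message."""
    message = f"Hetzner server {host}:{port} is unreachable"
    return json.dumps({"content": message}).encode("utf-8")


def send_alert(webhook_url, payload):
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=payload,
        # An explicit User-Agent is set because some endpoints reject urllib's default.
        headers={"Content-Type": "application/json", "User-Agent": "ops-healthcheck"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def check(host, port, webhook_url, probe=is_reachable, notify=send_alert):
    """Probe the server; send an alert and return False if it is unreachable."""
    if probe(host, port):
        return True
    notify(webhook_url, build_alert(host, port))
    return False
```

The `probe` and `notify` parameters are injectable so the same logic could run inside the groundskeeper's health-check task or a standalone timer, and so it can be exercised without network access.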

## Architecture diagram

```
┌─────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   GitHub    │◄────│   Hetzner        │────►│  Wiki Server    │
│  (issues,   │     │   Server         │     │  (K8s pod)      │
│   PRs,      │     │                  │     │                 │
│   labels)   │     │ - Claude Code    │     │ - Report results│
│             │     │ - Poll daemon    │     │ - Agent sessions│
│             │     │ - Discord bot    │     │                 │
└─────────────┘     │ - Full repos     │     └─────────────────┘
                    │ - Shell/git/gh   │
      ┌──────────┐  └──────────────────┘
      │ Discord  │        ▲      ▲
      │ (users)  │────────┘      │ SSH
      └──────────┘               │
                           ┌─────┴──────┐
                           │  Operator  │
                           │  (manual)  │
                           └────────────┘
```

## Open questions

- Should the groundskeeper's issue-responder be disabled once the Hetzner polling daemon is running? Or should they coexist (groundskeeper for simple tasks, Hetzner for complex ones)?
- Should the Hetzner server also run the groundskeeper itself (replacing the K8s pod)?
- Do we want multiple repos cloned, or just longterm-wiki?
- What's the budget/size constraint for the Hetzner server? (affects `server_type` choice)
- Should the Discord bot be fully migrated to Hetzner, or should we keep a lightweight K8s version for basic Q&A and only delegate code tasks to Hetzner?
