feat(ops/mdm): AdGuard ClientID handshake — nginx + watcher

End-to-end DoH-to-backend wiring for Mac auto-activation:

  Mac → dns.rebreak.org/dns-query/<token> → nginx → AdGuard
      → querylog.json (CP field) → watcher.py → POST /handshake → backend

- ops/nginx/dns.rebreak.org.conf: vhost with `location ^~ /dns-query`
  prefix-match (not exact). proxy_pass without trailing slash preserves
  the full path so AdGuard parses the ClientID natively.
- watcher.py: NDJSON tail with inode-based rotation safety, per-token 60s
  in-memory cooldown, urllib (no external deps), graceful 401/404/5xx
- rebreak-handshake-watcher.service: systemd unit, EnvironmentFile with
  chmod 600 (HANDSHAKE_SECRET never in git), NoNewPrivileges + PrivateTmp
- DOH_CLIENTID_HANDSHAKE.md: architecture + flow diagram + risk table
- RUNBOOK.md: status/logs/restart commands + deploy ordering

Not yet deployed. Verify-checklist before `nginx -s reload`:
1. confirm AdGuard DoH port (config assumes 127.0.0.1:3000)
2. confirm TLS cert exists for dns.rebreak.org
3. snapshot current nginx config
4. `nginx -t` dry-run
5. functional curl + grep CP in querylog before starting watcher
2026-05-15 22:41:38 +02:00 · 2026-05-15 22:41:38 +02:00 · db7875fb34
commit db7875fb34
parent 42a8223bfc
5 changed files with 658 additions and 0 deletions
--- a/ops/mdm/DOH_CLIENTID_HANDSHAKE.md
+++ b/ops/mdm/DOH_CLIENTID_HANDSHAKE.md
@@ -0,0 +1,213 @@
+# DoH ClientID Handshake: Architecture
+
+## Flow diagram
+
+```
+Mac/iPhone (DoH profile: dns.rebreak.org/dns-query/<dnsToken>)
+  |
+  | HTTPS GET /dns-query/<dnsToken>?dns=...
+  v
+nginx (rebreak-mdm, 178.105.101.137)
+  |  location ^~ /dns-query  -> proxy_pass http://127.0.0.1:3000
+  |  Path is forwarded UNCHANGED (no stripping)
+  v
+AdGuard Home (127.0.0.1:3000)
+  |  Parses the path segment after /dns-query/ as the ClientID
+  |  Writes a QueryLog entry with the field "CP": "<dnsToken>"
+  v
+querylog.json (NDJSON, /opt/adguardhome/data/querylog.json)
+  |
+  | inotify-like polling (1s), rotation-safe
+  v
+watcher.py (systemd service: rebreak-handshake-watcher)
+  |  - reads new lines
+  |  - extracts the CP field (ClientID = dnsToken)
+  |  - in-memory cooldown: 1 POST per token per 60s
+  v
+POST https://staging.rebreak.org/api/devices/protected/handshake
+  Header: x-handshake-secret: <HANDSHAKE_SECRET>
+  Body:   { "token": "<dnsToken>" }
+  |
+  v
+Backend (Nitro/Nuxt, rebreak-server)
+  |  - checks the shared secret
+  |  - looks up ProtectedDevice by dnsToken
+  |  - pending  → status=active, installedAt=NOW   [statusChanged=true]
+  |  - active   → lastDnsQueryAt=NOW               [statusChanged=false]
+  |  - revoked  → 200 { ignored: true }            [silent]
+  v
+Supabase Postgres + Realtime
+  |
+  v
+App UI (useProtectedDevicesRealtime hook)
+  Shows "Schutz aktiv" ("protection active") without a manual reload
+```
+
+## Why a ClientID path and not a query parameter
+
+Three ways to identify a client in AdGuard Home were considered:
+1. IP address: does not work when all clients sit behind the same Hetzner IP
+2. `?clientid=<token>` query parameter: not part of AdGuard's native ClientID implementation
+3. Path segment `/dns-query/<clientid>`: natively supported, lands in the QueryLog field `CP`
+
+The path method is the only one that needs no per-client AdGuard configuration and
+is still identifiable in the QueryLog. The device simply embeds its dnsToken in the
+DoH URL; the MDM/DNS profile on the device contains the full URL:
+  `https://dns.rebreak.org/dns-query/<dnsToken>`
+
+## nginx: diff vs. the old configuration
+
+Old config (exact match):
+```nginx
+location = /dns-query {
+    proxy_pass http://127.0.0.1:3000/dns-query;
+    ...
+}
+```
+
+Problems with the exact match:
+- `/dns-query/abc123` does NOT match → nginx returns 404
+- AdGuard never receives requests that carry a ClientID
+- The CP field in the querylog always stays empty
+
+New config (prefix match, path unchanged):
+```nginx
+location ^~ /dns-query {
+    proxy_pass http://127.0.0.1:3000;
+    ...
+}
+```
+
+Important:
+- `^~` (prefix match with priority) stops nginx from checking regex locations once this prefix is the longest match
+- `proxy_pass http://127.0.0.1:3000;` WITHOUT a trailing slash and WITHOUT a path suffix →
+  nginx forwards the original request URI in full. So `/dns-query/abc123` arrives as
+  `/dns-query/abc123` at AdGuard. No stripping, no rewrite.
+- With a URI part, nginx REPLACES the matched prefix with that URI: `proxy_pass http://127.0.0.1:3000/;`
+  would turn `/dns-query/abc123` into `//abc123` and lose the ClientID path. That is wrong.
+
+Full config: `ops/nginx/dns.rebreak.org.conf`
+
+## AdGuard QueryLog format
+
+AdGuard Home writes `/opt/adguardhome/data/querylog.json` as NDJSON
+(newline-delimited JSON). Relevant fields:
+
+```json
+{
+  "T": "2026-05-15T12:34:56.789Z",
+  "QH": "example.com",
+  "QT": "A",
+  "CP": "abc123def456abc1",
+  "Result": { "IsFiltered": false },
+  "Elapsed": 1234567,
+  "IP": "127.0.0.1"
+}
+```
+
+- `CP` = ClientID (only set when the request came via the /dns-query/<cid> path)
+- `QH` = query hostname (for a blocked domain, `IsFiltered: true`)
+- `T` = timestamp, ISO 8601
+
+querylog.json is rotated once it exceeds a configured size
+(default: 30 MB or after 24h). AdGuard renames the current file and creates a
+new one. watcher.py detects this by comparing inodes and re-opens the file.
+
+## Secrets
+
+HANDSHAKE_SECRET comes exclusively from Infisical (never in code/Git).
+
+Infisical key: `HANDSHAKE_SECRET`
+Infisical project: rebreak (Project-ID 14b11b35-ef59-4b8a-a16b-398f0cc3ad93)
+Environments: staging (for staging.rebreak.org), production (for rebreak.org)
+
+On the server the secret lands in `/etc/rebreak-handshake-watcher.env` (chmod 600).
+This file is written during deploy; never commit it.
+
+Generate the value (user action, once):
+```bash
+openssl rand -hex 16
+```
+Then enter it in Infisical under HANDSHAKE_SECRET.
+
+## Deploy steps (in order)
+
+1. User generates HANDSHAKE_SECRET via `openssl rand -hex 16` and enters it in Infisical
+2. Deploy the nginx config (ops/nginx/dns.rebreak.org.conf → /etc/nginx/sites-available/)
+3. Check with nginx -t, then reload (user GO required)
+4. Verify: `curl -v -H "accept: application/dns-json" "https://dns.rebreak.org/dns-query/TESTTOKEN?name=example.com&type=A"` → no 404
+5. Check the AdGuard QueryLog: the CP field must contain "TESTTOKEN"
+6. Deploy watcher.py: /opt/rebreak-handshake-watcher/watcher.py
+7. Write the EnvironmentFile: /etc/rebreak-handshake-watcher.env (chmod 600)
+8. Deploy the systemd unit: /etc/systemd/system/rebreak-handshake-watcher.service
+9. systemctl daemon-reload + systemctl enable + systemctl start (user GO required)
+10. Verify: journalctl -u rebreak-handshake-watcher -f → should start and tail
+
+## File overview
+
+| File | Target on server | Description |
+|------|------------------|-------------|
+| ops/nginx/dns.rebreak.org.conf | /etc/nginx/sites-available/dns.rebreak.org | nginx vhost with prefix match |
+| ops/mdm/adguard-handshake-watcher/watcher.py | /opt/rebreak-handshake-watcher/watcher.py | Python watcher |
+| ops/mdm/adguard-handshake-watcher/rebreak-handshake-watcher.service | /etc/systemd/system/rebreak-handshake-watcher.service | systemd unit |
+
+## Verify checklist before nginx -s reload
+
+See the section "Risks + verify checklist" in this document.
+
+## Risks + verify checklist
+
+### Before `nginx -t` + `systemctl reload nginx`
+
+Follow the order. Do not skip any step.
+
+**1. Snapshot the current config (rollback basis)**
+```bash
+ssh rebreak-mdm "cp /etc/nginx/sites-available/dns.rebreak.org /etc/nginx/sites-available/dns.rebreak.org.bak.$(date +%Y%m%d_%H%M%S)"
+```
+If the config does not exist yet, no snapshot is needed; just create the new file.
+
+**2. Verify the AdGuard port**
+The config assumes that AdGuard DoH runs on `127.0.0.1:3000`.
+Check the actual port before deploying:
+```bash
+ssh rebreak-mdm "docker ps | grep adguard"
+ssh rebreak-mdm "ss -tlnp | grep 3000"
+# or
+ssh rebreak-mdm "docker exec adguardhome cat /opt/adguardhome/conf/AdGuardHome.yaml | grep -A10 'bind_port\|https_port\|dns:'"
+```
+Adjust the port in dns.rebreak.org.conf if it differs.
+
+**3. Check the TLS cert path**
+```bash
+ssh rebreak-mdm "ls /etc/letsencrypt/live/dns.rebreak.org/"
+```
+If no cert exists for dns.rebreak.org:
+```bash
+ssh rebreak-mdm "certbot certonly --nginx -d dns.rebreak.org --dry-run"
+# dry-run first, then without --dry-run if ok
+```
+Rate limit: Let's Encrypt issues at most 5 certificates per week for the same exact set of hostnames.
+
+**4. nginx -t dry run**
+```bash
+ssh rebreak-mdm "nginx -t"
+```
+Must print `syntax is ok` + `test is successful`. On error: fix the config, do not continue.
+
+**5. Rollback plan**
+If DoH requests fail after the reload (DNS breaks for enrolled devices!):
+```bash
+ssh rebreak-mdm "cp /etc/nginx/sites-available/dns.rebreak.org.bak.<DATUM> /etc/nginx/sites-available/dns.rebreak.org && systemctl reload nginx"
+```
+
+### Risks
+
+| Risk | Impact | Mitigation |
+|------|--------|------------|
+| Wrong AdGuard port | nginx returns 502, all DoH queries fail | Verify the port before deploy (step 2) |
+| TLS cert missing for dns.rebreak.org | nginx does not start | Issue the cert before deploy (step 3) |
+| Path stripping caused by wrong proxy_pass syntax | CP stays empty, handshake never arrives | Current config uses `proxy_pass http://...;` without a suffix, which is correct |
+| querylog field CP has a different name (version-dependent) | watcher does not detect ClientIDs | After deploy, run a test query and `grep TESTTOKEN querylog.json` to see which key carries the token |
+| HANDSHAKE_SECRET in git | Credential leak | Secret comes from Infisical, EnvironmentFile is .gitignored |
+| watcher crashes on malformed JSON | a single line is skipped | watcher wraps json.loads in try/except, no crash |
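
Note on the backend step in the flow diagram of DOH_CLIENTID_HANDSHAKE.md: the three status transitions can be modelled in a few lines. This is only an illustrative Python sketch. The real backend is Nitro/Nuxt and is not part of this commit, so the `ProtectedDevice` dataclass, its field names, and `apply_handshake` are hypothetical stand-ins. The response keys (`status`, `statusChanged`, `ignored`) are the ones watcher.py reads.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ProtectedDevice:
    # Hypothetical in-memory stand-in for the ProtectedDevice row.
    dns_token: str
    status: str  # "pending", "active" or "revoked"
    installed_at: Optional[datetime] = None
    last_dns_query_at: Optional[datetime] = None


def apply_handshake(device: ProtectedDevice, now: datetime) -> dict:
    """Apply one handshake to a device and return the response body."""
    if device.status == "revoked":
        # Revoked devices are acknowledged but otherwise ignored.
        return {"ignored": True}
    if device.status == "pending":
        # First DNS query seen for this token: activate the device.
        device.status = "active"
        device.installed_at = now
        return {"status": "active", "statusChanged": True}
    # Already active: only refresh the last-seen timestamp.
    device.last_dns_query_at = now
    return {"status": device.status, "statusChanged": False}


now = datetime(2026, 5, 15, 12, 0, tzinfo=timezone.utc)
device = ProtectedDevice(dns_token="abc123def456abc1", status="pending")
first = apply_handshake(device, now)
second = apply_handshake(device, now)
print(first, second)
```

Because the second call only touches a timestamp, a 60s cooldown on the watcher side loses no state information; it only coarsens `lastDnsQueryAt`.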
--- a/ops/mdm/RUNBOOK.md
+++ b/ops/mdm/RUNBOOK.md
@@ -159,6 +159,43 @@ ssh rebreak-mdm "ufw allow from 1.2.3.4 to any port 22"
 ssh rebreak-mdm "df -h && free -h && docker stats --no-stream"
 ```

+## Handshake-Watcher (DoH ClientID → Backend)
+
+### Check status
+```bash
+ssh rebreak-mdm "systemctl status rebreak-handshake-watcher"
+```
+
+### Live logs
+```bash
+ssh rebreak-mdm "journalctl -u rebreak-handshake-watcher -f"
+```
+
+### Restart
+```bash
+ssh rebreak-mdm "systemctl restart rebreak-handshake-watcher"
+```
+
+### EnvironmentFile path (secrets, chmod 600)
+```
+/etc/rebreak-handshake-watcher.env
+```
+Content (never commit, comes from Infisical):
+```
+HANDSHAKE_SECRET=<32hex>
+BACKEND_URL=https://staging.rebreak.org
+QUERYLOG_PATH=/opt/adguardhome/data/querylog.json
+```
+
+### Deploy watcher code (after a code change)
+```bash
+scp ops/mdm/adguard-handshake-watcher/watcher.py rebreak-mdm:/opt/rebreak-handshake-watcher/watcher.py
+ssh rebreak-mdm "systemctl restart rebreak-handshake-watcher"
+```
+
+### Full architecture docs
+`ops/mdm/DOH_CLIENTID_HANDSHAKE.md`
+
 ## Troubleshooting

 ### nanomdm does not start
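
Note on the functional check before starting the watcher: the verify commands in this commit send `name`/`type` query parameters with `accept: application/dns-json`. If the endpoint only answers wire-format DoH (RFC 8484), the request needs a base64url-encoded `dns` parameter instead. A minimal sketch of building such a URL with the token in the path; the helper names `build_dns_query` and `build_doh_url` are hypothetical, and whether AdGuard accepts `TESTTOKEN` as a ClientID is not verified here.

```python
import base64
import struct


def build_dns_query(name: str, qtype: int = 1) -> bytes:
    """Encode a minimal DNS query (one question, class IN) in wire format."""
    # Header: ID=0, flags=0x0100 (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A) and QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def build_doh_url(host: str, token: str, name: str) -> str:
    """Return an RFC 8484 GET URL that carries the token as a path segment."""
    wire = build_dns_query(name)
    # RFC 8484 requires base64url without padding in the `dns` parameter.
    dns_param = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"https://{host}/dns-query/{token}?dns={dns_param}"


url = build_doh_url("dns.rebreak.org", "TESTTOKEN", "example.com")
print(url)
```

The printed URL can be passed to `curl -s -o /dev/null -w "%{http_code}\n" -H "accept: application/dns-message" "<url>"`, followed by a grep for the token in querylog.json.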
--- a/ops/mdm/adguard-handshake-watcher/rebreak-handshake-watcher.service
+++ b/ops/mdm/adguard-handshake-watcher/rebreak-handshake-watcher.service
@@ -0,0 +1,42 @@
+[Unit]
+Description=ReBreak AdGuard Handshake Watcher
+Documentation=https://github.com/chahinebrini/rebreak-monorepo
+# Start after the network and the Docker daemon (which hosts AdGuard) are up.
+After=network-online.target docker.service
+Wants=network-online.target
+
+[Service]
+Type=simple
+User=root
+WorkingDirectory=/opt/rebreak-handshake-watcher
+
+# ── Secrets via Infisical ────────────────────────────────────────────────────
+# HANDSHAKE_SECRET must be injected at runtime.
+# On this server, load it from Infisical via a wrapper or
+# write it into /etc/rebreak-handshake-watcher.env (chmod 600, root only)
+# during deploy. The .env file is gitignored — never committed.
+#
+# Format of /etc/rebreak-handshake-watcher.env:
+#   HANDSHAKE_SECRET=<32hex from Infisical>
+#   BACKEND_URL=https://staging.rebreak.org
+#   QUERYLOG_PATH=/opt/adguardhome/data/querylog.json
+#
+EnvironmentFile=/etc/rebreak-handshake-watcher.env
+
+ExecStart=/usr/bin/python3 /opt/rebreak-handshake-watcher/watcher.py
+
+# Restart on any exit (crash, SIGKILL, etc.) after 5s
+Restart=always
+RestartSec=5s
+
+# Logging goes to journald automatically (no extra config needed)
+StandardOutput=journal
+StandardError=journal
+SyslogIdentifier=rebreak-handshake-watcher
+
+# Harden: no new privileges, private /tmp
+NoNewPrivileges=true
+PrivateTmp=true
+
+[Install]
+WantedBy=multi-user.target
--- a/ops/mdm/adguard-handshake-watcher/watcher.py
+++ b/ops/mdm/adguard-handshake-watcher/watcher.py
@@ -0,0 +1,296 @@
+#!/usr/bin/env python3
+"""
+adguard-handshake-watcher
+=========================
+Tails AdGuard Home's querylog.json (rotated) and fires a POST to the
+rebreak backend's /api/devices/protected/handshake endpoint whenever
+a DNS query contains a non-empty ClientID (= dnsToken of a protected device).
+
+Environment variables (required / optional):
+  HANDSHAKE_SECRET  — shared secret, sent as x-handshake-secret header
+                      (required; provisioned via Infisical, see RUNBOOK)
+  BACKEND_URL       — base URL of the backend, no trailing slash
+                      default: https://staging.rebreak.org
+  QUERYLOG_PATH     — path to AdGuard's querylog.json
+                      default: /opt/adguardhome/data/querylog.json
+
+Per-token in-memory cooldown: 60 seconds.
+Only one POST is fired per token per minute even if the browser hammers DoH
+at 10+ req/s. This keeps backend write-pressure negligible.
+
+Log-rotation safety:
+  AdGuard rotates querylog.json by renaming and creating a new file.
+  The watcher detects EOF + inode change and re-opens the new file.
+  Polling interval: 1 second.
+"""
+
+import json
+import logging
+import os
+import sys
+import time
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Optional
+
+import urllib.request
+import urllib.error
+
+# ── Logging ──────────────────────────────────────────────────────────────────
+
+logging.basicConfig(
+    level=logging.INFO,
+    format="%(asctime)s [%(levelname)s] %(message)s",
+    datefmt="%Y-%m-%dT%H:%M:%S%z",  # local time with UTC offset; a bare "Z" would mislabel it as UTC
+)
+log = logging.getLogger("handshake-watcher")
+
+# ── Config ───────────────────────────────────────────────────────────────────
+
+HANDSHAKE_SECRET: str = os.environ.get("HANDSHAKE_SECRET", "")
+BACKEND_URL: str = os.environ.get("BACKEND_URL", "https://staging.rebreak.org").rstrip("/")
+QUERYLOG_PATH: str = os.environ.get("QUERYLOG_PATH", "/opt/adguardhome/data/querylog.json")
+
+COOLDOWN_SECONDS: int = 60      # minimum gap between two POSTs for the same token
+POLL_INTERVAL: float = 1.0      # seconds between file-tail polls
+
+if not HANDSHAKE_SECRET:
+    log.error("HANDSHAKE_SECRET env var is not set — cannot authenticate to backend. Exiting.")
+    sys.exit(1)
+
+# ── State ────────────────────────────────────────────────────────────────────
+
+# token -> time.monotonic() reading of the last POST that started a cooldown
+_last_fired: dict[str, float] = {}
+
+
+def _cooldown_ok(token: str) -> bool:
+    """Returns True if we have not fired for this token within the cooldown window."""
+    last = _last_fired.get(token)
+    if last is None:
+        return True
+    return (time.monotonic() - last) >= COOLDOWN_SECONDS
+
+
+def _mark_fired(token: str) -> None:
+    _last_fired[token] = time.monotonic()
+
+
+# ── HTTP ─────────────────────────────────────────────────────────────────────
+
+def post_handshake(token: str) -> None:
+    """
+    POST /api/devices/protected/handshake
+    Body: { "token": "<32hex>" }
+    Header: x-handshake-secret: <HANDSHAKE_SECRET>
+
+    Handles gracefully:
+      401: secret wrong (log error, mark fired, do not crash)
+      404: token unknown (log warning, mark fired to avoid spam)
+      5xx: backend error (log error, do NOT mark fired; the next query line for this token retries)
+    """
+    url = f"{BACKEND_URL}/api/devices/protected/handshake"
+    payload = json.dumps({"token": token}).encode("utf-8")
+    req = urllib.request.Request(
+        url,
+        data=payload,
+        method="POST",
+        headers={
+            "Content-Type": "application/json",
+            "x-handshake-secret": HANDSHAKE_SECRET,
+        },
+    )
+
+    try:
+        with urllib.request.urlopen(req, timeout=10) as resp:
+            body = resp.read().decode("utf-8", errors="replace")
+            data = json.loads(body) if body else {}
+            if data.get("ignored"):
+                log.debug("token=%s: backend ignored (revoked/inactive)", token)
+            else:
+                status_changed = data.get("statusChanged", False)
+                status = data.get("status", "?")
+                if status_changed:
+                    log.info("token=%s: status → %s (changed)", token, status)
+                else:
+                    log.debug("token=%s: lastDnsQueryAt updated, status=%s", token, status)
+            _mark_fired(token)
+
+    except urllib.error.HTTPError as exc:
+        body_bytes = exc.read() if exc.fp else b""
+        body_str = body_bytes.decode("utf-8", errors="replace")
+
+        if exc.code == 401:
+            log.error(
+                "token=%s: 401 UNAUTHORIZED — HANDSHAKE_SECRET mismatch. "
+                "Check Infisical secret. body=%s",
+                token, body_str,
+            )
+            # Still mark fired — no point spamming a broken secret every second.
+            _mark_fired(token)
+
+        elif exc.code == 404:
+            log.warning(
+                "token=%s: 404 TOKEN_NOT_FOUND — token not in DB yet (pending provisioning?). "
+                "Will retry after cooldown. body=%s",
+                token, body_str,
+            )
+            # Mark fired so an unknown token is retried once per cooldown
+            # window instead of on every query line.
+            _mark_fired(token)
+
+        else:
+            # 5xx or other: log but do NOT mark fired, so the next query line for this token retries
+            log.error(
+                "token=%s: HTTP %d from backend. Will retry. body=%s",
+                token, exc.code, body_str,
+            )
+
+    except (urllib.error.URLError, OSError, json.JSONDecodeError) as exc:
+        log.error("token=%s: request failed (%s). Will retry.", token, exc)
+
+
+# ── QueryLog parsing ─────────────────────────────────────────────────────────
+#
+# AdGuard Home writes querylog.json as a sequence of newline-delimited JSON
+# objects (NDJSON), one per line. Each object looks like:
+#
+#   {
+#     "T": "2026-05-15T12:34:56.789Z",    // timestamp (ISO8601)
+#     "QH": "example.com",                 // queried hostname
+#     "QT": "A",                           // query type
+#     "QC": "IN",                          // query class
+#     "CP": "abc123def456",                // ClientID (the dnsToken, if path-based CID)
+#     "Result": { ... },
+#     "Elapsed": 123456,
+#     "IP": "127.0.0.1",
+#     ...
+#   }
+#
+# The "CP" field ("Client Protocol" / ClientID parameter) is set by AdGuard
+# when a ClientID is embedded in the DNS-over-HTTPS URL path:
+#   /dns-query/<clientid>
+#
+# References:
+#   https://adguard-dns.io/kb/general/dns-filtering-syntax/
+#   AdGuard Home source: querylog/querylog_file.go, dnsforward/client_id.go
+#
+# NOTE: the field name "CP" is an assumption for AdGuard Home v0.107.x and
+# is unverified; "CP" may hold the client protocol (e.g. "doh") and the
+# ClientID may sit under another key such as "CID". Check a real entry with:
+#   docker exec adguardhome tail -f /opt/adguardhome/data/querylog.json
+# after: curl -s -H "accept: application/dns-json" "https://dns.rebreak.org/dns-query/TESTTOKEN?name=example.com&type=A"
+
+def extract_client_id(line: str) -> Optional[str]:
+    """
+    Parse one NDJSON line from querylog.json.
+    Returns the ClientID string if non-empty, else None.
+    """
+    line = line.strip()
+    if not line:
+        return None
+    try:
+        entry = json.loads(line)
+    except json.JSONDecodeError:
+        return None
+
+    cid = entry.get("CP", "")
+    if isinstance(cid, str) and cid.strip():
+        return cid.strip()
+    return None
+
+
+# ── File tailing with rotation detection ─────────────────────────────────────
+
+class RotationSafeTailer:
+    """
+    Tails a file line-by-line. Detects log rotation by monitoring inode.
+    On rotation: re-opens the path immediately and reads the new file from
+    offset 0 on the next poll cycle.
+    """
+
+    def __init__(self, path: str) -> None:
+        self.path = Path(path)
+        self._file = None
+        self._inode: Optional[int] = None
+        self._open()
+
+    def _open(self) -> None:
+        if self._file:
+            try:
+                self._file.close()
+            except OSError:
+                pass
+            self._file = None
+
+        try:
+            self._file = open(self.path, "r", encoding="utf-8", errors="replace")
+            self._inode = self.path.stat().st_ino
+            # No seek here. The caller skips history once on startup via
+            # _seek_to_end_on_first_open(); after rotation we read from offset 0.
+            log.info("Opened querylog: %s (inode=%d)", self.path, self._inode)
+        except FileNotFoundError:
+            log.warning("querylog not found: %s — will retry", self.path)
+            self._file = None
+            self._inode = None
+
+    def _seek_to_end_on_first_open(self) -> None:
+        """Call once after initial _open() to skip historical entries."""
+        if self._file:
+            self._file.seek(0, 2)  # SEEK_END
+
+    def readline(self) -> Optional[str]:
+        """
+        Returns the next line or None if no new data.
+        Handles rotation transparently.
+        """
+        if self._file is None:
+            self._open()
+            return None
+
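+        pos = self._file.tell()
+        line = self._file.readline()
+        if line.endswith("\n"):
+            return line
+        if line:
+            # Partial line (writer is mid-append): rewind and retry next poll.
+            self._file.seek(pos)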
+
+        # EOF — check for rotation
+        try:
+            current_inode = self.path.stat().st_ino
+        except FileNotFoundError:
+            log.warning("querylog disappeared (rotation in progress?) — will re-open")
+            self._open()
+            return None
+
+        if current_inode != self._inode:
+            log.info("querylog rotation detected (inode %d -> %d) — re-opening", self._inode, current_inode)
+            self._open()
+            # Don't seek to end on rotation — read from beginning to catch
+            # any entries written right after rotation.
+
+        return None
+
+
+# ── Main loop ─────────────────────────────────────────────────────────────────
+
+def main() -> None:
+    log.info(
+        "Starting handshake-watcher | backend=%s | querylog=%s | cooldown=%ds",
+        BACKEND_URL, QUERYLOG_PATH, COOLDOWN_SECONDS,
+    )
+
+    tailer = RotationSafeTailer(QUERYLOG_PATH)
+    tailer._seek_to_end_on_first_open()
+
+    while True:
+        line = tailer.readline()
+
+        if line:
+            token = extract_client_id(line)
+            if token and _cooldown_ok(token):
+                log.info("Firing handshake for token=%s", token)
+                post_handshake(token)
+        else:
+            time.sleep(POLL_INTERVAL)
+
+
+if __name__ == "__main__":
+    main()
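
Note on the rotation handling in watcher.py: the tailer relies on the fact that a rename-and-recreate rotation leaves the path pointing at a new inode. The following standalone sketch reproduces that on a temporary directory, assuming a POSIX filesystem; the helper name `rotated` is hypothetical and not part of the watcher.

```python
import os
import tempfile


def rotated(path: str, known_inode: int) -> bool:
    """Return True when `path` now points at a different inode."""
    return os.stat(path).st_ino != known_inode


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "querylog.json")
    with open(log_path, "w", encoding="utf-8") as fh:
        fh.write('{"QH": "example.com"}\n')
    inode_before = os.stat(log_path).st_ino
    # Nothing has happened yet, so the inode still matches.
    unchanged = rotated(log_path, inode_before)

    # Simulate the rotation: rename the live file, then create a fresh
    # one under the original name.
    os.rename(log_path, log_path + ".1")
    with open(log_path, "w", encoding="utf-8") as fh:
        fh.write('{"QH": "example.org"}\n')
    # The renamed file still holds the old inode, so the new file gets another.
    changed = rotated(log_path, inode_before)

print(unchanged, changed)
```

A rotation that truncates the file in place instead of renaming it would keep the inode, so this check would not notice it; the watcher assumes the rename style described in the docs.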
--- a/ops/nginx/dns.rebreak.org.conf
+++ b/ops/nginx/dns.rebreak.org.conf
@@ -0,0 +1,70 @@
+# nginx vhost: dns.rebreak.org
+# Deployed on: rebreak-mdm (178.105.101.137)
+# TLS termination for AdGuard Home DoH endpoint.
+#
+# CRITICAL: location uses prefix-match (^~) NOT exact-match (=).
+# AdGuard Home parses the ClientID from the URL path natively:
+#   /dns-query           -> normal DoH query (no ClientID)
+#   /dns-query/<cid>     -> DoH query, AdGuard extracts <cid> into QueryLog.ClientID
+#
+# The full path MUST be forwarded to AdGuard unchanged (no prefix stripping).
+# AdGuard reads the path segment after /dns-query/ as the ClientID.
+# Stripping it (e.g. proxy_pass http://.../ with a trailing slash) would break CID detection.
+#
+# Assumption (unverified): AdGuard Home serves UI and DoH together over plain
+# HTTP on 127.0.0.1:3000; other setups use a separate port such as 5353/8053.
+# Verify the actual DoH port on the server:
+#   docker exec adguardhome cat /opt/adguardhome/conf/AdGuardHome.yaml | grep -A5 dns:
+# Adjust the proxy_pass port below to match the actual AdGuard DoH port.
+# Also confirm that AdGuard answers DoH over plain HTTP at all; this may need
+# an explicit opt-in in AdGuardHome.yaml (believed to be `allow_unencrypted_doh`).
+# If AdGuard runs in docker: verify with `docker ps | grep adguard`.
+
+server {
+    listen 80;
+    server_name dns.rebreak.org;
+    return 301 https://dns.rebreak.org$request_uri;
+}
+
+server {
+    listen 443 ssl http2;
+    server_name dns.rebreak.org;
+
+    ssl_certificate     /etc/letsencrypt/live/dns.rebreak.org/fullchain.pem;
+    ssl_certificate_key /etc/letsencrypt/live/dns.rebreak.org/privkey.pem;
+    include             /etc/letsencrypt/options-ssl-nginx.conf;
+    ssl_dhparam         /etc/letsencrypt/ssl-dhparams.pem;
+
+    # ── DoH endpoint — prefix match, path forwarded unchanged ───────────────
+    # ^~ wins over regex locations. Catches both:
+    #   /dns-query       (plain DoH)
+    #   /dns-query/abc1  (DoH with ClientID — AdGuard parses the suffix)
+    #
+    # proxy_pass has no URI part (nothing after host:port), so nginx forwards
+    # the original request URI as-is, preserving /dns-query/<cid> verbatim.
+    location ^~ /dns-query {
+        proxy_pass          http://127.0.0.1:3000;
+        proxy_http_version  1.1;
+
+        proxy_set_header    Host              $host;
+        proxy_set_header    X-Real-IP         $remote_addr;
+        proxy_set_header    X-Forwarded-For   $proxy_add_x_forwarded_for;
+        proxy_set_header    X-Forwarded-Proto $scheme;
+
+        # DoH requests are short-lived; tight timeouts are fine.
+        proxy_connect_timeout  5s;
+        proxy_send_timeout    10s;
+        proxy_read_timeout    10s;
+    }
+
+    # ── Health check (for monitoring / GH Actions deploy verify) ────────────
+    location /health {
+        default_type text/plain;
+        return 200 "OK\n";
+    }
+
+    # ── Block everything else ────────────────────────────────────────────────
+    location / {
+        return 404;
+    }
+}
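
Note on the proxy_pass rule this vhost depends on: whether nginx rewrites the path is decided by the presence of a URI part after host:port. The sketch below is a simplified Python model of that rule for a prefix location, not nginx's implementation; it ignores URI normalization and variables in proxy_pass, and the function name `upstream_uri` is hypothetical.

```python
from typing import Optional


def upstream_uri(location_prefix: str, proxy_pass_uri: Optional[str], request_uri: str) -> str:
    """Simplified model of the URI nginx sends upstream for a prefix location.

    proxy_pass_uri is None for `proxy_pass http://host:port;` (no URI part)
    and a string such as "/" for `proxy_pass http://host:port/;`.
    """
    if proxy_pass_uri is None:
        # No URI part: the request URI is forwarded as the client sent it.
        return request_uri
    path, sep, query = request_uri.partition("?")
    if not path.startswith(location_prefix):
        raise ValueError("request does not match the location prefix")
    # URI part present: the matched prefix is replaced by that URI.
    return proxy_pass_uri + path[len(location_prefix):] + sep + query


request = "/dns-query/abc123?dns=AAAB"
print(upstream_uri("/dns-query", None, request))          # form used in this vhost
print(upstream_uri("/dns-query", "/dns-query", request))  # prefix mapped onto itself
print(upstream_uri("/dns-query", "/", request))           # prefix stripped
```

Only the last form drops `/dns-query` from the path, which is the case the header comment warns about; the URI-less form avoids the replacement step altogether.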