feat(ops/mdm): AdGuard ClientID handshake — nginx + watcher
End-to-end DoH-to-backend wiring for Mac auto-activation:

  Mac → dns.rebreak.org/dns-query/<token> → nginx → AdGuard
      → querylog.json (CP field) → watcher.py → POST /handshake → backend

- ops/nginx/dns.rebreak.org.conf: vhost with `location ^~ /dns-query`
  prefix-match (not exact). proxy_pass without trailing slash preserves
  the full path so AdGuard parses the ClientID natively.
- watcher.py: NDJSON tail with inode-based rotation safety, per-token 60s
  in-memory cooldown, urllib (no external deps), graceful 401/404/5xx
- rebreak-handshake-watcher.service: systemd unit, EnvironmentFile with
  chmod 600 (HANDSHAKE_SECRET never in git), NoNewPrivileges + PrivateTmp
- DOH_CLIENTID_HANDSHAKE.md: architecture + flow diagram + risk table
- RUNBOOK.md: status/logs/restart commands + deploy ordering

Not yet deployed. Verify-checklist before `nginx -s reload`:

1. confirm AdGuard DoH port (config assumes 127.0.0.1:3000)
2. confirm TLS cert exists for dns.rebreak.org
3. snapshot current nginx config
4. `nginx -t` dry-run
5. functional curl + grep CP in querylog before starting watcher
parent 42a8223bfc
commit db7875fb34
213
ops/mdm/DOH_CLIENTID_HANDSHAKE.md
Normal file
@ -0,0 +1,213 @@
# DoH ClientID Handshake: Architecture

## Flow diagram

```
Mac/iPhone (DoH profile: dns.rebreak.org/dns-query/<dnsToken>)
    |
    | HTTPS GET /dns-query/<dnsToken>?dns=...
    v
nginx (rebreak-mdm, 178.105.101.137)
    | location ^~ /dns-query -> proxy_pass http://127.0.0.1:3000
    | Path is forwarded UNCHANGED (no stripping)
    v
AdGuard Home (127.0.0.1:3000)
    | Parses the path segment after /dns-query/ as the ClientID
    | Writes a QueryLog entry with field "CP": "<dnsToken>"
    v
querylog.json (NDJSON, /opt/adguardhome/data/querylog.json)
    |
    | inotify-like polling (1s), rotation-safe
    v
watcher.py (systemd service: rebreak-handshake-watcher)
    | - reads new lines
    | - extracts the CP field (ClientID = dnsToken)
    | - in-memory cooldown: 1 POST per token per 60s
    v
POST https://staging.rebreak.org/api/devices/protected/handshake
    Header: x-handshake-secret: <HANDSHAKE_SECRET>
    Body:   { "token": "<dnsToken>" }
    |
    v
Backend (Nitro/Nuxt, rebreak-server)
    | - checks the shared secret
    | - looks up ProtectedDevice by dnsToken
    | - pending → status=active, installedAt=NOW   [statusChanged=true]
    | - active  → lastDnsQueryAt=NOW               [statusChanged=false]
    | - revoked → 200 { ignored: true }            [silent]
    v
Supabase Postgres + Realtime
    |
    v
App UI (useProtectedDevicesRealtime hook)
    Shows "Schutz aktiv" (protection active) without a manual reload
```
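
The backend step of the flow can be read as a small state machine. The following is a minimal sketch of those three transitions, not the backend's actual code: `ProtectedDevice` and `apply_handshake` are hypothetical names, and the response keys (`status`, `statusChanged`, `ignored`) are taken from what the watcher reads in this commit.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ProtectedDevice:
    """Hypothetical stand-in for the backend's ProtectedDevice record."""
    dns_token: str
    status: str  # "pending", "active" or "revoked"
    installed_at: Optional[datetime] = None
    last_dns_query_at: Optional[datetime] = None


def apply_handshake(device: ProtectedDevice, now: datetime) -> dict:
    """Apply one handshake and return the response body the watcher would see."""
    if device.status == "revoked":
        # Revoked devices are acknowledged silently and never reactivated.
        return {"ignored": True}
    if device.status == "pending":
        # First DNS query seen for this token: activate the device.
        device.status = "active"
        device.installed_at = now
        return {"status": "active", "statusChanged": True}
    # Already active: only refresh the activity timestamp.
    device.last_dns_query_at = now
    return {"status": device.status, "statusChanged": False}
```

Only the pending → active transition reports `statusChanged: true`, which is the one case the watcher logs at INFO level.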

## Why a ClientID path and not a query parameter

AdGuard Home has three ways to identify a client:
1. IP address: does not work when all clients sit behind the same Hetzner IP
2. `?clientid=<token>` query parameter: not part of AdGuard's native ClientID implementation
3. Path segment `/dns-query/<clientid>`: natively supported, ends up in the QueryLog field `CP`

The path method is the only one that needs no per-client AdGuard configuration and
is still identifiable in the QueryLog. The device simply embeds its dnsToken in the
DoH URL; the MDM/DNS profile on the device contains the full URL:
`https://dns.rebreak.org/dns-query/<dnsToken>`

## nginx: diff vs. old configuration

Old config (exact match):
```nginx
location = /dns-query {
    proxy_pass http://127.0.0.1:3000/dns-query;
    ...
}
```

Problems with exact match:
- `/dns-query/abc123` does NOT match → nginx returns 404
- AdGuard never receives requests that carry a ClientID
- The CP field in the querylog always stays empty

New config (prefix match, path unchanged):
```nginx
location ^~ /dns-query {
    proxy_pass http://127.0.0.1:3000;
    ...
}
```

Important:
- `^~` marks a prefix location that takes priority over regexes: if it is the longest matching prefix, nginx does not check any regex locations
- `proxy_pass http://127.0.0.1:3000;` WITHOUT a trailing slash and WITHOUT a path suffix →
  nginx passes the request URI upstream exactly as the client sent it. So `/dns-query/abc123` arrives as
  `/dns-query/abc123` at AdGuard. No stripping, no rewrite.
- With a URI part in `proxy_pass`, nginx REPLACES the matched location prefix with that URI.
  `proxy_pass http://127.0.0.1:3000/;` would turn `/dns-query/abc123` into `//abc123`, which drops the
  `/dns-query` prefix and breaks ClientID detection. `proxy_pass http://127.0.0.1:3000/dns-query;` would
  swap the prefix for itself, so the path would survive, but only in nginx's normalized form. The variant
  without a URI part avoids both.
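
The `proxy_pass` rule is easy to get wrong, so here is a rough Python model of it. This is a sketch of the documented nginx behavior for prefix locations only; `upstream_uri` is a hypothetical helper, and it deliberately ignores URI normalization, regex locations and variables in `proxy_pass`.

```python
from urllib.parse import urlsplit


def upstream_uri(location_prefix: str, proxy_pass: str, request_uri: str) -> str:
    """Approximate the URI nginx sends upstream for a prefix location."""
    pass_path = urlsplit(proxy_pass).path
    if pass_path == "":
        # proxy_pass has no URI part: the request URI is forwarded unchanged.
        return request_uri
    path, sep, query = request_uri.partition("?")
    # proxy_pass has a URI part: it replaces the matched location prefix.
    return pass_path + path[len(location_prefix):] + sep + query


# No URI part: ClientID path and query string survive verbatim.
print(upstream_uri("/dns-query", "http://127.0.0.1:3000", "/dns-query/abc123?dns=AAAB"))
# Trailing slash: the /dns-query prefix is replaced by "/".
print(upstream_uri("/dns-query", "http://127.0.0.1:3000/", "/dns-query/abc123"))
```

The second call shows why a stray trailing slash matters: the upstream no longer sees a path that starts with `/dns-query`.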

Full config: `ops/nginx/dns.rebreak.org.conf`

## AdGuard QueryLog format

AdGuard Home writes `/opt/adguardhome/data/querylog.json` as NDJSON
(newline-delimited JSON). Relevant fields:

```json
{
  "T": "2026-05-15T12:34:56.789Z",
  "QH": "example.com",
  "QT": "A",
  "CP": "abc123def456abc1",
  "Result": { "IsFiltered": false },
  "Elapsed": 1234567,
  "IP": "127.0.0.1"
}
```

- `CP` = ClientID (only set when the request came in via the /dns-query/<cid> path). This field name is an assumption: AdGuard Home's log entry struct appears to use `CP` for the client protocol (e.g. `doh`) and `CID` for the ClientID, so check a real log line before relying on it.
- `QH` = query hostname (blocked domain → `IsFiltered: true`)
- `T` = timestamp, ISO 8601
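
Because the field that carries the ClientID may differ between AdGuard Home versions, a parser can be written to tolerate both candidates. The sketch below is an illustration, not the watcher's code: `client_id_from_line` and `PROTOCOL_VALUES` are hypothetical names, and the idea that `CID` may hold the ClientID while `CP` holds a protocol name is an assumption to verify against a real log line.

```python
import json
from typing import Optional

# Assumption: values that look like a transport protocol are not device tokens.
PROTOCOL_VALUES = {"doh", "dot", "doq", "dnscrypt"}


def client_id_from_line(line: str, fields=("CID", "CP")) -> Optional[str]:
    """Return the first plausible ClientID found in one NDJSON log line."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None  # malformed or partially written line: skip it
    if not isinstance(entry, dict):
        return None
    for name in fields:
        value = entry.get(name)
        if isinstance(value, str) and value.strip():
            candidate = value.strip()
            if candidate.lower() not in PROTOCOL_VALUES:
                return candidate
    return None
```

Checking `CID` first and rejecting protocol names keeps a value such as `doh` from being posted to the backend as if it were a token.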

The querylog.json is rotated once it exceeds a configured size
(default: 30 MB or after 24 h). AdGuard renames the current file and creates a
new one. watcher.py detects this by comparing inodes and re-opens the file.
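
The inode comparison can be sketched in a few lines. This is a simplified illustration of the technique, assuming a POSIX filesystem; `rotated` and `demo` are hypothetical helpers and not part of watcher.py.

```python
import os
import tempfile


def rotated(path: str, known_inode: int) -> bool:
    """True if `path` no longer points at the file we originally opened."""
    try:
        return os.stat(path).st_ino != known_inode
    except FileNotFoundError:
        # Mid-rotation: the old file was renamed, the new one is not there yet.
        return True


def demo() -> tuple:
    """Simulate a rename-and-recreate rotation in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "querylog.json")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("{}\n")
        inode = os.stat(path).st_ino
        before = rotated(path, inode)
        os.rename(path, path + ".1")  # what the log writer does on rotation
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("{}\n")
        after = rotated(path, inode)
    return before, after
```

Comparing inodes rather than file sizes also catches the case where the new file has already grown past the old read offset.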

## Secrets

HANDSHAKE_SECRET comes exclusively from Infisical (never in code/Git).

Infisical key: `HANDSHAKE_SECRET`
Infisical project: rebreak (Project-ID 14b11b35-ef59-4b8a-a16b-398f0cc3ad93)
Environments: staging (for staging.rebreak.org), production (for rebreak.org)

On the server the secret ends up in `/etc/rebreak-handshake-watcher.env` (chmod 600).
This file is written during deploy; never commit it.

Generate the value (user action, one-time):
```bash
openssl rand -hex 16
```
Then store it in Infisical under HANDSHAKE_SECRET.
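
Where `openssl` is not at hand, the same kind of value (16 random bytes as 32 hex characters) can be produced with Python's standard library. A minimal sketch; `generate_handshake_secret` is a hypothetical helper name.

```python
import secrets


def generate_handshake_secret(num_bytes: int = 16) -> str:
    """Return num_bytes of cryptographically strong randomness as lowercase hex."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print once, paste into Infisical, and do not store it anywhere else.
    print(generate_handshake_secret())
```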

## Deploy steps (in order)

1. User generates HANDSHAKE_SECRET via `openssl rand -hex 16` and stores it in Infisical
2. Deploy the nginx config (ops/nginx/dns.rebreak.org.conf → /etc/nginx/sites-available/)
3. Check with nginx -t, then reload (needs user GO)
4. Verify: `curl -v -H "accept: application/dns-json" "https://dns.rebreak.org/dns-query/TESTTOKEN?name=example.com&type=A"` → no 404
5. Check the AdGuard QueryLog: the CP field must contain "TESTTOKEN"
6. Deploy watcher.py: /opt/rebreak-handshake-watcher/watcher.py
7. Write the EnvironmentFile: /etc/rebreak-handshake-watcher.env (chmod 600)
8. Deploy the systemd unit: /etc/systemd/system/rebreak-handshake-watcher.service
9. systemctl daemon-reload + systemctl enable + systemctl start (needs user GO)
10. Verify: journalctl -u rebreak-handshake-watcher -f → should start and tail
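
The verify step only needs a non-404 answer, but a test query that AdGuard actually resolves is more likely to produce a QueryLog entry. The flow diagram shows the RFC 8484 GET form (`?dns=...`), so the sketch below builds such a URL by hand. It is an illustration under assumptions: `build_dns_query` and `doh_get_url` are hypothetical helpers, and whether the server answers the request is not something this snippet can confirm.

```python
import base64
import struct


def build_dns_query(name: str, qtype: int = 1) -> bytes:
    """Encode a minimal DNS query (one question, class IN) in wire format."""
    # Header: ID=0, flags=0x0100 (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    # QNAME is terminated by a zero-length label, then QTYPE and QCLASS=IN follow.
    return header + qname + b"\x00" + struct.pack("!HH", qtype, 1)


def doh_get_url(base_url: str, token: str, name: str) -> str:
    """Build an RFC 8484 GET URL with the token as ClientID path segment."""
    wire = build_dns_query(name)
    # RFC 8484 requires base64url without padding.
    dns_param = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{base_url}/dns-query/{token}?dns={dns_param}"


print(doh_get_url("https://dns.rebreak.org", "TESTTOKEN", "www.example.com"))
```

The printed URL can be fetched with `curl -H "accept: application/dns-message"`; the response body is binary DNS wire format, so the QueryLog is the easier place to confirm the result.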

## File overview

| File | Target on server | Description |
|------|------------------|-------------|
| ops/nginx/dns.rebreak.org.conf | /etc/nginx/sites-available/dns.rebreak.org | nginx vhost with prefix match |
| ops/mdm/adguard-handshake-watcher/watcher.py | /opt/rebreak-handshake-watcher/watcher.py | Python watcher |
| ops/mdm/adguard-handshake-watcher/rebreak-handshake-watcher.service | /etc/systemd/system/rebreak-handshake-watcher.service | systemd unit |

## Verify checklist before nginx -s reload

See the section "Risks + verify checklist" in this document.

## Risks + verify checklist

### Before `nginx -t` + `systemctl reload nginx`

Follow the order. Do not skip any step.

**1. Snapshot the current config (rollback basis)**
```bash
ssh rebreak-mdm "cp /etc/nginx/sites-available/dns.rebreak.org /etc/nginx/sites-available/dns.rebreak.org.bak.$(date +%Y%m%d_%H%M%S)"
```
If the config does not exist yet → no snapshot needed, just create the new file.

**2. Verify the AdGuard port**
The config assumes that AdGuard DoH runs on `127.0.0.1:3000`.
Check the actual port before deploying:
```bash
ssh rebreak-mdm "docker ps | grep adguard"
ssh rebreak-mdm "ss -tlnp | grep 3000"
# or
ssh rebreak-mdm "docker exec adguardhome cat /opt/adguardhome/conf/AdGuardHome.yaml | grep -A10 'bind_port\|https_port\|dns:'"
```
Adjust the port in dns.rebreak.org.conf if it differs.

**3. Check the TLS cert path**
```bash
ssh rebreak-mdm "ls /etc/letsencrypt/live/dns.rebreak.org/"
```
If no cert exists for dns.rebreak.org:
```bash
# --dry-run only works with the certonly and renew subcommands
ssh rebreak-mdm "certbot certonly --nginx -d dns.rebreak.org --dry-run"
# dry-run first, then without --dry-run if ok
```
Rate limit: Let's Encrypt issues at most 5 duplicate certificates (same set of hostnames) per week.

**4. nginx -t dry run**
```bash
ssh rebreak-mdm "nginx -t"
```
Must print `syntax is ok` + `test is successful`. On error: fix the config, do not continue.

**5. Rollback plan**
If DoH requests fail after the reload (DNS breaks for enrolled devices!):
```bash
ssh rebreak-mdm "cp /etc/nginx/sites-available/dns.rebreak.org.bak.<DATE> /etc/nginx/sites-available/dns.rebreak.org && systemctl reload nginx"
```

### Risks

| Risk | Impact | Mitigation |
|------|--------|------------|
| Wrong AdGuard port | nginx returns 502, all DoH queries fail | Verify the port before deploy (step 2) |
| TLS cert missing for dns.rebreak.org | nginx does not start | Issue the cert before deploy (step 3) |
| Path stripping caused by wrong proxy_pass syntax | CP stays empty, handshake never arrives | Current config uses `proxy_pass http://...;` without a suffix, which is correct |
| querylog field for the ClientID is not `CP` (version-dependent, possibly `CID`) | watcher does not recognize ClientIDs | After deploy, run a test query + `grep CP querylog.json` |
| HANDSHAKE_SECRET in git | Credential leak | Secret comes from Infisical, EnvironmentFile is .gitignored |
| watcher crashes on malformed JSON | a single line is skipped | watcher wraps json.loads in try/except, no crash |
@ -159,6 +159,43 @@ ssh rebreak-mdm "ufw allow from 1.2.3.4 to any port 22"
ssh rebreak-mdm "df -h && free -h && docker stats --no-stream"
```

## Handshake watcher (DoH ClientID → backend)

### Check status
```bash
ssh rebreak-mdm "systemctl status rebreak-handshake-watcher"
```

### Live logs
```bash
ssh rebreak-mdm "journalctl -u rebreak-handshake-watcher -f"
```

### Restart
```bash
ssh rebreak-mdm "systemctl restart rebreak-handshake-watcher"
```

### EnvironmentFile path (secrets, chmod 600)
```
/etc/rebreak-handshake-watcher.env
```
Contents (never commit, comes from Infisical):
```
HANDSHAKE_SECRET=<32hex>
BACKEND_URL=https://staging.rebreak.org
QUERYLOG_PATH=/opt/adguardhome/data/querylog.json
```
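
A quick pre-start check of the EnvironmentFile can catch the two most likely mistakes: permissions that are too open and a missing key. This is a hedged sketch, not part of the deployed tooling; `env_file_problems` and `demo` are hypothetical helpers, and the parser only handles plain `KEY=value` lines.

```python
import os
import stat
import tempfile


def env_file_problems(path: str, required=("HANDSHAKE_SECRET",)) -> list:
    """Return human-readable problems found in a systemd EnvironmentFile."""
    problems = []
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        # Any group/other permission bit means the secret is exposed.
        problems.append(f"mode {oct(mode)} is too open; expected 0o600")
    keys = set()
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if line and not line.startswith("#") and "=" in line:
                keys.add(line.split("=", 1)[0].strip())
    problems += [f"missing {key}" for key in required if key not in keys]
    return problems


def demo() -> tuple:
    """Check a throwaway file once with mode 600 and once with mode 644."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "watcher.env")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("HANDSHAKE_SECRET=0123\nBACKEND_URL=https://staging.rebreak.org\n")
        os.chmod(path, 0o600)
        ok = env_file_problems(path)
        os.chmod(path, 0o644)
        bad = env_file_problems(path)
    return ok, bad
```

On the server this would be called as `env_file_problems("/etc/rebreak-handshake-watcher.env")` by a user allowed to read the file.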

### Deploy watcher code (after a code change)
```bash
scp ops/mdm/adguard-handshake-watcher/watcher.py rebreak-mdm:/opt/rebreak-handshake-watcher/watcher.py
ssh rebreak-mdm "systemctl restart rebreak-handshake-watcher"
```

### Full architecture docs
`ops/mdm/DOH_CLIENTID_HANDSHAKE.md`

## Troubleshooting

### nanomdm does not start

@ -0,0 +1,42 @@
[Unit]
Description=ReBreak AdGuard Handshake Watcher
Documentation=https://github.com/chahinebrini/rebreak-monorepo
# Start after the network and the Docker daemon (which runs AdGuard) are up.
After=network-online.target docker.service
Wants=network-online.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/rebreak-handshake-watcher

# ── Secrets via Infisical ────────────────────────────────────────────────────
# HANDSHAKE_SECRET must be injected at runtime.
# On this server, load it from Infisical via a wrapper or
# write it into /etc/rebreak-handshake-watcher.env (chmod 600, root only)
# during deploy. The .env file is gitignored and never committed.
#
# Format of /etc/rebreak-handshake-watcher.env:
#   HANDSHAKE_SECRET=<32hex from Infisical>
#   BACKEND_URL=https://staging.rebreak.org
#   QUERYLOG_PATH=/opt/adguardhome/data/querylog.json
#
EnvironmentFile=/etc/rebreak-handshake-watcher.env

ExecStart=/usr/bin/python3 /opt/rebreak-handshake-watcher/watcher.py

# Restart on any exit (crash, SIGKILL, etc.) after 5s
Restart=always
RestartSec=5s

# Logging goes to journald automatically (no extra config needed)
StandardOutput=journal
StandardError=journal
SyslogIdentifier=rebreak-handshake-watcher

# Harden: no new privileges, private /tmp
NoNewPrivileges=true
PrivateTmp=true

[Install]
WantedBy=multi-user.target
296
ops/mdm/adguard-handshake-watcher/watcher.py
Normal file
@ -0,0 +1,296 @@
#!/usr/bin/env python3
"""
adguard-handshake-watcher
=========================
Tails AdGuard Home's querylog.json (rotated) and fires a POST to the
rebreak backend's /api/devices/protected/handshake endpoint whenever
a DNS query contains a non-empty ClientID (= dnsToken of a protected device).

Environment variables (required / optional):
  HANDSHAKE_SECRET  - shared secret, sent as x-handshake-secret header
                      (required; provisioned via Infisical, see RUNBOOK)
  BACKEND_URL       - base URL of the backend, no trailing slash
                      default: https://staging.rebreak.org
  QUERYLOG_PATH     - path to AdGuard's querylog.json
                      default: /opt/adguardhome/data/querylog.json

Per-token in-memory cooldown: 60 seconds.
  Only one POST is fired per token per minute even if the browser hammers DoH
  at 10+ req/s. This keeps backend write-pressure negligible.

Log-rotation safety:
  AdGuard rotates querylog.json by renaming and creating a new file.
  The watcher detects EOF + inode change and re-opens the new file.
  Polling interval: 1 second.
"""

import json
import logging
import os
import sys
import time
from pathlib import Path
from typing import Optional

import urllib.request
import urllib.error

# ── Logging ──────────────────────────────────────────────────────────────────

# The "Z" suffix in datefmt promises UTC, so format record times with gmtime.
logging.Formatter.converter = time.gmtime
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%SZ",
)
log = logging.getLogger("handshake-watcher")

# ── Config ───────────────────────────────────────────────────────────────────

HANDSHAKE_SECRET: str = os.environ.get("HANDSHAKE_SECRET", "")
BACKEND_URL: str = os.environ.get("BACKEND_URL", "https://staging.rebreak.org").rstrip("/")
QUERYLOG_PATH: str = os.environ.get("QUERYLOG_PATH", "/opt/adguardhome/data/querylog.json")

COOLDOWN_SECONDS: int = 60   # minimum gap between two POSTs for the same token
POLL_INTERVAL: float = 1.0   # seconds between file-tail polls

if not HANDSHAKE_SECRET:
    log.error("HANDSHAKE_SECRET env var is not set; cannot authenticate to backend. Exiting.")
    sys.exit(1)

# ── State ────────────────────────────────────────────────────────────────────

# token -> time.monotonic() reading of the last POST that started a cooldown
_last_fired: dict[str, float] = {}


def _cooldown_ok(token: str) -> bool:
    """Returns True if we have not fired for this token within the cooldown window."""
    last = _last_fired.get(token)
    if last is None:
        return True
    return (time.monotonic() - last) >= COOLDOWN_SECONDS


def _mark_fired(token: str) -> None:
    _last_fired[token] = time.monotonic()


# ── HTTP ─────────────────────────────────────────────────────────────────────

def post_handshake(token: str) -> None:
    """
    POST /api/devices/protected/handshake
    Body:   { "token": "<32hex>" }
    Header: x-handshake-secret: <HANDSHAKE_SECRET>

    Handles gracefully:
      401 - secret wrong (log error, mark fired so a broken secret is not spammed)
      404 - token unknown (log warning, mark fired to avoid spam)
      5xx - backend error (log error, do NOT mark fired so we retry on the next query)
    """
    url = f"{BACKEND_URL}/api/devices/protected/handshake"
    payload = json.dumps({"token": token}).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-handshake-secret": HANDSHAKE_SECRET,
        },
    )

    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            data = json.loads(body) if body else {}
            if data.get("ignored"):
                log.debug("token=%s: backend ignored (revoked/inactive)", token)
            else:
                status_changed = data.get("statusChanged", False)
                status = data.get("status", "?")
                if status_changed:
                    log.info("token=%s: status → %s (changed)", token, status)
                else:
                    log.debug("token=%s: lastDnsQueryAt updated, status=%s", token, status)
            _mark_fired(token)

    except urllib.error.HTTPError as exc:
        body_bytes = exc.read() if exc.fp else b""
        body_str = body_bytes.decode("utf-8", errors="replace")

        if exc.code == 401:
            log.error(
                "token=%s: 401 UNAUTHORIZED: HANDSHAKE_SECRET mismatch. "
                "Check Infisical secret. body=%s",
                token, body_str,
            )
            # Still mark fired: no point spamming a broken secret every second.
            _mark_fired(token)

        elif exc.code == 404:
            log.warning(
                "token=%s: 404 TOKEN_NOT_FOUND: token not in DB yet (pending provisioning?). "
                "Will retry after cooldown. body=%s",
                token, body_str,
            )
            # Mark fired so an unknown token is retried once per cooldown window
            # instead of on every query line.
            _mark_fired(token)

        else:
            # 5xx or other: log but do NOT mark fired so we retry sooner
            log.error(
                "token=%s: HTTP %d from backend. Will retry. body=%s",
                token, exc.code, body_str,
            )

    except (urllib.error.URLError, OSError, json.JSONDecodeError) as exc:
        log.error("token=%s: request failed (%s). Will retry.", token, exc)


# ── QueryLog parsing ─────────────────────────────────────────────────────────
#
# AdGuard Home writes querylog.json as a sequence of newline-delimited JSON
# objects (NDJSON), one per line. Each object looks like:
#
#   {
#     "T":  "2026-05-15T12:34:56.789Z",   // timestamp (ISO8601)
#     "QH": "example.com",                // queried hostname
#     "QT": "A",                          // query type
#     "QC": "IN",                         // query class
#     "CP": "abc123def456",               // ClientID (the dnsToken, if path-based CID)
#     "Result": { ... },
#     "Elapsed": 123456,
#     "IP": "127.0.0.1",
#     ...
#   }
#
# This watcher assumes the "CP" field carries the ClientID that AdGuard
# extracts when it is embedded in the DNS-over-HTTPS URL path:
#   /dns-query/<clientid>
#
# References:
#   https://adguard-dns.io/kb/general/dns-filtering-syntax/
#   AdGuard Home source: querylog/querylog_file.go, dnsforward/client_id.go
#
# NOTE: the field name "CP" is an assumption (targeting AdGuard Home v0.107.x).
# AdGuard Home's log entry struct appears to use "CP" for the client protocol
# (e.g. "doh") and "CID" for the ClientID, so confirm the name on a real log
# line after a test query:
#   docker exec adguardhome tail -f /opt/adguardhome/data/querylog.json
# and doing: curl -s -H "accept: application/dns-json" "https://dns.rebreak.org/dns-query/TESTTOKEN?name=example.com&type=A"

def extract_client_id(line: str) -> Optional[str]:
    """
    Parse one NDJSON line from querylog.json.
    Returns the ClientID string if non-empty, else None.
    """
    line = line.strip()
    if not line:
        return None
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(entry, dict):
        # Valid JSON that is not an object (e.g. a bare list) has no fields.
        return None

    cid = entry.get("CP", "")
    if isinstance(cid, str) and cid.strip():
        return cid.strip()
    return None


# ── File tailing with rotation detection ─────────────────────────────────────

class RotationSafeTailer:
    """
    Tails a file line-by-line. Detects log rotation by monitoring inode.
    On rotation: re-opens the path and reads the new file from offset 0;
    the caller picks up its first lines on the next poll cycle.
    """

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        self._file = None
        self._inode: Optional[int] = None
        self._open()

    def _open(self) -> None:
        if self._file:
            try:
                self._file.close()
            except OSError:
                pass
            self._file = None

        try:
            self._file = open(self.path, "r", encoding="utf-8", errors="replace")
            # fstat the open handle so the inode belongs to the file we read from.
            self._inode = os.fstat(self._file.fileno()).st_ino
            # _open() always starts at offset 0. Skipping old history on the very
            # first open is done separately by _seek_to_end_on_first_open().
            log.info("Opened querylog: %s (inode=%d)", self.path, self._inode)
        except FileNotFoundError:
            log.warning("querylog not found: %s; will retry", self.path)
            self._file = None
            self._inode = None

    def _seek_to_end_on_first_open(self) -> None:
        """Call once after initial _open() to skip historical entries."""
        if self._file:
            self._file.seek(0, 2)  # SEEK_END

    def readline(self) -> Optional[str]:
        """
        Returns the next line or None if no new data.
        Handles rotation transparently.
        """
        if self._file is None:
            self._open()
            return None

        line = self._file.readline()
        if line:
            return line

        # EOF: check for rotation
        try:
            current_inode = self.path.stat().st_ino
        except FileNotFoundError:
            log.warning("querylog disappeared (rotation in progress?); will re-open")
            self._open()
            return None

        if current_inode != self._inode:
            log.info("querylog rotation detected (inode %d -> %d); re-opening", self._inode, current_inode)
            self._open()
            # Don't seek to end on rotation: read from the beginning to catch
            # any entries written right after rotation.

        return None


# ── Main loop ─────────────────────────────────────────────────────────────────

def main() -> None:
    log.info(
        "Starting handshake-watcher | backend=%s | querylog=%s | cooldown=%ds",
        BACKEND_URL, QUERYLOG_PATH, COOLDOWN_SECONDS,
    )

    tailer = RotationSafeTailer(QUERYLOG_PATH)
    tailer._seek_to_end_on_first_open()

    while True:
        line = tailer.readline()

        if line:
            token = extract_client_id(line)
            if token and _cooldown_ok(token):
                log.info("Firing handshake for token=%s", token)
                post_handshake(token)
        else:
            time.sleep(POLL_INTERVAL)


if __name__ == "__main__":
    main()
70
ops/nginx/dns.rebreak.org.conf
Normal file
@ -0,0 +1,70 @@
# nginx vhost: dns.rebreak.org
# Deployed on: rebreak-mdm (178.105.101.137)
# TLS termination for AdGuard Home DoH endpoint.
#
# CRITICAL: location uses prefix-match (^~) NOT exact-match (=).
# AdGuard Home parses the ClientID from the URL path natively:
#   /dns-query           -> normal DoH query (no ClientID)
#   /dns-query/<cid>     -> DoH query, AdGuard extracts <cid> into QueryLog.ClientID
#
# The full path MUST be forwarded to AdGuard unchanged (no $uri stripping).
# AdGuard reads the path segment after /dns-query/ as the ClientID.
# Rewriting it (e.g. proxy_pass http://.../ with a trailing slash, which
# replaces the /dns-query prefix) would break CID detection.
#
# Assumption: AdGuard Home serves both its web UI and plain-HTTP DoH on
# 127.0.0.1:3000. Verify actual DoH port on server:
#   docker exec adguardhome cat /opt/adguardhome/conf/AdGuardHome.yaml | grep -A5 dns:
# Common setups: port 3000 for UI+DoH combined, or a separate port such as 5353/8053.
# Adjust proxy_pass port below to match actual AdGuard DoH port.
#
# If AdGuard runs in docker: verify with `docker ps | grep adguard`.

server {
    listen 80;
    server_name dns.rebreak.org;
    return 301 https://dns.rebreak.org$request_uri;
}

server {
    listen 443 ssl http2;
    server_name dns.rebreak.org;

    ssl_certificate     /etc/letsencrypt/live/dns.rebreak.org/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/dns.rebreak.org/privkey.pem;
    include             /etc/letsencrypt/options-ssl-nginx.conf;
    ssl_dhparam         /etc/letsencrypt/ssl-dhparams.pem;

    # ── DoH endpoint: prefix match, path forwarded unchanged ────────────────
    # ^~ wins over regex locations. Catches both:
    #   /dns-query           (plain DoH)
    #   /dns-query/abc1      (DoH with ClientID, AdGuard parses the suffix)
    #
    # proxy_pass has no URI part (no path, no trailing slash), so the original
    # request URI is forwarded as-is, preserving /dns-query/<cid> verbatim.
    location ^~ /dns-query {
        proxy_pass         http://127.0.0.1:3000;
        proxy_http_version 1.1;

        proxy_set_header Host              $host;
        proxy_set_header X-Real-IP         $remote_addr;
        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # DoH requests are short-lived; tight timeouts are fine.
        proxy_connect_timeout 5s;
        proxy_send_timeout    10s;
        proxy_read_timeout    10s;
    }

    # ── Health check (for monitoring / GH Actions deploy verify) ────────────
    location /health {
        # default_type sets the Content-Type of the inline response body.
        default_type text/plain;
        return 200 "OK\n";
    }

    # ── Block everything else ────────────────────────────────────────────────
    location / {
        return 404;
    }
}