Siphon: Building a Zero-Trust Edge Gateway for Home Assistant
Overview
The traditional "secure" answer is to use Tailscale or WireGuard. That works great for you, the admin, but you can't install a Tailscale client on a closed-ecosystem third-party API.
We need a DMZ. We need a hardened, isolated "virtual device" exposed to the internet that can catch external webhooks, ruthlessly validate them, strip out the garbage, and push only the sanitized data into Home Assistant.
That will be Siphon. Currently it is a lightweight Go daemon that acts as a simple pipeline for sensor data, but I plan to extend it into a public-facing gateway that completely airgaps Home Assistant from the public web.
In the Lab: The Airgapped Ingress Architecture
Siphon is currently undergoing a massive v2 architectural rewrite. I'm building a linear, declarative pipeline based on a Directed Acyclic Graph (DAG) that is designed specifically to act as a secure ingress point.
Instead of opening port 8123 and letting the world poke at your Home Assistant UI, you expose only Siphon's webhook collector, for example through a Cloudflare Tunnel, Tailscale Funnel, or a strict Nginx reverse proxy. Exposing a single webhook endpoint keeps the attack surface small.
Here is what the internal data flow looks like for catching a public webhook and safely routing it to the local network:
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#1a1a1a', 'edgeLabelBackground':'#1a1a1a', 'tertiaryColor': '#94a3b8', 'lineColor': '#94a3b8', 'textColor': '#94a3b8'}}}%%
graph LR
classDef public fill:#2a1215,stroke:#7f1d1d,stroke-width:1px,color:#ef4444;
classDef collector fill:#0f172a,stroke:#2563eb,stroke-width:1px,color:#60a5fa;
classDef processor fill:#281809,stroke:#b45309,stroke-width:1px,color:#fbbf24;
classDef sink fill:#064e3b,stroke:#059669,stroke-width:1px,color:#34d399;
classDef internal fill:#1a1a1a,stroke:#334155,stroke-dasharray: 5 5,color:#94a3b8;
classDef secure fill:#1a1a1a,stroke:#059669,stroke-width:1px,color:#94a3b8;
subgraph Zone1 ["🌐 INTERNET"]
direction LR
PublicNet[["untrusted<br/>webhooks"]]:::public
end
subgraph Zone2 ["⚙️ DMZ (SIPHON DAEMON)"]
direction LR
Collect1("🏰<br/>webhook<br/>collector"):::collector
BusVolatile[("⚡<br/>event<br/>bus")]:::internal
Proc("🔪<br/>processors<br/>('expr' extract<br/>and filter)"):::processor
Sink1("🌉<br/>MQTT<br/>sink"):::sink
Collect1 --> BusVolatile
BusVolatile --> Proc
Proc -->|valid<br/>payload| Sink1
end
subgraph Zone3 ["🔒 SECURE LOCAL NETWORK (No Ingress)"]
direction LR
Broker[("📡<br/>local MQTT<br/>broker")]:::secure -->|Auto-Discovery| HA[("🏠<br/>Home<br/>Assistant")]:::secure
end
PublicNet -->|https| Collect1
Sink1 -->|MQTT<br/>over TLS| Broker
When this pipeline is running, data flows left-to-right through an isolated goroutine:
- The Bastion (Webhook Collector): Siphon exposes a simple HTTP listener. It requires a hardcoded Bearer token. If the token is missing or wrong, the connection is instantly dropped. No database lookups, no complex authentication flows to exploit.
- The Knife (Processors): This is where Siphon protects your internal state. Using the embedded expr engine, you explicitly define which JSON keys are allowed to pass through. If an external service tries to inject a massive 5MB payload or malformed arrays, the filter processor drops it on the floor.
- The Bridge (MQTT Sink): Once the data is verified, sanitized, and reformatted, Siphon pushes it securely to your local MQTT broker. It will handle the auto-discovery mechanism and expose itself as a "virtual" device to HA, which can be added to any automations.
Home Assistant simply subscribes to the local MQTT broker. It never talks to the internet, and the internet never talks to it. Siphon absorbs all the risk at the edge.
The Roadmap
While the core memory-bus is actively being re-architected, I'm laying the groundwork for features required to make this a piece of true infrastructure:
1. Write-Ahead Log (WAL) for Zero Data Loss: If you're using Siphon to catch important external webhooks (like GPS location updates), you can't afford to lose them if your internal MQTT broker restarts for an update. We need a durable persistence engine. If the internal network drops, Siphon spools the incoming public webhooks to disk and fires them off when the local broker comes back online.
2. The HA Add-on (The Escape Hatch): Running this in an isolated Docker container in a DMZ VLAN is the purist way, but I want it accessible. The plan is to package Siphon as a Home Assistant Add-on. The goal: manage your public gateway rules from the HA sidebar without SSH.
3. WebAssembly (Wasm) Plugins: The expr engine handles most JSON sanitation, but sometimes you need real logic to decode/decrypt proprietary binary payloads before they hit MQTT. You will be able to write custom decoders in Rust or TinyGo, compile them to .wasm, and hot-load them into the gateway pipeline.
The Code
Siphon is open-source and very much a work in progress. If you want to lock down your smart home but still need a way to pipe external data into your ecosystem, check out the repository and feel free to fork and hopefully open a PR!
mekops-labs/siphon
Project page is available here