Architecture
qast is built around a single pipeline that normalizes any input into a continuous, TV-compatible stream. This page covers the key stages, design decisions, and protocol internals.
Pipeline overview
[Source] → [Resolve] → [Transcode] → [TS Rewrite] → [Mux] → [Buffer] → [HTTP] → [TV]
Stage 1: Source resolution
The source parser (qast/source.py) classifies each input and routes it appropriately:
- URLs are resolved via yt-dlp, which extracts the direct media stream. For YouTube DASH, separate video and audio streams are selected for maximum quality.
- Local files pass through directly.
- Capture sources (screen, window:Title, webcam) use platform-specific ffmpeg input devices.
- Browser sources launch headless Chromium via Playwright and capture the rendered page.
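The routing above can be sketched as a small classifier. This is an illustration only, not the code in qast/source.py: the SourceKind enum, the classify_source function, and the "browser:" prefix used to mark browser sources are all assumptions made for the example.

```python
from enum import Enum
from urllib.parse import urlparse


class SourceKind(Enum):
    URL = "url"
    FILE = "file"
    CAPTURE = "capture"
    BROWSER = "browser"


def classify_source(spec: str) -> SourceKind:
    """Classify a source string (hypothetical rules, for illustration)."""
    # Capture keywords taken from the list above.
    if spec in ("screen", "webcam") or spec.startswith("window:"):
        return SourceKind.CAPTURE
    # Assumption: browser sources carry a "browser:" prefix.
    if spec.startswith("browser:"):
        return SourceKind.BROWSER
    # Anything with an http(s) scheme would go to yt-dlp for resolution.
    if urlparse(spec).scheme in ("http", "https"):
        return SourceKind.URL
    # Everything else is treated as a local file path.
    return SourceKind.FILE
```

Checking capture keywords before URL parsing keeps a spec such as window:Title from being misread as a URL with the scheme "window".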
Stage 2: Transcoding
Every input is transcoded to a universal format via ffmpeg:
| Parameter | Value |
|---|---|
| Video codec | H.264 Main profile |
| Resolution | 1920x1080 @ 30fps |
| Preset | ultrafast |
| Bitrate | 5 Mbps |
| GOP | 60 frames (2 seconds) |
| Audio codec | AAC stereo, 44.1 kHz, 128 kbps |
| Container | MPEG-TS (stdout) |
The always-transcode design trades CPU for compatibility: any TV that decodes H.264 Main profile, which in practice means virtually every modern TV, will play the output.
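As a rough guide, the parameter table maps onto an ffmpeg command line like the one built below. The flags are standard ffmpeg options, but the choice of the libx264 encoder, the plain scale filter, and the build_transcode_args helper are assumptions; the actual command qast runs may differ.

```python
def build_transcode_args(input_url: str) -> list[str]:
    """Build an ffmpeg argument list matching the table above (a sketch)."""
    return [
        "ffmpeg", "-i", input_url,
        # Video: H.264 Main profile, fastest preset (assumes libx264).
        "-c:v", "libx264", "-profile:v", "main", "-preset", "ultrafast",
        # 1920x1080 at 30 fps; a plain scale does not preserve aspect ratio.
        "-vf", "scale=1920:1080", "-r", "30",
        # 5 Mbps target, keyframe every 60 frames (2 seconds at 30 fps).
        "-b:v", "5M", "-g", "60",
        # Audio: AAC stereo, 44.1 kHz, 128 kbps.
        "-c:a", "aac", "-ac", "2", "-ar", "44100", "-b:a", "128k",
        # MPEG-TS written to stdout for the next pipeline stage.
        "-f", "mpegts", "pipe:1",
    ]
```

The list form can be passed directly to subprocess.Popen with stdout=subprocess.PIPE, which avoids shell quoting problems with URLs.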
Stage 3: PTS/DTS rewriting
The TS rewriter (qast/pipeline/tsrewrite.py) processes 188-byte MPEG-TS packets and rewrites PTS, DTS, and PCR timestamps for continuity. This ensures seamless transitions between queue items — the TV sees one unbroken stream even when the underlying source changes.
Timestamps use 90 kHz integer ticks (no floating point) to avoid drift.
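To make the integer arithmetic concrete, here is a minimal sketch of shifting one PTS or DTS field. The 5-byte layout (4-bit prefix, 33-bit value split 3/15/15 with marker bits) follows the MPEG-2 Systems PES header; the function names are invented for this example, and the real rewriter in qast/pipeline/tsrewrite.py also has to locate these fields in each packet and handle the PCR.

```python
TICKS_PER_SECOND = 90_000
TS_MODULO = 1 << 33  # PTS/DTS are 33-bit counters and wrap around


def decode_ts(field: bytes) -> int:
    """Extract the 33-bit timestamp from a 5-byte PTS/DTS field."""
    return (
        (((field[0] >> 1) & 0x07) << 30)
        | (field[1] << 22)
        | ((field[2] >> 1) << 15)
        | (field[3] << 7)
        | (field[4] >> 1)
    )


def encode_ts(prefix: int, ts: int) -> bytes:
    """Pack a 33-bit timestamp with its 4-bit prefix and marker bits."""
    return bytes([
        (prefix << 4) | (((ts >> 30) & 0x07) << 1) | 1,
        (ts >> 22) & 0xFF,
        (((ts >> 15) & 0x7F) << 1) | 1,
        (ts >> 7) & 0xFF,
        ((ts & 0x7F) << 1) | 1,
    ])


def shift_ts(field: bytes, offset_ticks: int) -> bytes:
    """Add an integer tick offset, keeping the prefix and wrapping at 2**33."""
    prefix = field[0] >> 4
    return encode_ts(prefix, (decode_ts(field) + offset_ticks) % TS_MODULO)
```

At 30 fps one frame lasts exactly 90000 // 30 = 3000 ticks, so offsets accumulated across queue items stay exact where a float number of seconds would pick up rounding error.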
Stage 4: Container muxing
- DLNA & Roku: Raw MPEG-TS is served directly.
- Chromecast: A long-lived ffmpeg process remuxes MPEG-TS into fragmented MP4 (-c copy, no re-encoding).
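A remux of this kind can be expressed with standard ffmpeg options, as sketched below. The specific movflags combination and the build_remux_args helper are assumptions about how one might do it, not a record of the flags qast passes.

```python
def build_remux_args() -> list[str]:
    """ffmpeg arguments for MPEG-TS in, fragmented MP4 out (a sketch)."""
    return [
        "ffmpeg",
        "-f", "mpegts", "-i", "pipe:0",   # read the TS stream from stdin
        "-c", "copy",                     # copy streams, no re-encoding
        # Fragment at keyframes and write an initial moov atom up front,
        # so the output can be streamed without seeking back.
        "-movflags", "frag_keyframe+empty_moov+default_base_moof",
        "-f", "mp4", "pipe:1",            # write fMP4 to stdout
    ]
```

Older ffmpeg builds may additionally need -bsf:a aac_adtstoasc when copying AAC from MPEG-TS into MP4; recent builds insert that filter automatically.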
Stage 5: Ring buffer
The ring buffer (qast/pipeline/ringbuf.py) is an in-memory circular buffer with time-based flow control:
- Capacity: 64 MB max, 4 MB min (adjusted for capture sources).
- Frame counting: Video frames are parsed from MPEG-TS PES headers (stream IDs 0xE0–0xEF) to compute content-time independently of bitrate.
- Throttling: When content-time leads wall-time by more than 10 seconds, the encoder is throttled to prevent unbounded memory growth.
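The frame counting and throttling rules above can be sketched as follows. This is a simplified model, not the code in qast/pipeline/ringbuf.py: it assumes one video PES packet per frame and the fixed 30 fps from the transcode stage, and the names starts_video_pes and FlowControl are invented for the example.

```python
import time

FPS = 30                  # fixed output frame rate of the transcode stage
MAX_LEAD_SECONDS = 10.0   # throttle threshold described above


def starts_video_pes(packet: bytes) -> bool:
    """True if this 188-byte TS packet begins a video PES packet."""
    if len(packet) != 188 or packet[0] != 0x47:   # 0x47 is the sync byte
        return False
    if not packet[1] & 0x40:                      # payload_unit_start_indicator
        return False
    adaptation = (packet[3] >> 4) & 0x03
    if adaptation in (0, 2):                      # no payload in this packet
        return False
    offset = 4
    if adaptation == 3:                           # skip the adaptation field
        offset += 1 + packet[4]
    head = packet[offset:offset + 4]
    return (len(head) == 4 and head[:3] == b"\x00\x00\x01"
            and 0xE0 <= head[3] <= 0xEF)


class FlowControl:
    """Compare content-time (from frame count) with wall-time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self.frames = 0

    def feed(self, packet: bytes) -> None:
        if starts_video_pes(packet):
            self.frames += 1

    def lead_seconds(self) -> float:
        return self.frames / FPS - (self._clock() - self._start)

    def should_throttle(self) -> bool:
        return self.lead_seconds() > MAX_LEAD_SECONDS
```

Passing the clock in as a parameter makes the throttle decision easy to test without sleeping.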
Stage 6: HTTP server
A ThreadingHTTPServer streams the buffer contents to the TV over HTTP. It handles:
- Range requests and fake Content-Length (100 GB) for DLNA compatibility.
- Disconnect detection via BrokenPipeError.
- Concurrent connections for multi-device casting.
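A stripped-down handler along these lines might look like the sketch below. It uses only the standard library; the StreamHandler class, the stream_headers helper, the chunks attribute standing in for the ring buffer, and the reading of "100 GB" as 100 × 10^9 bytes are all assumptions, and Range handling is left out for brevity.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

FAKE_CONTENT_LENGTH = 100 * 10**9  # assumption: 100 GB counted in decimal bytes


def stream_headers() -> list[tuple[str, str]]:
    """Headers for the endless stream, including the fake length."""
    return [
        ("Content-Type", "video/mp2t"),
        ("Content-Length", str(FAKE_CONTENT_LENGTH)),
        ("Accept-Ranges", "bytes"),
    ]


class StreamHandler(BaseHTTPRequestHandler):
    # Hypothetical stand-in for an iterator over ring buffer contents.
    chunks: tuple = ()

    def do_GET(self):
        self.send_response(200)
        for name, value in stream_headers():
            self.send_header(name, value)
        self.end_headers()
        try:
            for chunk in self.chunks:
                self.wfile.write(chunk)
        except (BrokenPipeError, ConnectionResetError):
            pass  # the TV closed the connection; stop streaming to it


def serve(port: int = 8000) -> None:
    # One thread per connection, so several devices can stream at once.
    ThreadingHTTPServer(("", port), StreamHandler).serve_forever()
```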
Device discovery
Discovery runs in parallel across all protocols:
| Protocol | Method | Details |
|---|---|---|
| DLNA | SSDP | UDP multicast to 239.255.255.250:1900 |
| Roku | SSDP | Same multicast, filtered by Roku service type |
| Chromecast | mDNS/DNS-SD | Multicast to 224.0.0.251:5353 via pychromecast |
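For the two SSDP rows, discovery amounts to sending an M-SEARCH datagram to the multicast address and collecting unicast replies. The sketch below uses only the standard library; the helper names are invented, and the search targets qast actually sends are an assumption (urn:schemas-upnp-org:service:AVTransport:1 is a common choice for DLNA renderers, and roku:ecp is the Roku service type).

```python
import socket

SSDP_ADDR = ("239.255.255.250", 1900)


def build_msearch(search_target: str, mx: int = 2) -> bytes:
    """Build an SSDP M-SEARCH request for the given search target."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR[0]}:{SSDP_ADDR[1]}",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",              # devices reply within mx seconds
        f"ST: {search_target}",
        "", "",                   # blank line terminates the request
    ]
    return "\r\n".join(lines).encode("ascii")


def discover(search_target: str, timeout: float = 3.0) -> list[str]:
    """Send one M-SEARCH and return raw responses received before timeout."""
    responses = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_msearch(search_target), SSDP_ADDR)
        try:
            while True:
                data, _addr = sock.recvfrom(65507)
                responses.append(data.decode("utf-8", errors="replace"))
        except socket.timeout:
            pass  # no more replies within the timeout window
    return responses
```

Each response carries a LOCATION header pointing at the device description XML, which is where the control URLs used for casting come from.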
Results are deduplicated and sorted alphabetically. Adaptive timeout learning (qast/tuning.py) tracks consecutive discovery failures and extends timeouts dynamically.
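One simple way to extend timeouts after repeated failures is a linear back-off with a cap, sketched below. The policy, the default values, and the AdaptiveTimeout class are hypothetical; qast/tuning.py may use a different rule.

```python
class AdaptiveTimeout:
    """Grow the discovery timeout after consecutive empty scans."""

    def __init__(self, base: float = 3.0, step: float = 2.0,
                 maximum: float = 15.0):
        self.base = base
        self.step = step
        self.maximum = maximum
        self.failures = 0  # consecutive scans that found no devices

    @property
    def current(self) -> float:
        # Each consecutive failure adds one step, up to the cap.
        return min(self.base + self.step * self.failures, self.maximum)

    def record(self, found_devices: bool) -> None:
        # A successful scan resets the counter; a failure extends the timeout.
        self.failures = 0 if found_devices else self.failures + 1
```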
Protocol casting
DLNA
SOAP/UPnP: SetAVTransportURI followed by Play. Polling monitors playback state.
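The two actions are ordinary SOAP calls against the renderer's AVTransport control URL. The sketch below builds the request headers and body using the argument names from the UPnP AVTransport:1 service; the soap_request helper is invented for illustration, and the control URL itself (not shown) comes from the device description.

```python
from xml.sax.saxutils import escape

AVT = "urn:schemas-upnp-org:service:AVTransport:1"


def soap_request(action: str,
                 arguments: dict[str, str]) -> tuple[dict[str, str], bytes]:
    """Return (headers, body) for one AVTransport SOAP action."""
    # Escape values so URLs containing '&' stay valid XML.
    args = "".join(f"<{k}>{escape(v)}</{k}>" for k, v in arguments.items())
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        f'<s:Body><u:{action} xmlns:u="{AVT}">{args}</u:{action}></s:Body>'
        "</s:Envelope>"
    )
    headers = {
        "Content-Type": 'text/xml; charset="utf-8"',
        "SOAPACTION": f'"{AVT}#{action}"',
    }
    return headers, body.encode("utf-8")


def cast_requests(stream_url: str) -> list[tuple[dict[str, str], bytes]]:
    """The two requests, in order: set the URI, then start playback."""
    return [
        soap_request("SetAVTransportURI", {
            "InstanceID": "0",
            "CurrentURI": stream_url,
            "CurrentURIMetaData": "",
        }),
        soap_request("Play", {"InstanceID": "0", "Speed": "1"}),
    ]
```

Each pair can be sent with an HTTP POST to the control URL; playback state can then be polled with the GetTransportInfo action built the same way.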
Chromecast
Protobuf over TLS via pychromecast. The media controller launches the Default Media Receiver, which loads the HTTP stream URL.
Roku
HTTP POST to the ECP endpoint. Requires the free Media Assistant channel (ID 782875) to be installed on the Roku device.
YouTube DASH handling
By default, qast selects separate video and audio DASH streams for higher quality. The audio stream is downloaded to a temp file (~1–2 MB) to work around an ffmpeg HTTP multi-input bug. If the audio download fails, qast automatically falls back to a muxed stream.
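The fallback logic can be sketched as below. Everything here is illustrative: choose_inputs and its download callback are hypothetical names, and the URLs are assumed to have been resolved already (for example by yt-dlp).

```python
import os
import tempfile
from typing import Callable


def choose_inputs(video_url: str, audio_url: str, muxed_url: str,
                  download: Callable[[str, str], None]) -> list[str]:
    """Return ffmpeg inputs: [video, local audio file], or [muxed] on failure."""
    fd, audio_path = tempfile.mkstemp(suffix=".audio")
    os.close(fd)
    try:
        # Fetch the audio stream to a local file so ffmpeg reads only
        # one input over HTTP.
        download(audio_url, audio_path)
    except OSError:
        # Download failed: clean up and fall back to the muxed stream.
        os.unlink(audio_path)
        return [muxed_url]
    return [video_url, audio_path]
```

Catching OSError covers the common network failures in the standard library, since urllib.error.URLError is a subclass of it; the caller remains responsible for deleting the temp file after playback.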
Design decisions
- Always transcode: Guarantees compatibility. The alternative — format detection and selective transcoding — is fragile and produces inconsistent results across TVs.
- Frame-based timing: Counting video frames is more accurate than estimating from bitrate. This enables precise placeholder durations, readiness detection, and elapsed-time display.
- Single continuous stream: PTS rewriting makes the TV see one unbroken stream. This avoids reconnection delays and buffering spinners between queue items.
- In-memory buffer: No temp files. Everything flows through memory with explicit back-pressure.
Source reference
- architecture.md — deep technical dive (20 KB)
- tv-protocols.md — protocol documentation
- qast-screen-capture-spec.md — capture specification
Created by Rich LeGrand · MIT License