Operations

This page covers day-to-day operation of the production dashboard.

Dashboard Actions

Start:

Validates configured or supplied links.
Optionally kills existing unmanaged stream processes.
Launches the configured stream.command.
Captures stdout and stderr into recent logs.
Records an event and process metadata.
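
The launch and capture steps can be sketched roughly as follows. This is a minimal illustration, not the service's code: launch_stream, the log_lines default, and the reader thread are hypothetical, and it assumes stream.command is a single shell-style string.

```python
import shlex
import subprocess
import threading
from collections import deque
from typing import Deque, Tuple


def launch_stream(command: str, log_lines: int = 500) -> Tuple[subprocess.Popen, Deque[str], threading.Thread]:
    """Start the command in its own session and pump its output into a bounded log."""
    logs: Deque[str] = deque(maxlen=log_lines)  # oldest lines are dropped first
    proc = subprocess.Popen(
        shlex.split(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr into the same pipe
        text=True,
        start_new_session=True,     # own process group, so Stop can signal the whole group
    )

    def pump() -> None:
        # Runs until the child closes its output, i.e. until it exits.
        for line in proc.stdout:
            logs.append(line.rstrip("\n"))

    reader = threading.Thread(target=pump, daemon=True)
    reader.start()
    return proc, logs, reader
```

Starting the child in its own session keeps later signals scoped to the stream and its children rather than the dashboard itself.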

Stop:

Sends SIGTERM to the managed process group.
Sends SIGKILL if the process does not exit within the timeout.
Records the last exit code.
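
The SIGTERM-then-SIGKILL escalation above might look like the sketch below. It assumes a POSIX host and a child that leads its own process group (for example, one started with start_new_session=True); stop_process_group and the 10-second default are hypothetical.

```python
import os
import signal
import subprocess


def stop_process_group(proc: subprocess.Popen, timeout: float = 10.0) -> int:
    """Terminate the child's process group and return its exit code."""
    if proc.poll() is not None:
        return proc.returncode          # already exited, nothing to signal
    pgid = os.getpgid(proc.pid)
    os.killpg(pgid, signal.SIGTERM)     # polite request first
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)  # escalate after the timeout
        proc.wait()
    return proc.returncode
```

A negative return value means the process was killed by that signal number, which is useful context when recording the last exit code.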

Restart:

Stops the current managed process.
Starts a new managed process using current config and links.

Watchdog Behavior

The watchdog runs inside the Starlette service. When stream.auto_recover is true, it checks every few seconds for:

managed process exit
missing ffmpeg child after startup grace
missing playlist after startup grace
stale playlist older than playlist_stale_seconds

The watchdog respects watchdog_restart_cooldown to avoid rapid restart loops. If no links are configured, it skips restart and records a warning.
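
As an illustration of the cooldown and no-links rules, here is a small decision helper. RestartGate and its return strings are hypothetical; only the watchdog_restart_cooldown concept comes from the configuration described above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RestartGate:
    cooldown_seconds: float                        # stands in for watchdog_restart_cooldown
    clock: Callable[[], float] = time.monotonic    # injectable for testing
    last_restart: Optional[float] = None

    def decide(self, unhealthy: bool, has_links: bool) -> str:
        """Return 'ok', 'skip_no_links', 'cooldown', or 'restart'."""
        if not unhealthy:
            return "ok"
        if not has_links:
            return "skip_no_links"    # caller would record a warning here
        now = self.clock()
        if self.last_restart is not None and now - self.last_restart < self.cooldown_seconds:
            return "cooldown"         # too soon since the previous restart
        self.last_restart = now
        return "restart"
```

Using a monotonic clock means wall-clock adjustments on the host cannot shorten or lengthen the cooldown.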

HLS Health

The app reads the configured output directory and checks:

ufc.m3u8 existence
playlist line count
playlist age
target duration
media sequence
segment names
segment count
total segment bytes
last segment size
media sequence movement
bytes and segment deltas between scorer samples
recent stream errors
ffmpeg progress evidence
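
Several of these checks come straight from the playlist text. The sketch below shows one way to extract them. parse_playlist and PlaylistInfo are hypothetical names, and the parser handles only the two standard HLS tags named in the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlaylistInfo:
    line_count: int                  # non-blank lines only
    target_duration: Optional[int]
    media_sequence: Optional[int]
    segments: List[str]


def parse_playlist(text: str) -> PlaylistInfo:
    """Pull basic health fields out of an HLS media playlist."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    target = None
    sequence = None
    segments: List[str] = []
    for ln in lines:
        if ln.startswith("#EXT-X-TARGETDURATION:"):
            target = int(ln.split(":", 1)[1])
        elif ln.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            sequence = int(ln.split(":", 1)[1])
        elif not ln.startswith("#"):
            segments.append(ln)      # any non-tag line is a segment URI
    return PlaylistInfo(len(lines), target, sequence, segments)
```

Comparing media_sequence across two samples gives the "media sequence movement" signal: a value that stops increasing suggests the encoder has stalled.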

/api/health returns:

200 when a managed process is running, a playlist is ready, and the playlist is fresh.
503 when the stream is stopped or still starting, or when the playlist is missing or stale.
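
The status mapping can be pictured as a small pure function. This is a sketch under assumptions: health_response and the reason strings are hypothetical, and "still starting" is approximated here by a running process that has not yet written a playlist.

```python
from typing import Optional, Tuple


def health_response(
    process_running: bool,
    playlist_age_seconds: Optional[float],   # None when no playlist exists yet
    stale_after_seconds: float,
) -> Tuple[int, str]:
    """Map stream state to an HTTP status code and a short reason."""
    if not process_running:
        return 503, "stopped"
    if playlist_age_seconds is None:
        return 503, "no_playlist"            # starting, or output is missing
    if playlist_age_seconds > stale_after_seconds:
        return 503, "stale"
    return 200, "ok"
```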

The stream health scorer waits at least min_assessment_seconds before confirming failure, then requires confirmed_failure_samples bad samples. This prevents a normal startup window or a quiet segment interval from being treated as a hard failure.
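
A minimal sketch of that two-stage confirmation follows. FailureConfirmer is a hypothetical name, and two details are assumptions rather than documented behavior: bad samples are counted only after the assessment window, and they must be consecutive.

```python
from dataclasses import dataclass


@dataclass
class FailureConfirmer:
    min_assessment_seconds: float
    confirmed_failure_samples: int
    bad_streak: int = 0

    def observe(self, healthy: bool, seconds_since_start: float) -> bool:
        """Feed one sample; return True once failure is confirmed."""
        if seconds_since_start < self.min_assessment_seconds:
            return False             # still inside the startup window
        if healthy:
            self.bad_streak = 0      # a good sample resets the count
            return False
        self.bad_streak += 1
        return self.bad_streak >= self.confirmed_failure_samples
```

With min_assessment_seconds=20 and confirmed_failure_samples=3, a bad sample at 5 seconds is ignored, and failure is confirmed only on the third consecutive bad sample taken after the 20-second mark.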

Logs And Events

The status payload exposes recent events, logs, and errors from in-memory deques. ArangoDB stores durable operational records when enabled.
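
For readers unfamiliar with the pattern, a bounded deque keeps only the newest entries and discards the oldest automatically. The class below is illustrative only; RecentBuffer, its field names, and the default size are hypothetical.

```python
import time
from collections import deque
from typing import Dict, List


class RecentBuffer:
    """Fixed-size in-memory buffer for recent events, logs, or errors."""

    def __init__(self, maxlen: int = 200) -> None:
        self._items: deque = deque(maxlen=maxlen)

    def add(self, message: str, level: str = "info") -> None:
        # Appending to a full deque silently drops the oldest entry.
        self._items.append({"ts": time.time(), "level": level, "message": message})

    def snapshot(self) -> List[Dict]:
        # Copy to a list so the result is safe to serialize.
        return list(self._items)
```

Because these buffers live in process memory, their contents are lost on service restart; only records written to ArangoDB survive.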

Use systemd logs for service-level failures:

journalctl -u obbystreams.service -f

Use the dashboard logs for stream-level failures from the transcoder and ffmpeg.

GPU Telemetry

The dashboard polls /api/nvidia-smi every five seconds. The backend caches NVIDIA SMI collection for the same interval so the UI can stay fresh without spawning unnecessary GPU probes.
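
The backend-side caching can be sketched as a simple time-to-live wrapper around the collector. TTLCache is a hypothetical name, and the real service may cache differently.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Reuse one collected value until it is older than ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        collect: Callable[[], Any],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._collect = collect
        self._clock = clock
        self._stamp: Optional[float] = None
        self._value: Any = None

    def get(self) -> Any:
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._ttl:
            self._value = self._collect()   # only here is the probe actually run
            self._stamp = now
        return self._value
```

With a five-second TTL, several browser tabs polling at once still trigger at most one probe per interval.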

GPU telemetry is best-effort. Hosts without NVIDIA drivers return a structured degraded payload instead of failing the dashboard.
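
One way to implement best-effort collection is to treat every failure as data rather than as an exception. In this sketch the payload shape (ok, reason, gpus) and the collect_gpu name are assumptions; the nvidia-smi query flags are standard.

```python
import shutil
import subprocess
from typing import Any, Dict, List

QUERY = "name,utilization.gpu,memory.used,memory.total"


def collect_gpu(binary: str = "nvidia-smi", timeout: float = 3.0) -> Dict[str, Any]:
    """Return GPU stats, or a degraded payload when the probe cannot run."""
    path = shutil.which(binary)
    if path is None:
        return {"ok": False, "reason": f"{binary} not found", "gpus": []}
    try:
        out = subprocess.run(
            [path, f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, timeout=timeout, check=True,
        ).stdout
    except (subprocess.SubprocessError, OSError) as exc:
        # Covers timeouts, non-zero exits, and exec failures.
        return {"ok": False, "reason": str(exc), "gpus": []}
    gpus: List[Dict[str, str]] = []
    for row in out.strip().splitlines():
        cells = [c.strip() for c in row.split(",")]
        if len(cells) == 4:
            # Values stay as strings because nvidia-smi may report "[N/A]".
            gpus.append(dict(zip(["name", "util", "mem_used", "mem_total"], cells)))
    return {"ok": True, "reason": None, "gpus": gpus}
```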

Deploy Routine

Pull or unpack the new release.
Run npm ci && npm run build if building from source.
Copy files to /opt/obbystreams.
Run uv sync --no-dev --frozen as the service user.
Merge any new example config keys into the live config.
Run sudo systemctl restart obbystreams.service.
Check /api/health, the dashboard, and journalctl.
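
The final health check can be automated with a short polling script, since the endpoint is expected to return 503 while the stream is still starting. The helper names, attempt count, and delay below are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code, or 0 if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code              # urllib raises on 4xx/5xx, including 503
    except (urllib.error.URLError, OSError):
        return 0


def wait_for_health(
    url: str,
    attempts: int = 20,
    delay: float = 3.0,
    fetch: Callable[[str], int] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the endpoint returns 200 or attempts run out."""
    for _ in range(attempts):
        if fetch(url) == 200:
            return True
        sleep(delay)
    return False
```

After a restart, wait_for_health("http://127.0.0.1:8767/api/health") returning False is a cue to check journalctl and the dashboard logs.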

Useful Commands

sudo systemctl status obbystreams.service --no-pager
journalctl -u obbystreams.service --no-pager -n 120
curl -i http://127.0.0.1:8767/api/health
curl -I https://s.obby.ca/
curl -sS http://127.0.0.1:8767/api/nvidia-smi -H 'x-obbystreams-token: TOKEN'
sudo nginx -t
ps aux | rg 'obbystreams|ufc|ffmpeg'
ls -lah /var/www/live.obnoxious.lol/stream

Recovery Notes

If the dashboard is up but /api/health returns 503, inspect the stream links and HLS output.
If nginx returns 502, inspect systemd status and confirm uvicorn is bound to 127.0.0.1:8767.
If the player is blank, check /hls/ufc.m3u8 and browser network requests.
If ArangoDB is offline, the dashboard should still operate, but persistence will be degraded.