feat: migrate config system from dotenv to koanf/YAML#102

Open
rgarcia wants to merge 18 commits into main from koanf-yaml-config

rgarcia commented Feb 15, 2026

Summary

Migrate server configuration from godotenv/.env files to koanf with nested YAML config files, enabling a zero-config local installation experience.

Companion PR

What changed

Config system (cmd/api/config/config.go):

  • Replaced godotenv with koanf for config loading
  • Config struct now uses nested sub-structs (e.g. CaddyConfig, NetworkConfig, ACMEConfig, OtelConfig, etc.) — config keys are organized hierarchically in both the Go struct and YAML files
  • Config precedence: env vars > YAML file > built-in defaults
  • Searches platform-specific paths (~/.config/hypeman/config.yaml on macOS, /etc/hypeman/config.yaml on Linux)
  • Explicit path via CONFIG_PATH env var; errors if explicitly specified path is missing
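The lookup rules in the list above can be sketched in plain Go. This is an illustrative, stdlib-only sketch and not the code in cmd/api/config/config.go: the function name resolveConfigPath, its signature, and the injected exists callback are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveConfigPath picks the YAML file to load (hypothetical helper).
// explicit is the value of CONFIG_PATH ("" when unset), candidates are the
// platform default paths, and exists reports whether a file is present.
// An empty result with a nil error means "no file found, use built-in defaults".
func resolveConfigPath(explicit string, candidates []string, exists func(string) bool) (string, error) {
	if explicit != "" {
		// An explicitly requested file must exist; do not fall back silently.
		if !exists(explicit) {
			return "", errors.New("config file not found: " + explicit)
		}
		return explicit, nil
	}
	for _, p := range candidates {
		if exists(p) {
			return p, nil
		}
	}
	return "", nil
}

func main() {
	onDisk := map[string]bool{"/etc/hypeman/config.yaml": true}
	exists := func(p string) bool { return onDisk[p] }
	defaults := []string{"/etc/hypeman/config.yaml"}

	p, err := resolveConfigPath("", defaults, exists)
	fmt.Println("selected:", p, "err:", err)

	_, err = resolveConfigPath("/custom/path.yaml", defaults, exists)
	fmt.Println("explicit but missing:", err)
}
```

Returning an empty path with no error lets the caller continue with built-in defaults, while an explicit CONFIG_PATH that does not exist is reported rather than ignored.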

Environment variable convention:

  • Top-level keys: PORT, DATA_DIR, JWT_SECRET, ENV
  • Nested keys use __ (double underscore) as separator: CADDY__LISTEN_ADDRESS, NETWORK__BRIDGE_NAME, OTEL__ENABLED, etc.
  • Breaking change: old flat env var names (e.g. CADDY_LISTEN_ADDRESS, BRIDGE_NAME) no longer work — use the __ convention or switch to config.yaml
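As a rough illustration of the naming convention above, the mapping from an environment variable name to a nested config key amounts to replacing `__` with `.` and lowercasing. The helper name envKeyToPath is hypothetical; in the PR the mapping is handled through koanf's delimiter support rather than a function like this.

```go
package main

import (
	"fmt"
	"strings"
)

// envKeyToPath converts an env var name to a dotted config key
// (hypothetical helper): "__" separates nesting levels, a single "_"
// stays part of the key name.
func envKeyToPath(name string) string {
	return strings.ToLower(strings.ReplaceAll(name, "__", "."))
}

func main() {
	for _, name := range []string{
		"PORT",
		"JWT_SECRET",
		"CADDY__LISTEN_ADDRESS",
		"CADDY_LISTEN_ADDRESS", // old flat name
	} {
		fmt.Printf("%s -> %s\n", name, envKeyToPath(name))
	}
}
```

Under this rule the old flat name CADDY_LISTEN_ADDRESS maps to the top-level key caddy_listen_address, which is not the nested key caddy.listen_address. That is why the flat names stop working.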

Token tool (cmd/gen-jwt/main.go):

  • Reads jwt_secret directly from server config.yaml (no wrapper scripts)
  • Respects CONFIG_PATH env var, same as the server
  • Added -duration flag for configurable token expiry
  • JWT_SECRET env var still works as highest-precedence override
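A minimal sketch of how the token tool's inputs might be resolved, assuming the -duration flag is parsed as a Go time.Duration and assuming a 24h default (neither detail is stated in this PR description). The type tokenOptions and the function resolveTokenOptions are hypothetical names.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"time"
)

// tokenOptions holds the resolved inputs for signing a token (hypothetical type).
type tokenOptions struct {
	Secret   string
	Duration time.Duration
}

// resolveTokenOptions parses -duration and applies the secret precedence:
// the JWT_SECRET env value wins over jwt_secret read from config.yaml.
func resolveTokenOptions(args []string, envSecret, fileSecret string) (tokenOptions, error) {
	fs := flag.NewFlagSet("hypeman-token", flag.ContinueOnError)
	// The 24h default is an assumption for this sketch.
	duration := fs.Duration("duration", 24*time.Hour, "token validity period")
	if err := fs.Parse(args); err != nil {
		return tokenOptions{}, err
	}

	secret := fileSecret
	if envSecret != "" {
		secret = envSecret // highest-precedence override
	}
	if secret == "" {
		return tokenOptions{}, errors.New("jwt_secret not set in config.yaml or JWT_SECRET")
	}
	return tokenOptions{Secret: secret, Duration: *duration}, nil
}

func main() {
	opts, err := resolveTokenOptions([]string{"-duration", "90m"}, "", "secret-from-config")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(opts.Secret, opts.Duration)
}
```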

Install script (scripts/install.sh):

  • Generates nested config.yaml instead of dotenv-style config files
  • Generates ~/.config/hypeman/cli.yaml with pre-authenticated long-lived token
  • Removed all wrapper scripts for both hypeman CLI and hypeman-token
  • Updated launchd/systemd service configs to use CONFIG_PATH
  • Robust jwt_secret parsing from existing configs (handles whitespace, quotes)
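The install script does this parsing with shell pipelines; the normalization rules it has to cover (leading and trailing whitespace, single or double quotes, more than one matching line) can be expressed in Go for illustration. The function extractScalar is a hypothetical name, and taking the first match is an assumption about how multiple matches are resolved.

```go
package main

import (
	"fmt"
	"strings"
)

// extractScalar returns the value of the first "key: value" line in yamlText
// (hypothetical helper). It tolerates indentation, trailing whitespace, and
// surrounding single or double quotes. It is a line scanner, not a YAML parser.
func extractScalar(yamlText, key string) (string, bool) {
	prefix := key + ":"
	for _, line := range strings.Split(yamlText, "\n") {
		trimmed := strings.TrimSpace(line)
		if !strings.HasPrefix(trimmed, prefix) {
			continue // also skips commented-out lines such as "# jwt_secret: ..."
		}
		value := strings.TrimSpace(strings.TrimPrefix(trimmed, prefix))
		value = strings.Trim(value, `"'`) // strip quote characters at both ends
		return value, true                // first match wins (assumption)
	}
	return "", false
}

func main() {
	cfg := "port: 8080\n  jwt_secret:   \"abc123\"  \njwt_secret: other\n"
	secret, ok := extractScalar(cfg, "jwt_secret")
	fmt.Println(secret, ok)
}
```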

Example files:

  • Added config.example.yaml (Linux), config.example.darwin.yaml (macOS), cli.example.yaml
  • Removed .env.example and .env.darwin.example

Codebase-wide refactor:

  • Updated all config field access across lib/providers, lib/network, lib/resources, cmd/api/main.go, and all test files to use nested paths (e.g. cfg.Caddy.ListenAddress, cfg.Network.BridgeName)
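For orientation, the nested layout and the resulting call-site pattern might look roughly like the sketch below. Only the names Config, CaddyConfig, NetworkConfig, defaultConfig, ListenAddress, BridgeName, and the use of koanf struct tags come from this PR; the exact tag strings, the field set, the default values, and the describe helper are placeholders for illustration.

```go
package main

import "fmt"

// CaddyConfig groups ingress settings under the "caddy" key.
type CaddyConfig struct {
	ListenAddress string `koanf:"listen_address"`
}

// NetworkConfig groups network settings under the "network" key.
type NetworkConfig struct {
	BridgeName string `koanf:"bridge_name"`
}

// Config mirrors the YAML hierarchy: top-level keys plus nested sections.
type Config struct {
	Port    int           `koanf:"port"`
	Caddy   CaddyConfig   `koanf:"caddy"`
	Network NetworkConfig `koanf:"network"`
}

// defaultConfig returns built-in defaults (placeholder values for this sketch).
func defaultConfig() *Config {
	return &Config{
		Port:    8080,
		Caddy:   CaddyConfig{ListenAddress: "0.0.0.0"},
		Network: NetworkConfig{BridgeName: "example-br"},
	}
}

// describe shows the nested access pattern used at call sites after the refactor.
func describe(cfg *Config) string {
	return fmt.Sprintf("caddy=%s bridge=%s port=%d",
		cfg.Caddy.ListenAddress, cfg.Network.BridgeName, cfg.Port)
}

func main() {
	cfg := defaultConfig()
	cfg.Caddy.ListenAddress = "127.0.0.1" // e.g. after a YAML or env override
	fmt.Println(describe(cfg))
}
```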

Test plan

  • make dev starts successfully reading from config.yaml
  • hypeman-token generates tokens without JWT_SECRET env var (reads from config.yaml)
  • Nested env vars work: CADDY__LISTEN_ADDRESS=127.0.0.1 overrides caddy.listen_address
  • Top-level env vars still work: PORT=9090 overrides port: 8080
  • CONFIG_PATH=/custom/path.yaml works; missing path returns error
  • scripts/install.sh generates valid nested config.yaml and cli.yaml
  • e2e install test passes

Note

High Risk
High risk because it replaces the server’s configuration loading/precedence and refactors many call sites (network/ingress/resources/build limits), which can break deployments if keys/paths or defaults don’t match prior env-based behavior.

Overview
Migrates Hypeman server configuration from dotenv-style env files to nested YAML configs loaded via koanf. Config now comes from built-in defaults + optional config.yaml (searched via CONFIG_PATH or standard paths) with env vars overriding via __ nesting, and the config struct is reorganized into nested sections (e.g. network, caddy, acme, limits, oversubscription, capacity).

Updates runtime wiring and core subsystems to use the new nested config fields (OTel init, ingress/Caddy/ACME, build/registry settings, resource limits/oversubscription/capacity, and network manager). Installation and E2E flows are updated to generate/install config.yaml, pass CONFIG_PATH to launchd/systemd, drop wrapper scripts in favor of symlinks, generate a ~/.config/hypeman/cli.yaml with a long-lived token, and enhance hypeman-token to read jwt_secret from YAML plus a configurable -duration.

Written by Cursor Bugbot for commit 673f652. This will update automatically on new commits.

Replace godotenv-based .env config loading with koanf and YAML config
files across the entire stack:

Server (hypeman-api):
- Config struct now uses koanf tags for YAML unmarshaling
- Loads config.yaml from platform-specific paths with env var overrides
- CONFIG_PATH env var for explicit config file location

Token tool (hypeman-token):
- Reads jwt_secret directly from config.yaml (no more wrapper scripts)
- Added -duration flag for configurable token expiry

Install script:
- Generates config.yaml instead of dotenv-style config
- Generates ~/.config/hypeman/cli.yaml with pre-authenticated token
- Removed all wrapper scripts, replaced with symlinks on Linux
- Updated launchd/systemd service definitions

Also adds config.example.yaml, config.darwin.example.yaml, and
cli.example.yaml as reference templates.

- Pass JWT_SECRET explicitly to hypeman-token in install.sh (fixes
  Linux installs where config.yaml is root-only)
- Return error from config.Load() when explicit CONFIG_PATH fails
  instead of silently falling back to defaults
- Deduplicate config paths: gen-jwt now imports and uses
  config.GetDefaultConfigPaths() instead of maintaining its own copy

Use env.ProviderWithValue instead of env.Provider so that empty
environment variables (e.g. PORT="") don't override valid defaults
or YAML config values. This preserves the old getEnv() behavior.
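The effect of this change can be illustrated without koanf. The sketch below applies environment overrides to a flat key/value view of the config and skips empty values, so PORT="" leaves the existing setting alone. The function applyEnv is a hypothetical name, and restricting overrides to keys already present is an assumption made to keep unrelated variables such as HOME out of the result.

```go
package main

import (
	"fmt"
	"strings"
)

// applyEnv overlays environment variables onto cfg (hypothetical helper).
// cfg maps dotted config keys to values that already reflect defaults and YAML.
// Empty variables are ignored so they cannot clobber a valid value.
func applyEnv(cfg map[string]string, environ []string) {
	for _, kv := range environ {
		name, value, ok := strings.Cut(kv, "=")
		if !ok || value == "" {
			continue // unset-like values never override
		}
		key := strings.ToLower(strings.ReplaceAll(name, "__", "."))
		if _, known := cfg[key]; known {
			cfg[key] = value
		}
	}
}

func main() {
	cfg := map[string]string{
		"port":                 "8080",
		"caddy.listen_address": "0.0.0.0",
	}
	applyEnv(cfg, []string{"PORT=", "CADDY__LISTEN_ADDRESS=127.0.0.1", "HOME=/root"})
	fmt.Println(cfg["port"], cfg["caddy.listen_address"])
}
```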

- hypeman-token now checks CONFIG_PATH env var before default paths,
  matching hypeman-api behavior for custom config locations
- install.sh uses $SUDO when reading jwt_secret from existing config
  file on Linux reinstalls (file is 640 root:root)

…orkflow

- Make jwt_secret and port grep/sed pipelines handle leading whitespace,
  single/double quotes, trailing whitespace, and multiple matches
- Update make gen-jwt to auto-detect local config.yaml via CONFIG_PATH,
  restoring the dev workflow that previously relied on godotenv/.env

Addresses bugbot review comments on #102.

- Replace explicit envKeyMap with koanf's __ delimiter for auto-mapping
  env vars to nested config paths (e.g. CADDY__LISTEN_ADDRESS -> caddy.listen_address)
- Rename config.darwin.example.yaml -> config.example.darwin.yaml for
  consistent naming
- Remove .env.example and .env.darwin.example references
- Update DEVELOPMENT.md to document __ convention and YAML-first config
- Update README.md configuration table

BREAKING: Old flat env var names (CADDY_LISTEN_ADDRESS, BRIDGE_NAME, etc.)
no longer work. Use double-underscore for nested keys (CADDY__LISTEN_ADDRESS,
NETWORK__BRIDGE_NAME) or configure via config.yaml instead.

README and DEVELOPMENT.md now document configuration using YAML key
names (dot notation for nested keys). Env var override convention
mentioned once as a footnote rather than being the primary reference.

The macOS BSD sed `a\` (append) command was inserting the docker_socket
line on the same line as builder_image, producing invalid YAML. Fix by:
1. Including docker_socket in example config templates so the simpler
   sed s| replacement path is used instead of sed a\.
2. Fixing the fallback sed a\ to use BSD-compatible s| with literal
   newline.
Also updates builder_image default to "none" in example files.

The released CLI (v0.11.0) doesn't support cli.yaml yet, so the e2e
test needs to export HYPEMAN_BASE_URL and HYPEMAN_API_KEY env vars.
Once the CLI release with koanf/yaml support ships, cli.yaml will
handle this and the env vars become redundant.

Add CLI_BRANCH env var to install.sh that clones and builds the CLI
from the specified branch of kernel/hypeman-cli instead of downloading
a release binary. Useful for testing unreleased CLI features.

The e2e test now passes CLI_BRANCH through to install.sh. Temporarily
set to koanf-yaml-config in CI so the e2e test uses the CLI with
cli.yaml config file support.

Missed gpu_module_test.go, gpu_e2e_test.go, and gpu_inference_test.go
during the config nesting migration. Uses config.NetworkConfig for
BridgeName, SubnetCIDR, and DNSServer fields.
 run: brew list caddy &>/dev/null || brew install caddy
 - name: Run E2E install test
-run: bash scripts/e2e-install-test.sh
+run: CLI_BRANCH=koanf-yaml-config bash scripts/e2e-install-test.sh

CI hardcodes temporary CLI branch name

High Severity

The e2e install test hardcodes CLI_BRANCH=koanf-yaml-config, referencing a development branch in the companion CLI repository. Once this PR and the companion CLI PR are merged and the koanf-yaml-config branch is deleted, the git clone --branch in install.sh will fail, breaking the CI e2e install test on every subsequent run.


 run: brew list caddy &>/dev/null || brew install caddy
 - name: Run E2E install test
-run: bash scripts/e2e-install-test.sh
+run: CLI_BRANCH=koanf-yaml-config bash scripts/e2e-install-test.sh
rgarcia commented:

todo: revert after CLI PR is merged / deployed

The inline heredoc configs duplicated defaults defined in the example
YAML files. Treat a failed config template download as a hard error
instead of silently generating a potentially stale config.

Without jwt_secret the server can't authenticate API requests, so
silently skipping it leaves the install in a broken state.
@rgarcia rgarcia requested a review from sjmiller609 February 15, 2026 23:53
cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is ON. A Cloud Agent has been kicked off to fix the reported issue.

// Configure secret provider (use NoOpSecretProvider as fallback to avoid nil panics)
var secretProvider builds.SecretProvider
if cfg.BuildSecretsDir != "" {
secretProvider = builds.NewFileSecretProvider(cfg.BuildSecretsDir)

Dead fallback code due to changed defaults

Low Severity

The ProvideBuildManager fallbacks for BuilderImage == "", MaxConcurrentBuilds == 0, and DefaultTimeout == 0 are now unreachable dead code. The new defaultConfig() sets BuilderImage: "none", MaxConcurrentSourceBuilds: 2, and Timeout: 600, so these zero-value checks can never trigger. The BuilderImage fallback is particularly misleading — it would set the value to "hypeman/builder:latest" for an empty string, but the builds manager treats both "" and "none" as "auto-build from Dockerfile," creating a semantic inconsistency.


cursor bot commented Feb 16, 2026

Bugbot Autofix prepared fixes for 1 of the 1 bugs found in the latest run.

  • ✅ Fixed: Dead fallback code due to changed defaults
    • Removed the unreachable fallback code for BuilderImage, MaxConcurrentBuilds, and DefaultTimeout since defaultConfig() now provides non-zero defaults, and the BuilderImage fallback was semantically incorrect (mapping empty to "hypeman/builder:latest" instead of treating it like "none" for auto-build).

To push these changes, comment:

@cursor push 94264b92bf
Preview (94264b92bf)
diff --git a/lib/providers/providers.go b/lib/providers/providers.go
--- a/lib/providers/providers.go
+++ b/lib/providers/providers.go
@@ -288,17 +288,6 @@
 		RegistrySecret:      cfg.JwtSecret, // Use same secret for registry tokens
 	}
 
-	// Apply defaults if not set
-	if buildConfig.MaxConcurrentBuilds == 0 {
-		buildConfig.MaxConcurrentBuilds = 2
-	}
-	if buildConfig.BuilderImage == "" {
-		buildConfig.BuilderImage = "hypeman/builder:latest"
-	}
-	if buildConfig.DefaultTimeout == 0 {
-		buildConfig.DefaultTimeout = 600
-	}
-
 	// Configure secret provider (use NoOpSecretProvider as fallback to avoid nil panics)
 	var secretProvider builds.SecretProvider
 	if cfg.Build.SecretsDir != "" {
