New Configuration Reload Framework
TL;DR
New traffic_ctl commands:

    # Monitor mode with a custom token
    $ traffic_ctl config reload -t deploy-v2.1 -m
    ✔ Reload scheduled [deploy-v2.1]
    ✔ [deploy-v2.1] ████████████████████ 11/11 success (245ms)

Failed reload, monitor mode:

    $ traffic_ctl config reload -t hotfix-ssl-cert -m
    ✔ Reload scheduled [hotfix-ssl-cert]
    ✗ [hotfix-ssl-cert] ██████████████░░░░░░ 9/11 fail (310ms)
      Details : traffic_ctl config status -t hotfix-ssl-cert

Failed reload, status report:

    $ traffic_ctl config status -t hotfix-ssl-cert
    ✗ Reload [fail] — hotfix-ssl-cert
      Started : 2025 Feb 17 14:30:10.500
      Finished: 2025 Feb 17 14:30:10.810
      Duration: 310ms
      ✔ 9 success ◌ 0 in-progress ✗ 2 failed (11 total)
      Tasks:
       ✔ ip_allow.yaml ·························· 18ms
       ✔ remap.config ··························· 42ms
       ✗ logging.yaml ·························· 120ms ✗ FAIL
       ✗ ssl_client_coordinator ················· 85ms ✗ FAIL
       ├─ ✔ sni.yaml ··························· 20ms
       └─ ✗ ssl_multicert.config ··············· 65ms ✗ FAIL
       ...

Inline YAML reload (runtime only, not persisted to disk):

    $ traffic_ctl config reload -d @ip_allow_new.yaml -t update-ip-rules -m
    ✔ Reload scheduled [update-ip-rules]
    ✔ [update-ip-rules] ████████████████████ 1/1 success (18ms)
    Note: Inline configuration is NOT persisted to disk. Server restart will revert to file-based configuration.

The -d flag accepts @filename to read from a file, or @- to read from stdin. The YAML file uses registry keys as top-level keys: the key string passed as the first argument to register_config() or register_record_config(). The content under each key is the actual YAML that the config file normally contains; it is passed as-is to the handler via ctx.supplied_yaml(). A single file can target multiple handlers.
New traffic_ctl Commands

- traffic_ctl config reload
- traffic_ctl config reload -m
- traffic_ctl config reload -s -l
- traffic_ctl config reload -t <token>
- traffic_ctl config reload -d @file.yaml
- traffic_ctl config reload -d @-
- traffic_ctl config reload --force
- traffic_ctl config status
- traffic_ctl config status -t <token>
- traffic_ctl config status -c all

All commands support --format json to output the raw JSONRPC response instead of human-readable text. This is useful for automation, CI pipelines, or any tool that consumes structured output directly:

    $ traffic_ctl config status -t reload1 --format json
    {"tasks": [{"config_token": "reload1", "status": "success", "description": "Main reload task - 2026 Feb 18 19:46:02", ...}]}

Full JSON output

    {
      "tasks": [
        {
          "config_token": "reload1",
          "status": "success",
          "description": "Main reload task - 2026 Feb 18 20:03:10",
          "filename": "",
          "meta": {
            "created_time_ms": "1771444990585",
            "last_updated_time_ms": "1771444991015",
            "main_task": "true"
          },
          "log": [],
          "sub_tasks": [
            {
              "config_token": "reload1",
              "status": "success",
              "description": "ip_allow",
              "filename": "/opt/ats/etc/trafficserver/ip_allow.yaml",
              "meta": {
                "created_time_ms": "1771444991013",
                "last_updated_time_ms": "1771444991015",
                "main_task": "false"
              },
              "log": [],
              "logs": [
                "Finished loading"
              ],
              "sub_tasks": []
            },
            {
              "config_token": "reload1",
              "status": "success",
              "description": "ssl_ticket_key",
              "filename": "",
              "meta": {
                "created_time_ms": "1771444991015",
                "last_updated_time_ms": "1771444991015",
                "main_task": "false"
              },
              "log": [],
              "logs": [
                "SSL ticket key reloaded"
              ],
              "sub_tasks": []
            }
          ]
        }
      ]
    }

New JSONRPC APIs

- admin_config_reload: […] configs param is present. Params: token, force, configs.
- get_reload_config_status: […] token or get the last N reloads via count.

Inline reload RPC example:
Background: Issues with the Previous Reload Mechanism
The previous configuration reload relied on a loose collection of independent record callbacks (RecRegisterConfigUpdateCb) wired through FileManager and AddConfigFilesHere.cc. Each config module registered its file independently, and reloads were fire-and-forget:

- […] ran, or how long each one took.
- […] a "reload session" grouping all config updates triggered by a single request.
- […] to push YAML content at runtime through the RPC or CLI.
- […] AddConfigFilesHere.cc (for FileManager) and individual modules (for record callbacks), making it hard to reason about which files were tracked and which records triggered reloads.
- […] status of a specific reload or distinguish between overlapping reloads.
What the New Design Solves
- […] (in_progress, success, fail) through a ConfigContext. Results are aggregated into a task tree with per-handler timings and logs.
- ConfigRegistry is the single source of truth for all config files, their filename records, trigger records, and reload handlers.
- […] ConfigSource::FileAndRpc) can receive YAML content directly through the RPC, without writing to disk. This is runtime-only: the content lives in memory and is lost on restart.
- ReloadCoordinator manages the lifecycle of each reload: token generation, concurrency control (--force to override), timeout detection, and history.
- traffic_ctl config reload -m shows a live progress bar. traffic_ctl config status provides a full post-mortem with task tree, durations, and failure details.
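The concurrency-control point can be pictured with a small, self-contained sketch. MiniCoordinator is a hypothetical stand-in, not the ReloadCoordinator class from this PR; it assumes that a second reload is rejected while one is active unless force is set, and the generated token format is purely illustrative.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the coordinator's admission logic (not the ATS class).
class MiniCoordinator
{
public:
  // Returns the token of the scheduled reload, or nullopt when a reload is
  // already active and force is false.
  std::optional<std::string> schedule(std::string token, bool force = false)
  {
    if (active_ && !force) {
      return std::nullopt; // reject: a reload is already in progress
    }
    if (token.empty()) {
      token = "reload-" + std::to_string(++counter_); // illustrative token format
    }
    active_ = token;
    return token;
  }

  // Called once the active reload reaches a terminal state.
  void finish() { active_.reset(); }

  bool busy() const { return active_.has_value(); }

private:
  std::optional<std::string> active_; // token of the in-flight reload, if any
  int counter_ = 0;                   // used only for generated tokens
};
```

In this model, passing force simply lets the new request replace the in-flight one; the real coordinator may handle the overridden reload differently.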
Basic Design
Key components:
- ConfigRegistry: […] FileManager.
- ReloadCoordinator
- ConfigReloadTask
- ConfigContext: […] in_progress(), complete(), fail(), log(), supplied_yaml(), and add_dependent_ctx(). Safe no-op at startup (no active reload task).
- ConfigReloadProgress: […] TIMEOUT.

Stuck reload checker:

ConfigReloadProgress is a periodic continuation scheduled on ET_TASK. It monitors active reload tasks and marks any that exceed the configured timeout as TIMEOUT. This acts as a safety net for handlers that fail to call ctx.complete() or ctx.fail(), for example if a handler crashes, deadlocks, or its deferred thread never executes. The checker reads proxy.config.admin.reload.timeout dynamically at each interval, so the timeout can be adjusted at runtime without a restart. This is a simple record read (RecGetRecordString), not an expensive operation. Setting the timeout to "0" disables it (tasks will run indefinitely until completion).

The checker is not a global poller: a new instance is created per reload and self-terminates once the task reaches a terminal state. There is no idle polling when no reload is in progress.
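One tick of such a checker can be sketched as follows. This is a simplified model, not the ConfigReloadProgress continuation itself: Task, State, and check_once are hypothetical names, and the caller is assumed to re-read the timeout before every tick and pass it in.

```cpp
#include <cassert>
#include <chrono>

enum class State { InProgress, Success, Fail, Timeout };

// Hypothetical minimal task record: current state plus start time.
struct Task {
  State state;
  std::chrono::steady_clock::time_point started;
};

inline bool
is_terminal(State s)
{
  return s != State::InProgress;
}

// One tick of the checker. Returns true while the checker should keep
// rescheduling itself, false once it can self-terminate.
// A timeout of zero disables the check, mirroring the "0" setting.
inline bool
check_once(Task &task, std::chrono::steady_clock::time_point now, std::chrono::milliseconds timeout)
{
  if (is_terminal(task.state)) {
    return false; // task already finished: nothing left to watch
  }
  if (timeout.count() > 0 && now - task.started >= timeout) {
    task.state = State::Timeout; // handler never reported a terminal state
    return false;
  }
  return true; // still running and within budget
}
```

Because the timeout is an argument rather than cached state, changing it between ticks takes effect on the next tick, which is the behavior described for the dynamic record read.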
How Handlers Work
Before: scattered registration (ip_allow example)

Registration was split across multiple files with no centralized tracking.

Now: each module self-registers with full tracing

Each module registers itself directly with ConfigRegistry. There is no separate AddConfigFilesHere.cc entry; the registry handles FileManager registration, record callbacks, and status tracking automatically.

Additional triggers can be attached from any module at any time.
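To make the "single registry" idea concrete, here is a toy model. MiniRegistry, Entry, Ctx, attach_trigger, and fire are all hypothetical names, and the register_config signature below is an assumption for illustration only; it is not the signature used by ConfigRegistry in this PR.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Ctx {}; // placeholder for the per-reload context

// Everything known about one config lives in a single entry.
struct Entry {
  std::string filename;
  std::vector<std::string> trigger_records;
  std::function<void(Ctx &)> handler;
};

class MiniRegistry
{
public:
  // Assumed shape: key, file name, reload handler.
  void register_config(const std::string &key, std::string filename, std::function<void(Ctx &)> handler)
  {
    entries_[key] = Entry{std::move(filename), {}, std::move(handler)};
  }

  // Attach an extra trigger record to an already registered key.
  bool attach_trigger(const std::string &key, std::string record)
  {
    auto it = entries_.find(key);
    if (it == entries_.end()) {
      return false;
    }
    it->second.trigger_records.push_back(std::move(record));
    return true;
  }

  // Invoke every handler whose entry lists this record; returns how many ran.
  int fire(const std::string &record, Ctx &ctx) const
  {
    int invoked = 0;
    for (const auto &kv : entries_) {
      for (const auto &r : kv.second.trigger_records) {
        if (r == record) {
          kv.second.handler(ctx);
          ++invoked;
          break;
        }
      }
    }
    return invoked;
  }

private:
  std::map<std::string, Entry> entries_;
};
```

The point of the sketch is only that file name, triggers, and handler are looked up through one keyed structure rather than being spread over several files.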
Composite configs can declare file dependencies and dependency keys. For example, SSLClientCoordinator owns sni.yaml and ssl_multicert.config as children.

Handler interaction with ConfigContext:

Each config module implements a C++ reload handler, which is the callback passed to register_config(). The handler reports progress through the ConfigContext.

When a reload fires, the handler receives a ConfigContext:

- File-based reload: ctx.supplied_yaml() is undefined; the handler reads from its registered file on disk.
- Inline reload: ctx.supplied_yaml() contains the YAML node passed via --data / RPC. The content is runtime-only and is never written to disk.

Handlers report progress by calling in_progress(), complete(), fail(), and log() on the context.
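A handler following this pattern might look roughly like the sketch below. MockContext is a hypothetical stand-in for ConfigContext: supplied YAML is modeled as an optional string rather than a YAML node, fail() is assumed to accept a message, and parse_rules is a made-up placeholder for the module's real loader.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

enum class Status { Created, InProgress, Success, Fail };

// Hypothetical stand-in for ConfigContext (not the real class).
struct MockContext {
  Status status = Status::Created;
  std::vector<std::string> logs;
  std::optional<std::string> yaml; // set only for inline (-d / RPC) reloads

  void in_progress() { status = Status::InProgress; }
  void complete() { status = Status::Success; }
  void fail(const std::string &why)
  {
    logs.push_back(why);
    status = Status::Fail;
  }
  void log(const std::string &msg) { logs.push_back(msg); }
  const std::optional<std::string> &supplied_yaml() const { return yaml; }
};

// Placeholder loader: treats any non-empty text as valid.
inline bool
parse_rules(const std::string &text)
{
  return !text.empty();
}

// file_contents stands for whatever the handler would read from its registered file.
inline void
reload_handler(MockContext &ctx, const std::string &file_contents)
{
  ctx.in_progress();
  const bool from_rpc     = ctx.supplied_yaml().has_value();
  const std::string &text = from_rpc ? *ctx.supplied_yaml() : file_contents;
  ctx.log(from_rpc ? "using supplied YAML" : "reading registered file");
  if (!parse_rules(text)) {
    ctx.fail("empty or invalid configuration");
    return;
  }
  ctx.complete(); // every path above ends in complete() or fail()
}
```

Note that both exits from the handler leave the context in a terminal state, which is what the tracing model expects.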
Supplied YAML: inline content via -d / RPC

When a handler opts into ConfigSource::FileAndRpc, it can receive YAML content directly instead of reading from disk. The handler checks ctx.supplied_yaml() to determine the source.

For composite configs (e.g., SSLClientCoordinator), handlers create child contexts to track each sub-config independently, as SSLClientCoordinator::reconfigure() does.

The parent task automatically aggregates status from its children. In traffic_ctl config status, this renders as a tree, as in the ssl_client_coordinator entry of the failed-reload status report in the TL;DR.
Design Challenges
1. Handlers must reach a terminal state — or the task hangs
The entire tracing model relies on handlers calling ctx.complete() or ctx.fail() before returning. If a handler returns without reaching a terminal state, the task stays IN_PROGRESS until the timeout checker marks it as TIMEOUT.

After execute_reload() calls the handler, it checks ctx.is_terminal() and emits a warning if the handler left the task in a non-terminal state.
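The post-call check can be sketched like this. It is a simplified model under stated assumptions: MockContext and run_handler are hypothetical, and warnings are collected in a vector instead of going through the server's logging facilities.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

enum class Status { Created, InProgress, Success, Fail, Timeout };

// Hypothetical stand-in exposing only what the check needs.
struct MockContext {
  Status status = Status::Created;

  bool is_terminal() const
  {
    return status == Status::Success || status == Status::Fail || status == Status::Timeout;
  }
};

// Runs the handler, then reports a warning if it returned in a non-terminal state.
inline std::vector<std::string>
run_handler(const std::string &key, const std::function<void(MockContext &)> &handler, MockContext &ctx)
{
  std::vector<std::string> warnings;
  handler(ctx);
  if (!ctx.is_terminal()) {
    warnings.push_back("handler '" + key + "' returned without calling complete() or fail()");
  }
  return warnings;
}
```

The warning does not change the task state; in this model the task is still left for the timeout checker to resolve.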
The safety net: ConfigReloadProgress runs periodically on ET_TASK and marks stuck tasks as TIMEOUT after the configured duration (proxy.config.admin.reload.timeout, default: 1h).

2. Parent status aggregation from sub-tasks

Parent tasks do not track their own status directly; they derive it from their children. When a child calls complete() or fail(), it notifies its parent, which re-evaluates:

- […] FAIL
- […] IN_PROGRESS
- […] SUCCESS

This aggregation is recursive: a sub-task can have its own children (e.g., ssl_client_coordinator → sni + ssl_multicert), and status bubbles up through the tree.

One subtle issue: if a handler creates child contexts but forgets to call complete() or fail() on one of them, that child stays CREATED and the parent never reaches SUCCESS. It is the handler developer's responsibility to ensure every ConfigContext (and its children) reaches a terminal state (complete() or fail()). The timeout checker is the ultimate safety net for cases where this is not properly handled.
3. Startup vs. reload — same handler, different context
Handlers are called both at startup (initial config load) and during runtime reloads. At startup, there is no active ReloadCoordinator task, so all ConfigContext operations (in_progress(), complete(), fail(), log()) are safe no-ops: they check _task.lock() and return immediately if the weak pointer is expired or empty.

This avoids having two separate code paths for startup vs. reload. The handler logic is identical in both cases.
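The weak-pointer pattern behind these no-ops can be sketched as follows. Context and Task are hypothetical minimal types, not the classes from this PR; only the lock-then-act idea is taken from the description above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical task record owned by the reload machinery.
struct Task {
  std::string state = "created";
  std::vector<std::string> logs;
};

class Context
{
public:
  Context() = default; // startup path: no task attached
  explicit Context(const std::shared_ptr<Task> &task) : task_(task) {}

  // Each operation locks the weak pointer and silently does nothing
  // when there is no live task behind it.
  void in_progress() { if (auto t = task_.lock()) { t->state = "in_progress"; } }
  void complete() { if (auto t = task_.lock()) { t->state = "success"; } }
  void fail() { if (auto t = task_.lock()) { t->state = "fail"; } }
  void log(const std::string &msg) { if (auto t = task_.lock()) { t->logs.push_back(msg); } }

private:
  std::weak_ptr<Task> task_;
};

// The same handler body serves both startup and reload.
inline void
load_config(Context ctx)
{
  ctx.in_progress();
  ctx.log("Finished loading");
  ctx.complete();
}
```

Calling load_config(Context{}) models the startup path, while load_config(Context{task}) models a tracked reload; the handler does not need to know which one it is in.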
4. Known issue: ssl_client_coordinator may appear twice in reload status

ssl_client_coordinator registers multiple trigger records and file dependencies, each wiring an independent on_record_change callback with no deduplication. When several of these fire during the same reload (e.g., both sni.yaml and ssl_multicert.config changed), the handler executes more than once, producing duplicate entries in the reload status output.

This is a pre-existing issue present on master; see #11724.

5. Plugin support

Plugins are not supported by ConfigRegistry in this PR. The legacy reload notification mechanism (TSMgmtUpdateRegister) still works: plugins registered through it will continue to be invoked via FileManager::invokeConfigPluginCallbacks() during every reload cycle.

A dedicated plugin API to let plugins register their own config handlers and participate in the reload framework will be addressed in a separate PR.
Configs Migrated to ConfigRegistry

- ip_allow: ip_allow.yaml
    - ([…] ip_allow): ip_categories.yaml
- cache_control: cache.config
- cache_hosting: hosting.config
- parent_proxy: parent.config
- split_dns: splitdns.config
- remap: remap.config¹
- logging: logging.yaml
- ssl_client_coordinator
    - ([…] ssl_client_coordinator): sni.yaml
    - ([…] ssl_client_coordinator): ssl_multicert.config
- ssl_ticket_key

¹ Remap migration will be refactored after #12813 (remap.yaml) and #12669 (virtual hosts) land.
New Configuration Records
TODO
- […] traffic_ctl config reload / config status commands and the JSONRPC APIs.
- […] ConfigRegistry. Remaining work: fully log detailed errors via ctx.log(), ctx.fail(), etc.
- […] ConfigSource::FileOnly and record-only handlers use ConfigSource::RecordOnly. Migrate file-based handlers to FileAndRpc so they can read YAML directly from the RPC (via ctx.supplied_yaml()).
- […] ConfigUpdateHandler/ConfigUpdateContinuation and the remaining registerFile() calls in AddConfigFilesHere.cc.
- […] AddConfigFilesHere.cc into ConfigRegistry: the remaining static files (storage.config, volume.config, plugin.config, etc.) can be registered in ConfigRegistry as inventory-only entries (no handler, no reload) to fully retire AddConfigFilesHere.cc.
- […] traffic_ctl config status -t <token>) instead of grepping log files.
- […] trigger_records/RecRegisterConfigUpdateCb) are not currently tracked. Create a main task with a synthetic token so they appear in traffic_ctl config status.

Dependencies and Related Issues

Fixes #12324: Improving traffic_ctl config reload.

This PR will likely land after: […]

There should be no major conflicts with those PRs. Discussion and coordination are needed before merging.