Architecture and internals
This page gives a high-level map of the repository for contributors and advanced users. It focuses on the structure that is clearly visible in the codebase.
Package layout
| Package | Responsibility |
|---|---|
| | CLI entrypoints for low-level workflows and cluster mode |
| | Training workflow and training loop |
| | Prediction workflow and export logic |
| | Evaluation workflow and metric export |
| | Dataset discovery, transforms, augmentations, and patching |
| | Model graph composition, optimizer/scheduler loaders, criterion routing |
| | Metrics, losses, and schedulers |
| | Config system, dataset helpers, distributed runtime utilities |
| | Standalone package for local/remote app execution and app server |
Two user-facing layers
KonfAI exposes two related but distinct usage layers.
1. Low-level framework mode
This is the konfai CLI:
TRAIN
RESUME
PREDICTION
EVALUATION
It operates directly on YAML files and is the best layer for experimentation, custom model development, and framework extension.
2. KonfAI Apps
This is the konfai-apps CLI and the remote server around it.
Apps package a stable workflow behind a simpler interface for:
local CLI usage
Python usage through konfai_apps.KonfAIApp
remote execution through konfai-apps-server
external clients such as Slicer integrations
Configuration-driven construction
One of KonfAI’s main design choices is that the runtime is built from YAML using constructor signatures and type annotations.
The main pieces are:
@config("...") to bind a class to an explicit configuration key when needed
apply_config(...) to instantiate objects recursively
classpath values to import custom implementations dynamically
The result is a constructor-driven configuration system rather than a fixed, static schema file.
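The idea of a constructor-driven configuration can be illustrated with a minimal sketch. This is not KonfAI's `apply_config` implementation; `build`, `import_from_classpath`, and the toy `Trainer` and `Optimizer` classes are hypothetical names used only to show how constructor signatures and type annotations can drive recursive instantiation from a nested mapping such as parsed YAML.

```python
import importlib
import inspect
from typing import Any, get_type_hints


def import_from_classpath(classpath: str) -> type:
    """Import 'package.module.ClassName' and return the class (hypothetical helper)."""
    module_name, _, class_name = classpath.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


def build(cls: type, cfg: dict) -> Any:
    """Instantiate cls from a nested dict, guided by its constructor signature."""
    hints = get_type_hints(cls.__init__)
    kwargs = {}
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        if name not in cfg:
            if param.default is inspect.Parameter.empty:
                raise KeyError(f"missing config key: {name}")
            continue  # fall back to the constructor default
        value = cfg[name]
        target = hints.get(name)
        if isinstance(value, dict) and inspect.isclass(target) and target is not dict:
            # A nested mapping is built recursively; an optional 'classpath'
            # entry replaces the annotated class with a dynamically imported one.
            value = dict(value)
            classpath = value.pop("classpath", None)
            if classpath:
                target = import_from_classpath(classpath)
            value = build(target, value)
        kwargs[name] = value
    return cls(**kwargs)


class Optimizer:  # toy class for illustration only
    def __init__(self, lr: float = 1e-3):
        self.lr = lr


class Trainer:  # toy class for illustration only
    def __init__(self, optimizer: Optimizer, epochs: int = 10):
        self.optimizer = optimizer
        self.epochs = epochs


trainer = build(Trainer, {"optimizer": {"lr": 0.01}, "epochs": 5})
print(trainer.epochs, trainer.optimizer.lr)
```

In this sketch the set of valid keys is whatever the constructors accept, so adding a constructor parameter extends the configuration without editing a separate schema.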
Execution flow
At a high level, low-level workflows follow this path:
Parse CLI arguments in konfai.main
Set runtime environment variables and output directories
Instantiate the root object (Trainer, Predictor, or Evaluator)
Build datasets, transforms, models, losses, and schedulers from YAML
Execute the workflow
Write checkpoints, logs, predictions, or metrics
See also Execution flow.
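The path above can be sketched as a small dispatcher. Everything here is an assumption made for illustration: the `--config` and `--output-dir` flags, the `SKETCH_OUTPUT_DIR` variable, the mapping from command names to root objects, and the stand-in classes are hypothetical and do not reproduce the real `konfai.main` entrypoint.

```python
import argparse
import os
from pathlib import Path


class Workflow:
    """Stand-in for a root object; the real classes build their parts from YAML."""

    def __init__(self, config_path: str):
        self.config_path = config_path

    def run(self) -> str:
        return f"{type(self).__name__} ran with {self.config_path}"


class Trainer(Workflow):
    pass


class Predictor(Workflow):
    pass


class Evaluator(Workflow):
    pass


# Assumed mapping from command name to root object (RESUME omitted for brevity).
ROOTS = {"TRAIN": Trainer, "PREDICTION": Predictor, "EVALUATION": Evaluator}


def main(argv: list) -> str:
    # Step 1: parse CLI arguments (hypothetical flags).
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument("command", choices=sorted(ROOTS))
    parser.add_argument("--config", default="config.yml")
    parser.add_argument("--output-dir", default="output")
    args = parser.parse_args(argv)

    # Step 2: expose runtime directories through the environment
    # (hypothetical variable name).
    os.environ["SKETCH_OUTPUT_DIR"] = str(Path(args.output_dir))

    # Steps 3 to 5: instantiate the root object and execute the workflow.
    root = ROOTS[args.command](args.config)
    return root.run()


print(main(["TRAIN", "--config", "train.yml"]))
```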
Model graph composition
The model system is not limited to a single monolithic network. In
konfai.network.network, a model can be composed from named modules with
explicit branch routing and criterion attachment.
This is what enables patterns such as:
multi-branch architectures
multiple output heads
nested discriminators and generators
patch-wise accumulation paths such as ;accu;
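A toy version of named modules with explicit routing and per-output criteria might look like the following. It uses plain Python callables instead of PyTorch modules, and `ModelGraph`, `add_module`, and `add_criterion` are hypothetical names rather than the API of `konfai.network.network`.

```python
from typing import Callable


class ModelGraph:
    """Toy graph: named modules, explicit input routing, per-output criteria."""

    def __init__(self) -> None:
        self.modules = []   # (name, callable, input names), in execution order
        self.criteria = {}  # output name -> loss function

    def add_module(self, name: str, fn: Callable, inputs: list) -> None:
        self.modules.append((name, fn, inputs))

    def add_criterion(self, output: str, criterion: Callable) -> None:
        self.criteria[output] = criterion

    def forward(self, x, targets: dict):
        values = {"input": x}
        for name, fn, inputs in self.modules:
            # Each module reads the named outputs it was routed from.
            values[name] = fn(*(values[i] for i in inputs))
        losses = {
            out: crit(values[out], targets[out])
            for out, crit in self.criteria.items()
        }
        return values, losses


graph = ModelGraph()
graph.add_module("encoder", lambda x: x * 2, ["input"])
graph.add_module("head_a", lambda h: h + 1, ["encoder"])  # first branch
graph.add_module("head_b", lambda h: h - 1, ["encoder"])  # second branch
graph.add_module("merge", lambda a, b: a + b, ["head_a", "head_b"])
graph.add_criterion("head_a", lambda y, t: abs(y - t))
graph.add_criterion("head_b", lambda y, t: abs(y - t))

values, losses = graph.forward(3.0, {"head_a": 7.5, "head_b": 5.0})
print(values["merge"], losses)
```

Attaching a criterion to a named output rather than to the whole model is what lets several heads or nested sub-networks each carry their own loss.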
Dataset and patching layers
KonfAI has two patching levels visible in the codebase:
dataset patching in konfai.data.patching.DatasetPatch
model patching in konfai.data.patching.ModelPatch
The synthesis GAN example uses both levels, which is why it is a good reference for advanced users.
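As a rough illustration of what patching involves at either level, the sketch below splits a 1D signal into overlapping patches and reassembles it by averaging the overlapping samples. It is a generic technique under simplifying assumptions (1D data, fixed patch size, overlap smaller than the patch), and the function names are hypothetical; it does not reproduce `DatasetPatch` or `ModelPatch`.

```python
def patch_starts(length: int, patch: int, overlap: int) -> list:
    """Start indices of patches covering [0, length); assumes overlap < patch."""
    if patch >= length:
        return [0]
    stride = patch - overlap
    starts = list(range(0, length - patch, stride))
    starts.append(length - patch)  # last patch sits flush with the end
    return starts


def split(signal: list, patch: int, overlap: int) -> list:
    """Return (start, values) pairs, one per patch."""
    return [
        (start, signal[start:start + patch])
        for start in patch_starts(len(signal), patch, overlap)
    ]


def reassemble(patches: list, length: int) -> list:
    """Accumulate patch values and average wherever patches overlap."""
    total = [0.0] * length
    count = [0] * length
    for start, values in patches:
        for offset, value in enumerate(values):
            total[start + offset] += value
            count[start + offset] += 1
    return [t / c for t, c in zip(total, count)]


signal = [float(i) for i in range(10)]
patches = split(signal, patch=4, overlap=1)
restored = reassemble(patches, len(signal))
print([start for start, _ in patches], restored == signal)
```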
Distributed runtime
Training, prediction, and evaluation are launched through the distributed
runtime utilities in konfai.utils.runtime.
This is an internal detail that matters operationally:
GPU discovery is centralized
runtime directories are injected through environment variables
PyTorch distributed initialization is part of the startup path
The exact control flow is documented only where it is directly visible in code; the repository does not currently ship a separate design document for the distributed launcher.
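To make the three operational points concrete, here is a hedged sketch of how a launcher could derive per-process settings. `CUDA_VISIBLE_DEVICES` and the `MASTER_ADDR`, `MASTER_PORT`, `WORLD_SIZE`, `RANK`, and `LOCAL_RANK` variables are standard CUDA and PyTorch conventions; the function names, the single-node assumption, and `SKETCH_OUTPUT_DIR` are hypothetical and are not taken from `konfai.utils.runtime`.

```python
def discover_gpus(env: dict) -> list:
    """Read integer GPU indices from CUDA_VISIBLE_DEVICES (UUID entries are ignored)."""
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [int(item) for item in raw.split(",") if item.strip().isdigit()]


def rank_environment(rank: int, world_size: int, output_dir: str, port: int = 29500) -> dict:
    """Environment for one worker process on a single node."""
    return {
        "MASTER_ADDR": "127.0.0.1",
        "MASTER_PORT": str(port),
        "WORLD_SIZE": str(world_size),
        "RANK": str(rank),
        "LOCAL_RANK": str(rank),
        "SKETCH_OUTPUT_DIR": output_dir,  # hypothetical runtime directory variable
    }


def launch_plan(env: dict, output_dir: str) -> list:
    """One environment per discovered GPU, in rank order."""
    gpus = discover_gpus(env)
    return [rank_environment(rank, len(gpus), output_dir) for rank in range(len(gpus))]


plan = launch_plan({"CUDA_VISIBLE_DEVICES": "0,2"}, "output")
print(len(plan), plan[1]["RANK"], plan[1]["WORLD_SIZE"])
```

In a real launch, each worker would typically start with its own environment and then initialize PyTorch distributed from those variables.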
KonfAI Apps server architecture
The remote app server in konfai_apps.app_server adds:
upload handling
async job management
GPU semaphore-based scheduling
SSE log streaming
result packaging and download
optional bearer-token authentication
This layer is intentionally separate from the core low-level konfai CLI.
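Semaphore-based GPU scheduling, one of the features listed above, can be sketched with the standard library alone. This is a generic `asyncio` pattern with hypothetical names (`run_job`, `schedule`); it is not the code in `konfai_apps.app_server`, and the sleep stands in for a real workload.

```python
import asyncio


async def run_job(job_id: int, gpu_slots: asyncio.Semaphore, state: dict) -> int:
    async with gpu_slots:  # wait until a GPU slot is free
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        await asyncio.sleep(0.01)  # stand-in for the real workload
        state["running"] -= 1
    return job_id


async def schedule(n_jobs: int, n_gpus: int):
    # One semaphore permit per GPU bounds how many jobs run at once.
    gpu_slots = asyncio.Semaphore(n_gpus)
    state = {"running": 0, "peak": 0}
    done = await asyncio.gather(
        *(run_job(job_id, gpu_slots, state) for job_id in range(n_jobs))
    )
    return list(done), state["peak"]


finished, peak = asyncio.run(schedule(n_jobs=5, n_gpus=2))
print(finished, peak)
```

All five jobs are accepted immediately, but at most two hold a slot at any time; the rest wait in the queue without blocking the event loop, which leaves it free for uploads and log streaming.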