konfai package#

Subpackages#

Submodules#

konfai.evaluator module#

Evaluation workflow classes and helpers for KonfAI.

class konfai.evaluator.CriterionsLoader(criterions_loader={'default|torch:nn:CrossEntropyLoss|Dice|NCC': None})[source]#

Bases: object

Loader for multiple criterion modules to be applied between a model output and one or more targets.

Each loss module (e.g., Dice, CrossEntropy, NCC) is dynamically loaded using its fully-qualified classpath. Evaluation criteria carry no per-criterion attributes, so the config value bound to each classpath is an unused placeholder (None).

Parameters:: criterions_loader (dict[str, Any]) – A mapping from module classpaths (as strings) to placeholder values. The module path is parsed and instantiated via get_module.

get_criterions(output_group, target_group)[source]#

Return type:: dict[Module, Any]

class konfai.evaluator.TargetCriterionsLoader(targets_criterions={'default': <konfai.evaluator.CriterionsLoader object>})[source]#

Bases: object

Loader class for handling multiple target groups with associated criterion configurations.

This class defines, for each target group, the set of criteria (e.g., Dice, BCE, MSE) to use during evaluation or training. Each target group corresponds to one or more loss functions, all linked to a specific model output.

Parameters:: targets_criterions (dict[str, CriterionsLoader]) – Dictionary mapping each target group name to a CriterionsLoader instance that defines its associated loss functions.

get_targets_criterions(output_group)[source]#

Retrieve the criterion modules and their attributes for a specific output group.

This function prepares the loss functions to be applied for a given model output, grouped by their target group.

Parameters:: output_group (str) – Name of the model output group (e.g., “output_segmentation”).
Returns:: A nested dictionary where the first key is the target group name, and the value is a dictionary mapping each loss module to its placeholder.
Return type:: dict[str, dict[Module, Any]]

class konfai.evaluator.Statistics(filename)[source]#

Bases: object

Utility class to accumulate, structure, and write evaluation metric results.

This class is used to:

- Collect metrics for each dataset sample.
- Compute aggregate statistics (mean, std, percentiles, etc.).
- Export all results in a structured JSON format, including both per-case and aggregate values.

Parameters:: filename (Path) – Path to the output JSON file that will store the final results.

add(values, name_dataset)[source]#

Add a set of metric values for a given dataset case.

Parameters:

values (dict[str, float]) – Dictionary of metric names and their values.
name_dataset (str) – Identifier (e.g., case name) for the sample.

Return type:

None

static get_statistic(values)[source]#

Compute statistical aggregates for a list of metric values.

Parameters:

values (list[float]) – Values to summarize.

Returns:

A dictionary containing:

max, min, std
25th, 50th, and 75th percentiles
mean and count

Return type:

dict[str, float]
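As an illustration of the aggregates listed above, here is a minimal standard-library sketch. It is not the KonfAI implementation: the function name summarize, the dictionary key names, the use of the population standard deviation, and the inclusive percentile interpolation are all assumptions made for this example.

```python
import statistics


def summarize(values: list[float]) -> dict[str, float]:
    """Hypothetical stand-in for Statistics.get_statistic (key names are assumed)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    if len(ordered) > 1:
        # Quartiles with linear interpolation between the closest ranks.
        q25, q50, q75 = statistics.quantiles(ordered, n=4, method="inclusive")
    else:
        # A single value is its own 25th, 50th, and 75th percentile.
        q25 = q50 = q75 = ordered[0]
    return {
        "max": ordered[-1],
        "min": ordered[0],
        "std": statistics.pstdev(ordered),  # population std (an assumption)
        "25pc": q25,
        "50pc": q50,
        "75pc": q75,
        "mean": statistics.fmean(ordered),
        "count": float(len(ordered)),
    }


summary = summarize([1.0, 2.0, 3.0, 4.0, 5.0])
```

The real method may name its keys differently or use a sample standard deviation, so check the JSON it produces before relying on specific keys.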

write(outputs)[source]#

Write the collected and aggregated statistics to the configured output file.

The output JSON structure contains:

- case: All individual metrics per sample.
- aggregates: Global statistics computed over all cases.

Parameters:: outputs (list[dict[str, dict[str, Any]]]) – List of metric dictionaries to merge and serialize.
Return type:: None
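The merge-then-serialize step can be sketched as follows. This is an assumption-laden illustration, not the library code: the shape of each entry in outputs (case name mapped to metric values), the contents of the aggregates section, the case and metric names, and the file name Metric.json are all hypothetical.

```python
import json
import statistics
import tempfile
from pathlib import Path

# Hypothetical per-process results: case name -> metric name -> value.
outputs = [
    {"case_001": {"Dice": 0.90, "NCC": 0.75}},
    {"case_002": {"Dice": 0.80, "NCC": 0.85}},
]

# Merge the per-process dictionaries into a single per-case mapping.
per_case: dict[str, dict[str, float]] = {}
for partial in outputs:
    per_case.update(partial)

# Regroup the values by metric name so each metric can be summarized.
by_metric: dict[str, list[float]] = {}
for metrics in per_case.values():
    for name, value in metrics.items():
        by_metric.setdefault(name, []).append(value)

report = {
    "case": per_case,
    "aggregates": {
        name: {"mean": statistics.fmean(vals), "count": len(vals)}
        for name, vals in by_metric.items()
    },
}

# Write to a temporary location and read it back to check the round trip.
out_file = Path(tempfile.mkdtemp()) / "Metric.json"
out_file.write_text(json.dumps(report, indent=4))
loaded = json.loads(out_file.read_text())
```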

read()[source]#

Return type:: dict[str, float]

class konfai.evaluator.Evaluator(train_name='default|TRAIN_01', metrics={'default': <konfai.evaluator.TargetCriterionsLoader object>}, dataset={'dataset_filenames': ['default|./Dataset:mha'], 'groups_src': {'default': {'default|group_dest': {'transforms': [], 'patch_transforms': []}}}, 'patch': None, 'use_cache': False, 'memory_budget': None, 'subset': <konfai.data.data_manager.PredictionSubset object>, 'batch_size': 1, 'validation': None, 'validation_augmentations': True, 'inline_augmentations': False, 'data_augmentations_list': {}})[source]#

Bases: DistributedObject

Distributed evaluation engine for computing metrics on model predictions.

This class handles the evaluation of predicted outputs using predefined metric loaders. It supports multi-output and multi-target configurations, computes aggregated statistics across training and validation datasets, and synchronizes results across processes.

Evaluation results are stored in JSON format and optionally displayed during iteration.

Parameters:

train_name (str) – Unique name of the evaluation run, used for logging and output folders.
metrics (dict[str, TargetCriterionsLoader]) – Dictionary mapping output groups to loaders of target metrics.
dataset (DataMetric) – Dataset provider configured for evaluation mode.

statistics_train#

Object used to store training evaluation metrics.

Type:: Statistics

statistics_validation#

Object used to store validation evaluation metrics.

Type:: Statistics

dataloader#

DataLoaders for training and validation sets.

Type:: list[DataLoader]

metric_path#

Path to the evaluation output directory.

Type:: str

metrics#

Instantiated metrics organized by output and target groups.

Type:: dict

setup(world_size)[source]#

Prepare the evaluator for distributed metric computation.

This method performs the following steps:

- Checks whether previous evaluation results exist and optionally overwrites them.
- Creates the output directory and copies the current configuration file for reproducibility.
- Loads the evaluation dataset according to the world size.

Parameters:: world_size (int) – Number of processes in the distributed evaluation setup.

update(batch_sample, statistics)[source]#

Compute metrics for a batch and update running statistics.

Parameters:

batch_sample (dict[str, BatchDataItem]) – The batch sample object containing tensors and their metadata.
statistics (Statistics) – The statistics object to update (train or validation).

Returns:

Dictionary of computed metric values with keys in the format: 'output_group:target_group:MetricName'.

Return type:

dict[str, float]
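The key format can be illustrated with a small sketch that flattens a nested result into such keys and splits them again. The nested layout, the group names, the values, and both helper functions are hypothetical; only the 'output_group:target_group:MetricName' format comes from the description above.

```python
# Hypothetical nested result: output group -> target group -> metric name -> value.
nested = {
    "output_segmentation": {
        "Labels": {"Dice": 0.91, "CrossEntropyLoss": 0.12},
    },
}


def flatten_metrics(results):
    """Flatten the nested mapping into 'output_group:target_group:MetricName' keys."""
    flat = {}
    for output_group, targets in results.items():
        for target_group, metrics in targets.items():
            for metric_name, value in metrics.items():
                flat[f"{output_group}:{target_group}:{metric_name}"] = value
    return flat


def split_metric_key(key):
    """Recover the three components; the metric name keeps any further colons."""
    output_group, target_group, metric_name = key.split(":", 2)
    return output_group, target_group, metric_name


flat = flatten_metrics(nested)
```

Splitting with a maximum of two splits keeps the sketch robust if a metric name itself contains a colon.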

run_process(world_size, global_rank, gpu, dataloaders)[source]#

Execute the distributed evaluation loop over the training and validation datasets.

This method iterates through the provided DataLoaders (train and optionally validation), updates the metric statistics using the configured metrics dictionary, and synchronizes the results across all processes. On global rank 0, the metrics are saved as JSON files.

Metrics are displayed in real-time using tqdm progress bars, showing a summary of the current batch’s computed values.

Parameters:

world_size (int) – Total number of distributed processes.
global_rank (int) – Global rank of the current process (used for writing results).
gpu (int) – Local GPU ID used for synchronization.
dataloaders (list[DataLoader]) – A list containing one or two DataLoaders:
- dataloaders[0] is used for training evaluation.
- dataloaders[1] (optional) is used for validation evaluation.

Notes

Only the main process (global_rank == 0) writes final results to disk.

konfai.evaluator.build_evaluate(evaluations_file=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/konfai/checkouts/latest/docs/source/Evaluation.yml'), evaluations_dir=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/konfai/checkouts/latest/docs/source/Evaluations'))[source]#

Build and return the configured evaluation workflow without executing it.

Parameters:

evaluations_file (Path | str) – Evaluation configuration file.
evaluations_dir (Path | str) – Directory where metrics and JSON reports are written.

Returns:

Configured evaluator object ready to be executed by the runtime wrapper.

Return type:

DistributedObject

konfai.evaluator.evaluate(overwrite=False, gpu=[], cpu=1, quiet=False, tensorboard=False, evaluations_file=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/konfai/checkouts/latest/docs/source/Evaluation.yml'), evaluations_dir=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/konfai/checkouts/latest/docs/source/Evaluations'))[source]#

Build and execute the configured evaluation workflow.

This compatibility wrapper preserves the historical CLI-facing API while delegating the pure build step to build_evaluate().

Return type:: DistributedObject

konfai.main module#

Command-line entrypoints for KonfAI workflows, apps, and services.

konfai.main.main()[source]#

Entry point for the konfai command-line interface.

This function builds the top-level CLI parser and delegates the full argument parsing and command dispatching to _run(parser).

Supported commands are:

- TRAIN
- RESUME
- PREDICTION
- EVALUATION

Notes

The actual execution logic is implemented in konfai.trainer.train, konfai.predictor.predict, and konfai.evaluator.evaluate.

konfai.main.cluster()[source]#

Entry point for running KonfAI with cluster-oriented CLI arguments.

This command extends the standard KonfAI CLI with a “Cluster manager arguments” group (job name, nodes, memory, time limit, resubmit), then delegates parsing and command dispatching to _run(parser).

Notes

This function only defines extra CLI arguments before delegating to _run.

konfai.predictor module#

Prediction workflow classes, reductions, and export helpers for KonfAI.

class konfai.predictor.Reduction[source]#

Bases: ABC

Aggregate a list of predictions (one per model in an ensemble, or per TTA augmentation) into one.

A Reduction is a KonfAI extension point: subclass it and reference it by classpath in the OutputDataset config. __call__ receives the stacked predictions and returns the aggregate.

voxel_local: bool = False#

Streamed-write contract. True declares this reduction a pure per-voxel operation over the model/TTA (stack) axis — every output voxel depends only on the SAME voxel of each input, never on a spatial neighbour. The streamed-write gate reads this flag to decide whether the finalize chain may run slab by slab.

Rules for a custom reduction:

- Default False. Leave it False unless you are sure: an unknown reduction then takes the whole-volume path, costing the streaming optimisation but never correctness.
- Set True only if voxel-local. Reducing/stacking along the stack axis (dim 0) or the channel axis (dim 1) is fine: those are orthogonal to the spatial slab axis. Anything that reads across spatial positions (a spatial blur, a resample, a global-argmax over Z) must stay False.
- A wrong True corrupts the streamed output (each slab would be reduced with only its own data); the gate trusts this flag and checks nothing else.
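The difference between a voxel-local reduction and one that reads across spatial positions can be shown without importing KonfAI. In this sketch, plain nested lists stand in for tensors, and mean_over_stack, scale_by_global_max, and run_slab_by_slab are hypothetical helpers written for illustration; they are not part of the library and do not reproduce its streaming code.

```python
# Two "models", each predicting a volume of 4 slabs with 2 voxels per slab.
stack = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]],
    [[3.0, 2.0], [1.0, 0.0], [5.0, 2.0], [1.0, 4.0]],
]


def mean_over_stack(volumes):
    """Voxel-local: each output voxel uses only the same voxel of every input."""
    return [
        [sum(voxels) / len(voxels) for voxels in zip(*slabs)]
        for slabs in zip(*volumes)
    ]


def scale_by_global_max(volumes):
    """Not voxel-local: every output voxel depends on the maximum of the whole volume."""
    mean = mean_over_stack(volumes)
    peak = max(max(slab) for slab in mean)
    return [[v / peak for v in slab] for slab in mean]


def run_slab_by_slab(reduction, volumes, slab_size):
    """Apply the reduction to independent chunks of slabs and stitch the results."""
    depth = len(volumes[0])
    out = []
    for start in range(0, depth, slab_size):
        chunk = [vol[start:start + slab_size] for vol in volumes]
        out.extend(reduction(chunk))
    return out


# The mean gives the same answer streamed or whole; the global scaling does not.
mean_streamed_ok = run_slab_by_slab(mean_over_stack, stack, 2) == mean_over_stack(stack)
scaled_streamed_ok = run_slab_by_slab(scale_by_global_max, stack, 2) == scale_by_global_max(stack)
```

Here mean_streamed_ok is True and scaled_streamed_ok is False, which is the kind of silent corruption a wrong True would cause.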

class konfai.predictor.Mean[source]#

Bases: Reduction

Average ensemble or augmentation predictions element-wise.

voxel_local: bool = True#

class konfai.predictor.Median[source]#

Bases: Reduction

Compute the element-wise median across prediction tensors.

voxel_local: bool = True#

class konfai.predictor.Concat[source]#

Bases: Reduction

Concatenate prediction tensors along the channel dimension.

voxel_local: bool = True#

class konfai.predictor.OutputDataset(filename, group, before_reduction_transforms, after_reduction_transforms, final_transforms, patch_combine, reduction)[source]#

Bases: Dataset, NeedDevice, ABC

Abstract prediction sink that accumulates model outputs and writes them to disk.

Concrete subclasses define how layers are accumulated across patches, augmentations, and multiple models before the final prediction volume is materialized.

reduction: Reduction#

before_reduction_transforms: list[Transform]#

after_reduction_transforms: list[Transform]#

final_transforms: list[Transform]#

patch_combine: PathCombine | None#

output_layer_accumulator: dict[int, dict[int, Accumulator]]#

attributes: dict[int, dict[int, dict[int, Attribute]]]#

names: dict[int, str]#

finalize_writes()[source]#

Drain and stop the background writer; every submitted write is on disk when this returns.

Return type:: None

prepare(name_layer)[source]#

Return type:: None

set_datasets(datasets)[source]#

Return type:: None

abstractmethod setup(datasets, groups)[source]#

set_patch_config(patch_size, overlap, nb_data_augmentation)[source]#

Return type:: None

to(device)[source]#

abstractmethod add_layer(index_dataset, index_augmentation, index_patch, layer, dataset, attribute=None, number_of_channels_per_model=None)[source]#

is_done(index)[source]#

Return type:: bool

abstractmethod get_output(index, number_of_channels_per_model, dataset)[source]#

Return type:: Tensor

write_prediction(index, name, layer)[source]#

Return type:: None

reset()[source]#

Drop every in-flight accumulation (the OOM-restart path re-runs the rank’s cases from scratch).

Return type:: None

class konfai.predictor.OutSameAsGroupDataset(same_as_group='default', dataset_filename='default|./Dataset:mha', group='default', before_reduction_transforms={'default|Normalize': <konfai.data.transform.TransformLoader object>}, after_reduction_transforms={'default|Normalize': <konfai.data.transform.TransformLoader object>}, final_transforms={'default|Normalize': <konfai.data.transform.TransformLoader object>}, patch_combine=None, reduction='Mean')[source]#

Bases: OutputDataset

Output dataset that mirrors the geometry and transform chain of an input group.

This is the default output writer used by KonfAI prediction workflows.

add_layer(index_dataset, index_augmentation, index_patch, layer, dataset, attribute=None, number_of_channels_per_model=None)[source]#

reset()[source]#

Drop every in-flight accumulation (the OOM-restart path re-runs the rank’s cases from scratch).

Return type:: None

setup(datasets, groups)[source]#

get_output(index, number_of_channels_per_model, dataset)[source]#

Return type:: Tensor

class konfai.predictor.OutputDatasetLoader(name_class='OutSameAsGroupDataset')[source]#

Bases: object

Factory that instantiates output dataset classes from predictor config.

get_output_dataset(layer_name)[source]#

Return type:: OutputDataset

class konfai.predictor.ModelComposite(model, combine)[source]#

Bases: Network

A composite model that replicates a given base network multiple times and combines their outputs.

This class is designed to handle model ensembles or repeated predictions from the same architecture. It creates nb_models deep copies of the input model, each with its own name and output branch, and aggregates their outputs using a provided Reduction strategy (e.g., mean, median).

Parameters:

model (Network) – The base network to replicate.
nb_models (int) – Number of copies of the model to create.
combine (Reduction) – The reduction method used to combine outputs from all model replicas.

combine#

The reduction method used during forward inference.

Type:: Reduction

load(state_sources)[source]#

Load weights for each sub-model in the composite from the corresponding state dictionaries.

Parameters:: state_sources (list[dict[str, Any] | Path | str]) – One checkpoint source per model replica. May be empty only for a model with no parameters, which is then run once as constructed; empty sources for a model that has trainable parameters are rejected here, so a caller cannot silently run with random weights.

forward(data_dict, output_layers=[])[source]#

Perform a forward pass on all model replicas and aggregate their outputs.

Parameters:

data_dict (dict[tuple[str, bool], Tensor]) – A dictionary mapping (group_name, requires_grad) to input tensors.
output_layers (list[str]) – List of output layer names to extract from each sub-model.

Returns:

Aggregated output for each layer, after applying the reduction.

Return type:

list[tuple[str, list[int], Tensor]]
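The replicate-then-reduce idea can be sketched with plain Python callables in place of networks. This is only an illustration under stated assumptions: ensemble_forward and the three toy replicas are hypothetical, lists of floats stand in for tensors, and the real forward also handles layer names and patch indices that are omitted here.

```python
from statistics import fmean, median


def ensemble_forward(replicas, inputs, combine):
    """Run every replica on the same inputs, then reduce position by position."""
    predictions = [replica(inputs) for replica in replicas]  # one entry per replica
    return [combine(voxels) for voxels in zip(*predictions)]


# Three toy "replicas" that scale their input differently.
replicas = [
    lambda xs: [x * 1.0 for x in xs],
    lambda xs: [x * 2.0 for x in xs],
    lambda xs: [x * 6.0 for x in xs],
]

mean_out = ensemble_forward(replicas, [1.0, 2.0], fmean)
median_out = ensemble_forward(replicas, [1.0, 2.0], median)
```

Swapping the combine callable is the only change needed to move from a mean to a median, which mirrors how a Reduction is passed in as a strategy.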

class konfai.predictor.Predictor(model=<konfai.network.network.ModelLoader object>, dataset={'dataset_filenames': ['default|./Dataset'], 'groups_src': {'default': {'default|Labels': {'transforms': [], 'patch_transforms': []}}}, 'patch': <konfai.data.patching.DatasetPatch object>, 'use_cache': False, 'memory_budget': None, 'subset': <konfai.data.data_manager.PredictionSubset object>, 'batch_size': 1, 'validation': None, 'validation_augmentations': True, 'inline_augmentations': False, 'data_augmentations_list': {'DataAugmentation_0': <konfai.data.augmentation.DataAugmentationsList object>}}, combine='Mean', train_name='name', manual_seed=None, gpu_checkpoints=None, autocast=False, outputs_dataset={'default|Default': <konfai.predictor.OutputDatasetLoader object>}, data_log=None)[source]#

Bases: DistributedObject

KonfAI’s main prediction controller.

This class orchestrates the prediction phase by:

- Loading model weights from checkpoint(s) or URL(s)
- Preparing datasets and output configurations
- Managing distributed inference with optional multi-GPU support
- Applying transformations and saving predictions
- Optionally logging results to TensorBoard

model#

The neural network model to use for prediction.

Type:: Network

dataset#

Dataset manager for prediction data.

Type:: DataPrediction

combine_classpath#

Classpath of the reduction strategy (e.g., “Mean”).

Type:: str

autocast#

Whether to enable AMP inference.

Type:: bool

outputs_dataset#

Mapping from layer names to output writers.

Type:: dict[str, OutputDataset]

data_log#

List of tensors to log during inference.

Type:: list[str] | None

setup(world_size)[source]#

Set up the predictor for inference.

This method performs all necessary initialization steps before running predictions:

- Ensures output directories exist, and optionally prompts the user before overwriting existing predictions.
- Copies the current configuration file (Prediction.yml) into the output directory for reproducibility.
- Dynamically loads pretrained weights from local files or remote URLs.
- Wraps the base model into a ModelComposite to support ensemble inference.
- Initializes the prediction dataloader, with proper distribution across available GPUs.

Parameters:: world_size (int) – Total number of processes or GPUs used for distributed prediction.

set_models(path_to_models)[source]#

Return type:: None

run_process(world_size, global_rank, local_rank, dataloaders)[source]#

Launch prediction on the given process rank.

Parameters:

world_size (int) – Number of model replicas sharding the data – the spawned process count already divided by the model-parallel size (gpu_checkpoints), NOT the GPU count.
global_rank (int) – Rank of the current process.
local_rank (int) – Local device rank.
dataloaders (list[DataLoader]) – List of data loaders for prediction.
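The distinction between world_size and the device count can be made concrete with a little arithmetic. The sketch below is one reading of the description above, not the library's code: data_world_size and shard are hypothetical helpers, the way model_parallel_size is derived from gpu_checkpoints is not shown, and the round-robin sharding is an assumption chosen for simplicity.

```python
def data_world_size(device_count, model_parallel_size):
    """Number of model replicas when each replica spans model_parallel_size devices."""
    if model_parallel_size < 1 or device_count % model_parallel_size:
        raise ValueError("device_count must be a positive multiple of model_parallel_size")
    return device_count // model_parallel_size


def shard(cases, rank, world_size):
    """Assign every world_size-th case to the given replica rank (assumed scheme)."""
    return cases[rank::world_size]


cases = ["case_0", "case_1", "case_2", "case_3", "case_4"]
world_size = data_world_size(device_count=4, model_parallel_size=2)
shards = [shard(cases, rank, world_size) for rank in range(world_size)]
```

With four devices and a model split across two of them, only two replicas share the data, so each sees roughly half of the cases.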

konfai.predictor.build_predict(models, prediction_file=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/konfai/checkouts/latest/docs/source/Prediction.yml'), predictions_dir=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/konfai/checkouts/latest/docs/source/Predictions'))[source]#

Build and return the configured prediction workflow without executing it.

Parameters:

models (list[Path]) – One or more checkpoint files to load for prediction.
prediction_file (Path | str) – Prediction configuration file.
predictions_dir (Path | str) – Directory where prediction outputs are written.

Returns:

Configured predictor object ready to be executed by the runtime wrapper.

Return type:

DistributedObject

konfai.predictor.predict(models, overwrite=False, gpu=[], cpu=1, quiet=False, tensorboard=False, prediction_file=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/konfai/checkouts/latest/docs/source/Prediction.yml'), predictions_dir=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/konfai/checkouts/latest/docs/source/Predictions'))[source]#

Build and execute the configured prediction workflow.

This compatibility wrapper preserves the historical CLI-facing API while delegating the pure build step to build_predict().

Return type:: DistributedObject

konfai.trainer module#

Training workflow entrypoints and orchestration for KonfAI.

class konfai.trainer.EarlyStoppingBase[source]#

Bases: object

Minimal protocol for early stopping strategies used by Trainer.

mode: str = 'min'#

is_stopped()[source]#

Return type:: bool

get_score(values)[source]#

is_better(score, reference)[source]#

True if score is strictly better than reference under this monitor’s direction.

Return type:: bool

property worst_score: float#

The worst possible score for this direction (a starting sentinel for best-tracking).

stop()[source]#

Return type:: None

class konfai.trainer.EarlyStopping(monitor=None, patience=10, min_delta=0.0, mode='min')[source]#

Bases: EarlyStoppingBase

Implements early stopping logic with configurable patience and monitored metrics.

monitor#

Metrics to monitor.

Type:: list[str]

patience#

Number of checks with no improvement before stopping.

Type:: int

min_delta#

Minimum change to qualify as improvement.

Type:: float

mode#

“min” or “max” depending on optimization direction.

Type:: str

best_score: float | None#

get_score(values)[source]#
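How patience, min_delta, and mode interact can be sketched with a small stand-alone class. PatienceTracker is a hypothetical illustration and not konfai.trainer.EarlyStopping: the step method, the counter attribute, and the exact comparison against min_delta are assumptions.

```python
class PatienceTracker:
    """Illustrative stand-in for an early stopping strategy (not the KonfAI class)."""

    def __init__(self, patience=10, min_delta=0.0, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.patience = patience
        self.min_delta = min_delta
        self.mode = mode
        self.best_score = None
        self.counter = 0  # consecutive checks without improvement
        self.stopped = False

    def is_better(self, score, reference):
        """True if score beats reference by more than min_delta in the chosen direction."""
        if self.mode == "min":
            return score < reference - self.min_delta
        return score > reference + self.min_delta

    def step(self, score):
        """Record one validation score and report whether training should stop."""
        if self.best_score is None or self.is_better(score, self.best_score):
            self.best_score = score
            self.counter = 0
        else:
            self.counter += 1
            if self.counter >= self.patience:
                self.stopped = True
        return self.stopped


tracker = PatienceTracker(patience=2, min_delta=0.01, mode="min")
history = [tracker.step(score) for score in [1.0, 0.9, 0.895, 0.91]]
```

In this run the third score improves by less than min_delta and the fourth is worse, so the tracker stops on the fourth check while keeping 0.9 as the best score.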

class konfai.trainer.Trainer(model=<konfai.network.network.ModelLoader object>, dataset={'dataset_filenames': ['default|./Dataset:mha'], 'groups_src': {'default|Labels': {'default|Labels': {'transforms': [], 'patch_transforms': []}}}, 'patch': <konfai.data.patching.DatasetPatch object>, 'use_cache': True, 'memory_budget': None, 'subset': <konfai.data.data_manager.TrainSubset object>, 'batch_size': 1, 'validation': 0.2, 'validation_augmentations': True, 'inline_augmentations': False, 'data_augmentations_list': {'DataAugmentation_0': <konfai.data.augmentation.DataAugmentationsList object>}}, train_name='default|TRAIN_01', manual_seed=None, epochs=100, it_validation=None, it_lr_update=None, autocast=False, gradient_checkpoints=None, gpu_checkpoints=None, ema_decay=0, data_log=None, early_stopping=None, save_checkpoint_mode='BEST')[source]#

Bases: DistributedObject

Public API for training a model using the KonfAI framework. Wraps setup, checkpointing, resuming, logging, and launching of the distributed _Trainer.

Main responsibilities:

- Initialization from config (via @config)
- Model and EMA setup
- Checkpoint loading and saving
- Distributed setup and launch

Parameters:

model (ModelLoader) – Loader for model architecture.
dataset (DataTrain) – Training/validation dataset.
train_name (str) – Training session name.
manual_seed (int | None) – Random seed.
epochs (int) – Number of epochs to run.
it_validation (int | None) – Validation interval.
it_lr_update (int | None) – Learning rate update interval.
autocast (bool) – Enable AMP training.
gradient_checkpoints (list[str] | None) – Modules to use gradient checkpointing on.
gpu_checkpoints (list[str] | None) – Modules to pin on specific GPUs.
ema_decay (float) – EMA decay factor.
data_log (list[str] | None) – Logging instructions.
early_stopping (EarlyStopping | None) – Optional early stopping config.
save_checkpoint_mode (str) – Either “BEST” or “ALL”.
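For readers unfamiliar with the ema_decay parameter, the usual exponential moving average update is sketched below on plain lists of floats. This shows the standard formula only; whether KonfAI applies exactly this update, and how it treats the default value of 0, is an assumption. The helper ema_update is hypothetical.

```python
def ema_update(ema_weights, weights, decay):
    """Standard EMA step: keep a decay share of the average, blend in the new weights."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


ema = [1.0, 0.0]
ema = ema_update(ema, [0.0, 1.0], decay=0.5)

# With decay=0 the average simply follows the latest weights.
followed = ema_update([1.0, 0.0], [0.25, 0.75], decay=0.0)
```

A decay close to 1 makes the averaged weights change slowly, which is the usual reason for evaluating with the EMA copy rather than the raw weights.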

override_lr: float | None#

model_ema: AveragedModel | None#

setup(world_size)[source]#

Initializes the training environment:

- Clears previous outputs (unless resuming)
- Initializes model and EMA
- Loads checkpoint (if resuming)
- Prepares dataloaders

Parameters:: world_size (int) – Total number of distributed processes.

set_model(path_to_model)[source]#

Return type:: None

set_lr(lr)[source]#

Return type:: None

run_process(world_size, global_rank, local_rank, dataloaders)[source]#

Launches the actual training process via the internal _Trainer class. Wraps the model with DDP or a CPU fallback, attaches EMA, and starts training.

Parameters:

world_size (int) – Number of model replicas sharding the data – the spawned process count already divided by the model-parallel size (gpu_checkpoints), NOT the GPU count.
global_rank (int) – Global rank of the current process.
local_rank (int) – Local rank within the node.
dataloaders (list[DataLoader]) – Training and validation dataloaders.

konfai.trainer.build_train(command=State.TRAIN, model=None, config=PosixPath('Config.yml'), checkpoints_dir=PosixPath('Checkpoints'), statistics_dir=PosixPath('Statistics'), lr=None)[source]#

Build and return the configured training workflow without executing it.

Parameters:

command (State) – Training command variant, typically State.TRAIN or State.RESUME.
model (Path | str | None) – Checkpoint path used when resuming training.
config (Path | str) – Training configuration file.
checkpoints_dir (Path | str) – Output directory for checkpoints.
statistics_dir (Path | str) – Output directory for statistics and logs.
lr (float | None) – Runtime learning-rate override applied when resuming/fine-tuning. When None the checkpoint learning rate is resumed and the scheduler continues; when set, the learning rate restarts from this value.

Returns:

Configured trainer object ready to be executed by the runtime wrapper.

Return type:

DistributedObject

konfai.trainer.train(command=State.TRAIN, overwrite=False, model=None, gpu=[], cpu=None, quiet=False, tensorboard=False, config=PosixPath('Config.yml'), checkpoints_dir=PosixPath('Checkpoints'), statistics_dir=PosixPath('Statistics'), lr=None)[source]#

Build and execute the configured training workflow.

This compatibility wrapper preserves the historical CLI-facing API while delegating the pure build step to build_train().

Return type:: DistributedObject

Module contents#

Top-level helpers and runtime utilities exposed by the KonfAI package.

konfai.checkpoints_directory()[source]#

Return the configured checkpoint output directory.

Return type:: Path

konfai.predictions_directory()[source]#

Return the configured prediction output directory.

Return type:: Path

konfai.evaluations_directory()[source]#

Return the configured evaluation output directory.

Return type:: Path

konfai.statistics_directory()[source]#

Return the configured statistics output directory.

Return type:: Path

konfai.config_file()[source]#

Return the active configuration file used by the current workflow.

Return type:: Path

konfai.konfai_state()[source]#

Return the current KonfAI workflow state stored in the environment.

Return type:: str

konfai.konfai_root()[source]#

Return the root configuration section name for the current workflow.

Return type:: str

class konfai.RemoteServer(host, port, token)[source]#

Bases: object

Connection settings for a remote KonfAI Apps server.

get_headers()[source]#

Return the HTTP headers required to talk to the remote server.

Return type:: dict[str, str]

get_url()[source]#

Return the base URL of the remote server.

Return type:: str

konfai.cuda_visible_devices()[source]#

Return the GPU indices visible to the current process.

Returns:: GPU ids exposed through CUDA_VISIBLE_DEVICES or detected by PyTorch.
Return type:: list[int]
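The environment-variable half of this lookup can be sketched with the standard library alone. parse_visible_devices is a hypothetical helper, not the KonfAI function: it ignores non-numeric entries such as GPU UUIDs or -1, and it omits the PyTorch fallback mentioned above.

```python
import os
from typing import Optional


def parse_visible_devices(raw: Optional[str]) -> list[int]:
    """Turn a CUDA_VISIBLE_DEVICES string such as '0,2' into a list of indices."""
    if raw is None:
        return []
    # Keep only purely numeric tokens; UUIDs and '-1' are skipped in this sketch.
    return [int(tok.strip()) for tok in raw.split(",") if tok.strip().isdigit()]


# Reads the real environment; the result depends on the machine.
visible = parse_visible_devices(os.environ.get("CUDA_VISIBLE_DEVICES"))
```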

konfai.get_available_devices(remote_server=None, timeout_s=2.0)[source]#

Return the available GPU indices and their display names.

Parameters:

remote_server (RemoteServer | None) – Remote server to query instead of the local machine.
timeout_s (float) – HTTP timeout used for remote requests.

Returns:

Available device indices and the corresponding device names.

Return type:

tuple[list[int], list[str]]

konfai.get_ram(remote_server=None, timeout_s=2.0)[source]#

Return used and total RAM in gigabytes.

Parameters:

remote_server (RemoteServer | None) – Remote server to query instead of the local machine.
timeout_s (float) – HTTP timeout used for remote requests.

Returns:

Used RAM and total RAM in gigabytes.

Return type:

tuple[float, float]

konfai.get_vram(devices, remote_server=None, timeout_s=2.0)[source]#

Return used and total VRAM in gigabytes for the selected devices.

Parameters:

devices (list[int]) – GPU indices to inspect.
remote_server (RemoteServer | None) – Remote server to query instead of the local machine.
timeout_s (float) – HTTP timeout used for remote requests.

Returns:

Used VRAM and total VRAM in gigabytes.

Return type:

tuple[float, float]

konfai.current_date()[source]#

Return the current timestamp formatted for KonfAI output folders.

Return type:: str

konfai.check_server(remote_server, timeout_s=2.0)[source]#

Check whether a remote KonfAI Apps server is reachable and healthy.

Parameters:

remote_server (RemoteServer) – Remote server connection settings.
timeout_s (float) – HTTP timeout used for the health check.

Returns:

A boolean success flag and a human-readable status message.

Return type:

tuple[bool, str]

konfai.check_konfai_install()[source]#

Check that KonfAI dependencies are importable.

Returns:: A pair containing a global success flag and a report dictionary with the keys missing, errors, and versions.
Return type:: tuple[bool, dict]
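A check of this shape can be sketched with importlib. This is a hedged illustration, not the KonfAI implementation: check_packages is a hypothetical helper, the value types stored under missing, errors, and versions are assumptions, and the package names used below are placeholders rather than KonfAI's real dependency list.

```python
import importlib
from importlib import metadata


def check_packages(names):
    """Try to import each package and collect a report with the documented keys."""
    report = {"missing": [], "errors": {}, "versions": {}}
    for name in names:
        try:
            importlib.import_module(name)
        except ModuleNotFoundError:
            report["missing"].append(name)
            continue
        except Exception as exc:  # installed but broken at import time
            report["errors"][name] = repr(exc)
            continue
        try:
            report["versions"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable but without distribution metadata (e.g. a stdlib module).
            report["versions"][name] = "unknown"
    ok = not report["missing"] and not report["errors"]
    return ok, report


ok, report = check_packages(["json", "a_package_that_does_not_exist_xyz"])
```

A caller that wants an exception instead of a flag could raise on a False result, which is the pattern assert_konfai_install describes.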

exception konfai.KonfAIPackagesError[source]#

Bases: RuntimeError

Raised when required Python packages for KonfAI are missing/broken.

konfai.assert_konfai_install()[source]#

Raise KonfAIPackagesError if the KonfAI dependency check fails.

Return type:: None