Optimization

optimize_anything is the Dreadnode-native frontend for text optimization. The first backend is GEPA, but the public surface stays in the SDK.

import dreadnode as dn
from dreadnode.optimization import EngineConfig, OptimizationConfig

def evaluate(candidate: str, example: dict[str, str]) -> float:
    return 1.0 if example["expected"] in candidate else 0.0

optimization = dn.optimize_anything(
    seed_candidate="Answer the question directly.",
    dataset=[
        {"question": "What is Dreadnode?", "expected": "Dreadnode"},
        {"question": "What is GEPA?", "expected": "GEPA"},
    ],
    valset=[
        {"question": "Name the SDK.", "expected": "Dreadnode"},
    ],
    objective="Improve a short answer prompt for factual responses.",
    evaluator=evaluate,
    config=OptimizationConfig(
        engine=EngineConfig(max_metric_calls=50),
    ),
)

result: dn.OptimizationResult = await optimization.run()
print(result.best_score, result.best_candidate)
  • Optimization is the executor returned by dn.optimize_anything(...).
  • OptimizationConfig holds backend-neutral engine, reflection, merge, refiner, and tracking settings.
  • OptimizationStart, OptimizationError, and OptimizationEnd are the initial lifecycle events.
  • OptimizationEvaluation lets evaluators return a score plus side information and traces.
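For example, an evaluator can attach side information alongside its score by returning a plain mapping, one of the accepted return shapes. This is a minimal sketch using only the mapping keys documented here; it does not construct an `OptimizationEvaluation` directly:

```python
def evaluate(candidate: str, example: dict[str, str]) -> dict:
    """Return a score plus side information as a plain mapping,
    one of the accepted evaluator return shapes."""
    hit = example["expected"] in candidate
    return {
        "score": 1.0 if hit else 0.0,
        "side_info": {"matched": hit, "candidate_length": len(candidate)},
    }

result = evaluate("Dreadnode is an SDK.", {"expected": "Dreadnode"})
print(result["score"], result["side_info"]["matched"])  # 1.0 True
```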

Use an adapter when you need GEPA’s full callback lifecycle, batched evaluation, or reflective datasets for agents and other multi-component systems.

Current adapter-mode support in Dreadnode:

  • Supported: reflection config, merge config, tracking config, stop_callbacks, and core engine settings such as max_metric_calls, seed, track_best_outputs, frontier_type, and val_evaluation_policy
  • Not yet supported: background and refiner configs, plus any adapter-mode engine knobs that GEPA’s api.optimize(...) does not expose in the current package surface
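Assuming the setting names above map one-to-one onto `EngineConfig` keyword arguments (a sketch, not verified against the package; the `frontier_type` and `val_evaluation_policy` values shown are hypothetical), a fully specified supported configuration might look like:

```python
from dreadnode.optimization import EngineConfig, OptimizationConfig

# Sketch only: field names are taken from the supported-settings list above
# and assumed to be keyword arguments on EngineConfig.
config = OptimizationConfig(
    engine=EngineConfig(
        max_metric_calls=100,
        seed=42,
        track_best_outputs=True,
        frontier_type="pareto",          # hypothetical value
        val_evaluation_policy="always",  # hypothetical value
    ),
    # reflection=..., merge=..., tracking=..., stop_callbacks=[...]
)
```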

Adapter-backed streams can emit IterationStart, CandidateAccepted, CandidateRejected, ValsetEvaluated, BudgetUpdated, ParetoFrontUpdated, and OptimizationEnd.
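A stream consumer can branch on the event type. The sketch below is pure Python and assumes only the event names listed above; the dict-shaped events stand in for the SDK's typed event objects:

```python
# Illustrative dispatcher over the adapter-mode event names listed above.
# Real streams emit typed event objects; dicts are used here as stand-ins.
HANDLED_EVENTS = {
    "IterationStart",
    "CandidateAccepted",
    "CandidateRejected",
    "ValsetEvaluated",
    "BudgetUpdated",
    "ParetoFrontUpdated",
    "OptimizationEnd",
}

def summarize(events: list[dict]) -> dict[str, int]:
    """Count how often each known event type appears in a stream."""
    counts = {name: 0 for name in HANDLED_EVENTS}
    for event in events:
        name = event.get("type")
        if name in counts:
            counts[name] += 1
    return counts

stream = [
    {"type": "IterationStart"},
    {"type": "CandidateAccepted"},
    {"type": "OptimizationEnd"},
]
print(summarize(stream)["CandidateAccepted"])  # 1
```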

Hosted optimization is a separate control-plane path from local optimize_anything(...) runs. V1 is intentionally narrow:

  • backend: gepa
  • target: capability_agent
  • optimizable components: ["instructions"]
  • reward input: declarative RewardRecipe

Use the API client when you want the platform to provision the runtime and execute the job:

from dreadnode import Dreadnode
from dreadnode.app.api.models import (
    CapabilityRef,
    CreateGEPAOptimizationJobRequest,
    DatasetRef,
    RewardRecipe,
)

dn = Dreadnode().configure(
    server="http://localhost:3000",
    api_key="dn_...",
    organization="acme",
    workspace="research",
)

job = dn.client.create_optimization_job(
    "acme",
    "research",
    CreateGEPAOptimizationJobRequest(
        model="openai/gpt-4o-mini",
        capability_ref=CapabilityRef(name="assistant", version="1.2.0"),
        agent_name="assistant",
        dataset_ref=DatasetRef(name="acme/support-prompts", version="train-v1"),
        val_dataset_ref=DatasetRef(name="acme/support-prompts", version="val-v1"),
        reward_recipe=RewardRecipe(name="exact_match_v1"),
        components=["instructions"],
        objective="Improve answer quality without increasing verbosity.",
    ),
)

print(job.id, job.status)

Hosted optimization jobs now report structured progress while they run:

  • for capability-agent jobs, hosted GEPA defaults reflection_lm to the job model unless you override it explicitly
  • the sandboxed SDK runtime posts optimization_start, iteration, acceptance/rejection, validation, budget, frontier, error, and completion updates back to the control plane
  • the API persists both log entries and time-series metric samples on the job record
  • the frontend can subscribe to /optimization/jobs/{job_id}/events over SSE for live updates instead of polling only terminal job state
  • the Studio Optimization route lets you submit a hosted job from the UI, then uses those persisted metrics and SSE events to render live job status, frontier movement, and runtime logs
  • the Studio submission form now loads capability choices from the org + public capability catalog, exposes searchable capability and dataset pickers, loads versions lazily after selection, and uses the org dataset registry for both train and validation dataset choices
  • the best-candidate panel now resolves the source agent markdown, shows the original and optimized instruction bodies side by side, and can promote the selected candidate into a new private capability artifact version directly from the optimization page

Evaluators can return:

  • a single float
  • a dict[str, float] of named scores
  • an OptimizationEvaluation
  • a mapping with score, scores, side_info, evaluation_result, or traces

When only named scores are returned, Dreadnode derives the scalar score from their mean before passing it to the backend.
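The mean-derivation rule can be shown in plain Python; the helper name here is ours, not an SDK symbol:

```python
def derive_scalar_score(named_scores: dict[str, float]) -> float:
    """Mirror the documented rule: when an evaluator returns only named
    scores, the backend-facing scalar is their arithmetic mean."""
    if not named_scores:
        raise ValueError("at least one named score is required")
    return sum(named_scores.values()) / len(named_scores)

print(derive_scalar_score({"accuracy": 1.0, "brevity": 0.5}))  # 0.75
```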