Optimization

optimize_anything is the Dreadnode-native frontend for text optimization. The first backend is GEPA, but the public surface stays in the SDK.

import dreadnode as dn
from dreadnode.optimization import EngineConfig, OptimizationConfig

def evaluate(candidate: str, example: dict[str, str]) -> float:
    return 1.0 if example["expected"] in candidate else 0.0

optimization = dn.optimize_anything(
    seed_candidate="Answer the question directly.",
    dataset=[
        {"question": "What is Dreadnode?", "expected": "Dreadnode"},
        {"question": "What is GEPA?", "expected": "GEPA"},
    ],
    valset=[
        {"question": "Name the SDK.", "expected": "Dreadnode"},
    ],
    objective="Improve a short answer prompt for factual responses.",
    evaluator=evaluate,
    config=OptimizationConfig(
        engine=EngineConfig(max_metric_calls=50),
    ),
)

result: dn.OptimizationResult = await optimization.run()
print(result.best_score, result.best_candidate)
  • Optimization is the executor returned by dn.optimize_anything(...).
  • OptimizationConfig holds backend-neutral engine, reflection, merge, refiner, and tracking settings.
  • OptimizationStart, OptimizationError, and OptimizationEnd are the initial lifecycle events.
  • OptimizationEvaluation lets evaluators return a score plus side information and traces.
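For example, an evaluator can attach side information alongside its score by returning a plain mapping, one of the accepted return shapes. This is a minimal sketch using only the mapping keys documented here; it does not construct an `OptimizationEvaluation` directly:

```python
def evaluate(candidate: str, example: dict[str, str]) -> dict:
    """Return a score plus side information as a plain mapping,
    one of the accepted evaluator return shapes."""
    hit = example["expected"] in candidate
    return {
        "score": 1.0 if hit else 0.0,
        "side_info": {"matched": hit, "candidate_length": len(candidate)},
    }

result = evaluate("Dreadnode is an SDK.", {"expected": "Dreadnode"})
print(result["score"], result["side_info"]["matched"])  # 1.0 True
```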

Use an adapter when you need GEPA’s full callback lifecycle, batched evaluation, or reflective datasets for agents and other multi-component systems.

Current adapter-mode support in Dreadnode:

  • Supported: reflection config, merge config, tracking config, stop_callbacks, and core engine settings such as max_metric_calls, seed, track_best_outputs, frontier_type, and val_evaluation_policy
  • Not yet supported: background and refiner configs, plus any adapter-mode engine knobs that GEPA’s api.optimize(...) does not expose in the current package surface
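Assuming the setting names above map one-to-one onto `EngineConfig` keyword arguments (a sketch, not verified against the package; the `frontier_type` and `val_evaluation_policy` values shown are hypothetical), a fully specified supported configuration might look like:

```python
from dreadnode.optimization import EngineConfig, OptimizationConfig

# Sketch only: field names are taken from the supported-settings list above
# and assumed to be keyword arguments on EngineConfig.
config = OptimizationConfig(
    engine=EngineConfig(
        max_metric_calls=100,
        seed=42,
        track_best_outputs=True,
        frontier_type="pareto",          # hypothetical value
        val_evaluation_policy="always",  # hypothetical value
    ),
    # reflection=..., merge=..., tracking=..., stop_callbacks=[...]
)
```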

Adapter-backed streams can emit IterationStart, CandidateAccepted, CandidateRejected, ValsetEvaluated, BudgetUpdated, ParetoFrontUpdated, and OptimizationEnd.
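A stream consumer can branch on the event type. The sketch below is pure Python and assumes only the event names listed above; the dict-shaped events stand in for the SDK's typed event objects:

```python
# Illustrative dispatcher over the adapter-mode event names listed above.
# Real streams emit typed event objects; dicts are used here as stand-ins.
HANDLED_EVENTS = {
    "IterationStart",
    "CandidateAccepted",
    "CandidateRejected",
    "ValsetEvaluated",
    "BudgetUpdated",
    "ParetoFrontUpdated",
    "OptimizationEnd",
}

def summarize(events: list[dict]) -> dict[str, int]:
    """Count how often each known event type appears in a stream."""
    counts = {name: 0 for name in HANDLED_EVENTS}
    for event in events:
        name = event.get("type")
        if name in counts:
            counts[name] += 1
    return counts

stream = [
    {"type": "IterationStart"},
    {"type": "CandidateAccepted"},
    {"type": "OptimizationEnd"},
]
print(summarize(stream)["CandidateAccepted"])  # 1
```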

Hosted optimization is a separate control-plane path from local optimize_anything(...) runs. V1 is intentionally narrow:

  • backend: gepa
  • target: capability_agent
  • optimizable components: ["instructions"]
  • reward input: declarative RewardRecipe

Use the API client when you want the platform to provision the runtime and execute the job:

from dreadnode import Dreadnode
from dreadnode.app.api.models import (
    CapabilityRef,
    CreateGEPAOptimizationJobRequest,
    DatasetRef,
    RewardRecipe,
)

dn = Dreadnode().configure(
    server="http://localhost:3000",
    api_key="dn_...",
    organization="acme",
    workspace="research",
)

job = dn.client.create_optimization_job(
    "acme",
    "research",
    CreateGEPAOptimizationJobRequest(
        model="openai/gpt-4o-mini",
        capability_ref=CapabilityRef(name="assistant", version="1.2.0"),
        agent_name="assistant",
        dataset_ref=DatasetRef(name="acme/support-prompts", version="train-v1"),
        val_dataset_ref=DatasetRef(name="acme/support-prompts", version="val-v1"),
        reward_recipe=RewardRecipe(name="exact_match_v1"),
        components=["instructions"],
        objective="Improve answer quality without increasing verbosity.",
    ),
)

print(job.id, job.status)

Hosted optimization jobs now report structured progress while they run:

  • for capability-agent jobs, hosted GEPA defaults reflection_lm to the job model unless you override it explicitly
  • the sandboxed SDK runtime posts optimization_start, iteration, acceptance/rejection, validation, budget, frontier, error, and completion updates back to the control plane
  • the API persists both log entries and time-series metric samples on the job record
  • the frontend can subscribe to /optimization/jobs/{job_id}/events over SSE for live updates instead of polling only terminal job state
  • the Studio Optimization route lets you submit a hosted job from the UI, then uses those persisted metrics and SSE events to render live job status, frontier movement, and runtime logs
  • the Studio submission form now loads capability choices from the org + public capability catalog, exposes searchable capability and dataset pickers, loads versions lazily after selection, and uses the org dataset registry for both train and validation dataset choices
  • the best-candidate panel now resolves the source agent markdown, shows the original and optimized instruction bodies side by side, and can promote the selected candidate into a new private capability artifact version directly from the optimization page

Evaluators can return:

  • a single float
  • a dict[str, float] of named scores
  • an OptimizationEvaluation
  • a mapping with score, scores, side_info, evaluation_result, or traces

When only named scores are returned, Dreadnode derives the scalar score from their mean before passing it to the backend.
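The mean-derivation rule can be shown in plain Python; the helper name here is ours, not an SDK symbol:

```python
def derive_scalar_score(named_scores: dict[str, float]) -> float:
    """Mirror the documented rule: when an evaluator returns only named
    scores, the backend-facing scalar is their arithmetic mean."""
    if not named_scores:
        raise ValueError("at least one named score is required")
    return sum(named_scores.values()) / len(named_scores)

print(derive_scalar_score({"accuracy": 1.0, "brevity": 0.5}))  # 0.75
```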