Tasks
A task is the unit of definition for a security challenge on Dreadnode. It packages the instruction, environment, and verification rules for a challenge, but it is not itself a runtime session.
What a task contains
A task definition includes:
- An instruction for the agent
- An environment, usually defined by docker-compose.yaml
- Service and port metadata used to expose the environment safely
- Verification rules that determine whether the challenge is solved
Tasks are reusable. The same task definition can be run interactively or as part of a benchmark evaluation.
Task definitions are uploaded as OCI artifacts. The platform validates the bundled task.yaml and compose files, stores the archive, and records the task as pending. The provider-specific template or image is built lazily on first execution, then reused for later runs.
Runtime-facing task metadata
Task API responses can include runtime-facing sandbox metadata:
- sandbox_provider: which sandbox backend the task is prepared to run on
- sandbox_build_id: the catalog build record associated with the task environment, once a build exists
The sandbox build record is the authoritative place to inspect build lifecycle state such as queued, building, ready, or failed. Tasks remain the source artifact; builds represent the runnable environment derived from that artifact. Newly imported OCI tasks may not have a build record yet.
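The relationship between a task and its build record can be sketched in Python. This is a minimal illustration, not the platform's client code: the `TaskView` class and `describe_readiness` helper are hypothetical, and only the field names `sandbox_provider` and `sandbox_build_id` and the four lifecycle states come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BuildState(Enum):
    """Lifecycle states named in the text for a sandbox build record."""
    QUEUED = "queued"
    BUILDING = "building"
    READY = "ready"
    FAILED = "failed"


@dataclass
class TaskView:
    """Hypothetical subset of a task API response (not the real schema)."""
    name: str
    sandbox_provider: Optional[str] = None
    sandbox_build_id: Optional[str] = None


def describe_readiness(task: TaskView, build_state: Optional[BuildState]) -> str:
    """Summarize whether a runnable environment exists for the task."""
    if task.sandbox_build_id is None:
        # Newly imported tasks may not have a build record yet.
        return "no build yet"
    if build_state is None:
        # A build ID is present but its record has not been looked up.
        return "build state unknown"
    if build_state is BuildState.READY:
        return "ready"
    if build_state is BuildState.FAILED:
        return "failed"
    return "in progress"
```

The helper reads lifecycle state from the build record, not the task, mirroring the split described above: the task is the source artifact and the build is the derived runnable environment.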
For interactive solving, the API also exposes a task-instruction rendering endpoint. When a caller supplies a sandbox provider ID, the platform resolves service placeholders such as {{ web_url }} against that sandbox’s reachable URLs and returns the rendered instruction.
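To make the placeholder resolution concrete, here is a small Python sketch of how a template such as `{{ web_url }}` could be filled in from a mapping of service URLs. It illustrates the idea only; the platform's actual rendering logic is not shown in this page, and the `render_instruction` function and the choice to leave unknown placeholders untouched are assumptions.

```python
import re

# Matches placeholders of the form {{ name }}, with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_instruction(template: str, urls: dict) -> str:
    """Replace each known placeholder with its URL (hypothetical helper)."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        # Assumption: unknown placeholders are left as written.
        return urls.get(key, match.group(0))

    return PLACEHOLDER.sub(substitute, template)
```

For example, `render_instruction("Open {{ web_url }}", {"web_url": "http://10.0.0.5:8080"})` substitutes the address, while a template whose placeholder has no matching entry is returned unchanged.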
Instruction rendering
The tasks API supports fetching rendered instructions when you have a
running sandbox. Use GET /org/{org}/tasks/{name}/instruction and pass the
sandbox_id query parameter (the provider sandbox identifier) to resolve
template variables like {{ service_url }} with live connection details.
Without sandbox_id, the endpoint returns the raw instruction template.
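As a rough sketch, the request URL for this endpoint could be assembled as follows. The route and the `sandbox_id` parameter come from the text above; the `instruction_url` helper, the base URL, and any API prefix or authentication the real service requires are assumptions and are not covered here.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def instruction_url(
    base_url: str, org: str, name: str, sandbox_id: Optional[str] = None
) -> str:
    """Build the instruction endpoint URL (hypothetical helper).

    base_url is a placeholder for wherever the API is served.
    """
    # Percent-encode path segments so unusual names cannot alter the route.
    path = f"/org/{quote(org, safe='')}/tasks/{quote(name, safe='')}/instruction"
    url = base_url.rstrip("/") + path
    if sandbox_id is not None:
        # With sandbox_id the endpoint renders against that sandbox;
        # without it, the raw template is returned.
        url += "?" + urlencode({"sandbox_id": sandbox_id})
    return url
```

Omitting `sandbox_id` yields the URL for the raw template; passing it yields the URL for the rendered instruction.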
What tasks do not own
Tasks do not own:
- sandbox lifecycle
- attempts or sessions
- verification execution
- benchmark orchestration
- ZIP upload endpoints
Those concerns now live in the execution domains that reference the task definition.
How tasks are executed
Interactive solving
For interactive work, the platform provisions a runtime. Runtimes are exposed through the workspace-scoped runtimes API, while the underlying runtime records remain visible in the sandboxes inventory.
Automated benchmarking
For judged and repeatable runs, the platform creates an evaluation. Each evaluation item combines the task environment with a runtime sandbox, runs the agent inside that runtime, and then executes the task’s verification rules. If the task has never been built for the active sandbox provider, the first run triggers that build before the environment sandbox is provisioned.
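The order of operations for one evaluation item, including the build-on-first-run behavior, can be sketched as below. This is an illustrative model under stated assumptions, not the platform's implementation: `BuildCatalog`, `run_evaluation_item`, and the callables passed in are all hypothetical stand-ins.

```python
from typing import Callable, Dict, Tuple


class BuildCatalog:
    """Hypothetical cache of builds keyed by (task, sandbox provider)."""

    def __init__(self, build_fn: Callable[[str, str], str]) -> None:
        self._build_fn = build_fn
        self._builds: Dict[Tuple[str, str], str] = {}
        self.build_calls = 0

    def ensure_build(self, task: str, provider: str) -> str:
        """Build on first use for this provider, then reuse the result."""
        key = (task, provider)
        if key not in self._builds:
            self.build_calls += 1
            self._builds[key] = self._build_fn(task, provider)
        return self._builds[key]


def run_evaluation_item(
    catalog: BuildCatalog,
    task: str,
    provider: str,
    agent: Callable[[str], str],
    verify: Callable[[str], bool],
) -> bool:
    """Ensure a build exists, provision, run the agent, then verify."""
    build_id = catalog.ensure_build(task, provider)
    environment = f"env-from-{build_id}"  # stand-in for sandbox provisioning
    output = agent(environment)
    # The task defines what success means; the execution side applies it.
    return verify(output)
```

Running two items for the same task and provider against one catalog invokes the build function once, which matches the reuse described above.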
Verification
Verification remains part of the task definition. The task defines what success means, but the execution domain performs the actual verification step.