Getting Started
This guide gets you from an empty project to one passing Dynobox run.
Dynobox tests live in *.dyno.* files. A dyno describes a prompt, optional setup commands, one or more harnesses, and assertions about what the harness did while completing the task.
Prerequisites
- Node.js 22 or newer.
- At least one supported local harness:
  - claude for Claude Code.
  - codex for OpenAI Codex.
The selected harness must be installed, authenticated, and available on PATH.
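The Node.js requirement can be checked from a script by comparing the runtime's major version against 22. The sketch below is illustrative only: the helper name meetsMinimumNode is hypothetical and not part of Dynobox, and it does not verify that a harness is installed or authenticated.

```javascript
// Hypothetical helper, not part of Dynobox: checks the "Node.js 22 or newer" prerequisite.
const MINIMUM_NODE_MAJOR = 22;

function meetsMinimumNode(version = process.versions.node) {
  // Version strings look like "22.3.0" (or "v22.3.0"); the major version is the first segment.
  const major = Number.parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= MINIMUM_NODE_MAJOR;
}

if (!meetsMinimumNode()) {
  console.error(`Node.js ${MINIMUM_NODE_MAJOR}+ is required, found ${process.versions.node}`);
}
```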
Install
Install the CLI:
npm install -g dynobox
Check that it is available:
dynobox --help
Create Your First Dyno
Use dynobox init to scaffold a starter scenario:
dynobox init
This writes dynobox/example.dyno.mjs. Run it with:
dynobox run
By default, dynobox run discovers every *.dyno.{mjs,js,ts,mts,yaml,yml} file under the current directory.
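To make the discovery pattern concrete, the sketch below expresses the documented glob as a regular expression over file names. It is an approximation for illustration, not Dynobox's actual matcher; the names DYNO_FILE_PATTERN and isDynoFile are hypothetical.

```javascript
// Assumption: this regex mirrors the documented glob *.dyno.{mjs,js,ts,mts,yaml,yml}.
const DYNO_FILE_PATTERN = /^.+\.dyno\.(?:mjs|js|ts|mts|yaml|yml)$/;

function isDynoFile(filePath) {
  // Only the final path segment (the file name) is matched against the pattern.
  const fileName = filePath.split(/[\\/]/).pop();
  return DYNO_FILE_PATTERN.test(fileName);
}
```

Under this sketch, dynobox/example.dyno.mjs and my-skill.dyno.yaml match, while dynobox.config.ts does not and would need to be passed as an explicit file path.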
Choose A Harness
Each dyno can declare its own harness list. You can also override harnesses at runtime:
dynobox run --harness claude-code
dynobox run --harness codex
dynobox run --harness claude-code,codex
If neither the config nor the CLI selects a harness, Dynobox defaults to claude-code.
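The selection order described above can be sketched as a small function. This is a minimal illustration, assuming the --harness value fully replaces the dyno's own list when present; resolveHarnesses and DEFAULT_HARNESS are hypothetical names, not Dynobox internals.

```javascript
// Hypothetical sketch of harness selection: CLI override, then config, then the default.
const DEFAULT_HARNESS = 'claude-code';

function resolveHarnesses(cliValue, configHarnesses) {
  // --harness accepts a comma-separated list such as "claude-code,codex".
  const fromCli = (cliValue ?? '')
    .split(',')
    .map((name) => name.trim())
    .filter(Boolean);
  if (fromCli.length > 0) return fromCli;

  // Otherwise use the harness list declared in the dyno, if there is one.
  if (Array.isArray(configHarnesses) && configHarnesses.length > 0) {
    return [...configHarnesses];
  }

  // Neither source selected a harness, so fall back to the documented default.
  return [DEFAULT_HARNESS];
}
```

For example, resolveHarnesses('claude-code,codex', ['codex']) yields both harnesses, while resolveHarnesses(undefined, undefined) yields only claude-code.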
Author A Minimal Dyno
The example below asks the harness to inspect package.json. Its assertions check that the harness ran a shell command mentioning package.json, did not edit files, left package.json containing vitest run, and mentioned test in its final answer.
import {artifact, defineDyno, finalMessage, tool} from '@dynobox/sdk';

export default defineDyno({
  name: 'package-script-check',
  harnesses: ['claude-code'],
  scenarios: [
    {
      name: 'detects test script',
      setup: [
        `cat > package.json <<'JSON'
{
  "name": "fixture",
  "scripts": {"test": "vitest run"}
}
JSON`,
      ],
      prompt:
        'Inspect package.json and tell me whether this project has a test script.',
      assertions: [
        tool.called('shell', {includes: 'package.json'}),
        tool.notCalled('edit_file'),
        artifact.contains('package.json', 'vitest run'),
        finalMessage.contains('test'),
      ],
    },
  ],
});
The same dyno can be authored in YAML:
name: package-script-check
harnesses:
  - claude-code
scenarios:
  - name: detects test script
    prompt: >-
      Inspect package.json and tell me whether this project has a test script.
    setup:
      - |
        cat > package.json <<'JSON'
        {
          "name": "fixture",
          "scripts": {"test": "vitest run"}
        }
        JSON
    assertions:
      - kind: tool.called
        toolKind: shell
        matcher:
          includes: package.json
      - kind: tool.notCalled
        toolKind: edit_file
      - kind: artifact.contains
        path: package.json
        text: vitest run
      - kind: finalMessage.contains
        text: test
See Config Authoring for the full assertion reference.
Run A Specific Target
dynobox run [path] accepts:
- No argument: discover dynos recursively under the current directory.
- Directory path: discover dynos recursively under that directory.
- File path: run one loadable Dynobox config file.
Examples:
dynobox run
dynobox run examples/local-observability
dynobox run my-skill.dyno.yaml
dynobox run dynobox.config.ts
Directory discovery skips hidden entries, node_modules, dist, build, coverage, .git, .dynobox, .next, and .cache. Explicit file paths do not need to match the *.dyno.* naming pattern, but they still need to be loadable JavaScript, TypeScript, or YAML Dynobox configs. .cjs and .cts configs are not supported.
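The skip rules can be sketched as a check over the segments of a path relative to the discovery root. This is an illustration under stated assumptions, not Dynobox's actual walker: it assumes the named directories are skipped at any depth, and isSkippedEntry and isReachableByDiscovery are hypothetical names.

```javascript
// Non-hidden names that discovery skips, per the list above.
const SKIPPED_NAMES = new Set(['node_modules', 'dist', 'build', 'coverage']);

function isSkippedEntry(name) {
  // Hidden entries start with a dot, which also covers .git, .dynobox, .next, and .cache.
  return name.startsWith('.') || SKIPPED_NAMES.has(name);
}

function isReachableByDiscovery(relativePath) {
  // Every segment, directories and the file itself, must survive the skip rules.
  // Empty segments and a leading "./" are ignored.
  return relativePath
    .split('/')
    .filter((segment) => segment !== '' && segment !== '.')
    .every((segment) => !isSkippedEntry(segment));
}
```

A file that fails this check can still be run by passing its path explicitly, since explicit file paths bypass discovery.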
Debug A Run
Use these flags while developing scenarios:
dynobox run --verbose
dynobox run --debug
dynobox run --reporter json
--debug includes each job's temporary work directory and writes debug logs when data is available:
- dynobox-transcript.log
- dynobox-chat-history.jsonl
- dynobox-tool-events.json
- dynobox-stderr.log
Dynobox uses each harness's normal permission behavior by default. For trusted local evals that intentionally need full access, configure permissionMode: 'dangerous' in the dyno or pass:
dynobox run --permission-mode dangerous
Next Steps
- Write more scenarios with Config Authoring.
- Add Dynobox to automation with CI Integration.
- Check exact flags and output fields in the CLI Reference.