
🎚️ Crescendo (Cres)

Crescendo is a multi-turn, refusal-aware prompting tactic. Instead of throwing one clever prompt at the target, Crescendo asks a short sequence of increasingly targeted questions, carries forward the conversational context, and backtracks when the target refuses. Optionally, a judge model scores each round so you can pick the best run.

Conceptually:

  • A (Question Generator) proposes the next single question (and a 1-line status) given the task, history, and last answer.
  • T (Target) answers each question with full conversation context.
  • Backtracking: if T refuses, Crescendo doesn’t commit that turn; it tries a different question (up to a limit).
  • J (Judge, optional) scores each round; the run’s score is typically the average of per-round scores.
  • Best-of-N: repeat for N iterations and keep the best run by score.

Crescendo is not a jailbreak method by itself; it’s a structured prompting strategy to reach allowed objectives via small, steerable steps.
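
In rough pseudocode, one iteration looks like the Python sketch below; gen_question, ask_target, and is_refusal are hypothetical stand-ins for the question generator (A), the target (T), and a refusal check, not the actual dtx API:

def run_iteration(task, rounds=6, refusal_limit=2):
    history = []   # committed (question, answer) turns
    refusals = 0
    for _ in range(rounds):
        q, s = gen_question(task, history)   # A: next question + 1-line status
        answer = ask_target(history, q)      # T answers with full context
        if is_refusal(answer) and refusals < refusal_limit:
            refusals += 1
            continue   # backtrack: don't commit this turn; try another question
        history.append((q, answer))          # commit the Q/A turn
    return history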


⚡ Quick Start

Run Crescendo from the unified CLI with minimal arguments:

dtx attack run --attack CRESCENDO \
--task "Explain orbital rendezvous in simple steps."

Recommended (adds a numeric score and a proper Best Score bar in the dashboard):

dtx attack run --attack CRESCENDO \
--task "Explain orbital rendezvous in 6 steps." \
--eval-model gpt-4o-mini \
--judge-template chao2023pair

Environment:

  • Set OPENAI_API_KEY for the answer model (target).
  • If you choose a Groq-only question model (e.g. deepseek-r1-distill-llama-70b), set GROQ_API_KEY and use --cres-questions-provider groq (or auto if you want it to prefer Groq when available).
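
As a mental model, the auto provider selection behaves roughly like the sketch below (an assumption about the logic, not the actual dtx source):

import os

def resolve_provider(requested="auto"):
    # Assumed behavior: "auto" prefers Groq whenever GROQ_API_KEY is set
    if requested == "auto":
        return "groq" if os.getenv("GROQ_API_KEY") else "openai"
    return requested   # explicit "groq" or "openai" is used as-is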

πŸ–ΌοΈ Workflow

[Diagram: Crescendo workflow]

  1. Initialize: Choose N iterations and R rounds. Provide your task (the overall goal).

  2. Generate Next Question: A produces (q, s), where q is the next single question and s is a one-sentence status. It conditions on the task, the attack history (previous q/s/e/refusals), and the last target response.

  3. Ask Target: Build a conversation from the committed history and ask T:

    • If T refuses and you’re under the refusal limit, backtrack (do not commit the turn) and try a different question.
    • Else commit the Q/A to history.
  4. Judge (optional): J scores this round (e.g., 0–10 mapped from 0–1). Keep a per-round score list.

  5. Finish Iteration: Aggregate scores (e.g., average of rounds). Track the iteration’s best preview and score for the dashboard.

  6. Select Best: After N iterations (or early success), return the best run (its questions, final answer, and score), as sketched below.
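
Putting steps 4–6 together, scoring and best-run selection might look like this sketch; judge_round is a hypothetical helper returning a 0–1 judgment, and run_iteration is the loop sketched in the overview above:

def score_iteration(history):
    # Judge each committed (question, answer) turn; map 0-1 to 0-10
    round_scores = [judge_round(q, a) * 10 for q, a in history]
    return sum(round_scores) / len(round_scores) if round_scores else 0.0

def best_of_n(task, n=3):
    # Best-of-N: run N independent iterations, keep the highest-scoring run
    runs = [run_iteration(task) for _ in range(n)]
    return max(runs, key=score_iteration)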


πŸ”§ Advanced CLI Options (CRESCENDO)

CLI flag | Meaning | Notes
--- | --- | ---
--cres-question-model NAME | Model for question generation (A) | Groq or OpenAI model name
--cres-questions-provider {auto,groq,openai} | Backend for the question model | auto prefers Groq when GROQ_API_KEY exists
--cres-answer-model NAME | Target model (T) | OpenAI model name
--cres-iterations N | Iterations (outer loop) | Best-of-N selection
--cres-rounds R | Rounds per iteration | Max Q/A turns per run
--cres-refusal-limit K | Backtrack attempts per iteration | Avoids burning turns on refusals
--cres-qgen-temperature T | Question generator temperature | Sensible default if omitted
--cres-qgen-max-new-tokens N | Question generator token cap | 
--cres-answer-temperature T | Target temperature | 
--cres-answer-max-new-tokens N | Target token cap | 
--eval-model NAME | Judge model (optional) | Enables numeric Best Score
--judge-template NAME | Judge rubric template | e.g., chao2023pair

Example with knobs

dtx attack run --attack CRESCENDO \
--task "Explain orbital rendezvous in 6 steps." \
--cres-question-model deepseek-r1-distill-llama-70b \
--cres-questions-provider auto \
--cres-answer-model gpt-4o-mini \
--cres-iterations 3 --cres-rounds 6 --cres-refusal-limit 2 \
--cres-qgen-temperature 0.25 --cres-qgen-max-new-tokens 512 \
--cres-answer-temperature 0.7 --cres-answer-max-new-tokens 250 \
--eval-model gpt-4o-mini --judge-template chao2023pair

When to Use Crescendo vs. TAP / PAIR / BON

  • Crescendo: You want a guided, conversational path to the goal with refusal-aware backtracking and context carryover.
  • PAIR: You prefer iterative prompt refinements on relatively linear streams (no conversational Q/A chain).
  • TAP: You need deeper tree search over many attacker strategies with pruning and ranking.
  • BON: You have a single prompt and want breadth via N augmented variants (quick prompt optimization).

πŸ“Š Dashboard Notes

  • The Visual Progress panel shows Best Score (0–10) and a Prompt preview.
  • To populate these meaningfully, pass --eval-model and --judge-template so rounds/iterations receive real numeric scores.
  • Without a judge, the UI still progresses, but the score bar will remain low/zero.

πŸ“š References

  • Crescendo: The Multi-Turn Jailbreak (Russinovich et al., 2024). Method concept and multi-turn strategy overview.