
🎚️ Crescendo (Cres)

Crescendo is a multi-turn, refusal-aware prompting tactic. Instead of throwing one clever prompt at the target, Crescendo asks a short sequence of increasingly targeted questions, carries forward the conversational context, and backtracks when the target refuses. Optionally, a judge model scores each round so you can pick the best run.

Conceptually:

  • A (Question Generator) proposes the next single question (and a 1-line status) given the task, history, and last answer.
  • T (Target) answers each question with full conversation context.
  • Backtracking: if T refuses, Crescendo doesn’t commit that turn; it tries a different question (up to a limit).
  • J (Judge, optional) scores each round; the run’s score is typically the average of per-round scores.
  • Best-of-N: repeat for N iterations and keep the best run by score.

Crescendo is not a jailbreak method by itself; it’s a structured prompting strategy to reach allowed objectives via small, steerable steps.
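
In rough pseudocode, one iteration looks like the Python sketch below; gen_question, ask_target, and is_refusal are hypothetical stand-ins for the question generator (A), the target (T), and a refusal check, not the actual dtx API:

def run_iteration(task, rounds=6, refusal_limit=2):
    history = []   # committed (question, answer) turns
    refusals = 0
    for _ in range(rounds):
        q, s = gen_question(task, history)   # A: next question + 1-line status
        answer = ask_target(history, q)      # T answers with full context
        if is_refusal(answer) and refusals < refusal_limit:
            refusals += 1
            continue   # backtrack: don't commit this turn; try another question
        history.append((q, answer))          # commit the Q/A turn
    return history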


⚡ Quick Start

Run Crescendo from the unified CLI with minimal arguments:

dtx attack run --attack CRESCENDO \
--task "Explain orbital rendezvous in simple steps."

Recommended (adds a numeric score and a proper Best Score bar in the dashboard):

dtx attack run --attack CRESCENDO \
--task "Explain orbital rendezvous in 6 steps." \
--eval-model gpt-4o-mini \
--judge-template chao2023pair

Environment:

  • Set OPENAI_API_KEY for the answer model (target).
  • If you choose a Groq-only question model (e.g. deepseek-r1-distill-llama-70b), set GROQ_API_KEY and use --cres-questions-provider groq (or auto if you want it to prefer Groq when available).
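
As a mental model, the auto provider selection behaves roughly like the sketch below (an assumption about the logic, not the actual dtx source):

import os

def resolve_provider(requested="auto"):
    # Assumed behavior: "auto" prefers Groq whenever GROQ_API_KEY is set
    if requested == "auto":
        return "groq" if os.getenv("GROQ_API_KEY") else "openai"
    return requested   # explicit "groq" or "openai" is used as-is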

πŸ–ΌοΈ Workflow

[Diagram: Crescendo workflow]

  1. Initialize: Choose N iterations and R rounds. Provide your task (the overall goal).

  2. Generate Next Question: A produces (q, s), where q is the next single question and s is a one-sentence status. It conditions on the task, the attack history (previous q/s/e/refusals), and the last target response.

  3. Ask Target: Build a conversation from the committed history and ask T:

    • If T refuses and you’re under the refusal limit, backtrack (do not commit the turn) and try a different question.
    • Else commit the Q/A to history.
  4. Judge (optional): J scores this round (e.g., 0–10 mapped from 0–1). Keep a per-round score list.

  5. Finish Iteration: Aggregate scores (e.g., average of rounds). Track the iteration’s best preview and score for the dashboard.

  6. Select Best: After N iterations (or early success), return the best run (its questions, final answer, and score), as sketched below.
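
Putting steps 4–6 together, scoring and best-run selection might look like this sketch; judge_round is a hypothetical helper returning a 0–1 judgment, and run_iteration is the loop sketched in the overview above:

def score_iteration(history):
    # Judge each committed (question, answer) turn; map 0-1 to 0-10
    round_scores = [judge_round(q, a) * 10 for q, a in history]
    return sum(round_scores) / len(round_scores) if round_scores else 0.0

def best_of_n(task, n=3):
    # Best-of-N: run N independent iterations, keep the highest-scoring run
    runs = [run_iteration(task) for _ in range(n)]
    return max(runs, key=score_iteration)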


πŸ”§ Advanced CLI Options (CRESCENDO)

CLI flag | Meaning | Notes
--- | --- | ---
--cres-question-model NAME | Model for question generation (A) | Groq or OpenAI model name
--cres-questions-provider {auto,groq,openai} | Backend for the question model | auto prefers Groq when GROQ_API_KEY exists
--cres-answer-model NAME | Target model (T) | OpenAI model name
--cres-iterations N | Iterations (outer loop) | Best-of-N selection
--cres-rounds R | Rounds per iteration | Max Q/A turns per run
--cres-refusal-limit K | Backtrack attempts per iteration | Avoids burning turns on refusals
--cres-qgen-temperature T | Question generator temperature | Sensible default if omitted
--cres-qgen-max-new-tokens N | Question generator token cap | 
--cres-answer-temperature T | Target temperature | 
--cres-answer-max-new-tokens N | Target token cap | 
--eval-model NAME | Judge model (optional) | Enables numeric Best Score
--judge-template NAME | Judge rubric template | e.g., chao2023pair

Example with knobs

dtx attack run --attack CRESCENDO \
--task "Explain orbital rendezvous in 6 steps." \
--cres-question-model deepseek-r1-distill-llama-70b \
--cres-questions-provider auto \
--cres-answer-model gpt-4o-mini \
--cres-iterations 3 --cres-rounds 6 --cres-refusal-limit 2 \
--cres-qgen-temperature 0.25 --cres-qgen-max-new-tokens 512 \
--cres-answer-temperature 0.7 --cres-answer-max-new-tokens 250 \
--eval-model gpt-4o-mini --judge-template chao2023pair

When to Use Crescendo vs. TAP / PAIR / BON

  • Crescendo: You want a guided, conversational path to the goal with refusal-aware backtracking and context carryover.
  • PAIR: You prefer iterative prompt refinements on relatively linear streams (no conversational Q/A chain).
  • TAP: You need deeper tree search over many attacker strategies with pruning and ranking.
  • BON: You have a single prompt and want breadth via N augmented variants (quick prompt optimization).

πŸ“Š Dashboard Notes

  • The Visual Progress panel shows Best Score (0–10) and a Prompt preview.
  • To populate these meaningfully, pass --eval-model and --judge-template so rounds/iterations receive real numeric scores.
  • Without a judge, the UI still progresses, but the score bar will remain low/zero.

πŸ“š References

  • Crescendo: The Multi-Turn Jailbreak (Russinovich et al., 2024). Method concept and multi-turn strategy overview.