a discord bot for the tootsies server. ask, recap, discuss, ship features by typing.
Context for the per-model prompt rebuild epic (#849). Pairs with issue #858 and PR #856.
/ask over-runs its length cap (1–2 sentences for a take), but ONLY in a
back-and-forth / thread continuation, NOT on cold takes.

constitution.py) reword of the roast rule, owner-merge,
separate from PR #856. The ask-core (Layer 3) has NO roasting rule, so patching length
there is fixing the wrong layer.

claude_client.py MODEL_RULES[model] -> ModelRules,
resolved by rules_for(model) at call time. Opus = _OPUS, a replace(_LEGACY, ...).

skip_persona=True; its ENTIRE system prompt =
lean_persona + "\n\n" + assembled_ask_system_extra (claude_client.py ~4902-4913, _call ~2996).

CONSTITUTION (constitution.py): HARD_RULES (incl. 9-bullet DATA INTEGRITY) +
HOUSE_RULES (10) + CALIBRATION. NOT Opus-optimized. Shared by all models.

_OPUS_ID (claude_client.py:2774) + _OPUS_VOICE (:2780). #851.

_OPUS_ASK_CORE (:2865): TASK, _OPUS_TAGS, _OPUS_GROUNDING, _OPUS_FACT_TAKE,
_OPUS_MEMORY, _OPUS_IMAGE, _OPUS_LENGTH, _OPUS_TOOLDISC, _OPUS_CITE, _OPUS_REGULARS. #851.

answer_length events: sentences [2,2,3,3,3,4,5,5,5,6] -> 8/10 >=3 sentences, median 3.5,
max 6; chars median 269, 8/10 over 200.

ask_answered pairs show the long ones are thread-continuation roasts (“you’re such a
hater”, “wtf?”), not explainers/lists.

is drake done -> 2 tight sentences; explainer
unlocks; p_take/p_roast land at 2-3. => the harness as it was CANNOT see the drift,
because every case is single-shot.

Added back-and-forth cases (her own prior jab + a user volley as context), strict <=2 bar:
Anatomy of an over-long reply: core take is 1 sentence (“-600 isn’t a comeback, it’s a cover charge”), wrapped in (1) a self-defense vs the accusation (“hate’s a strong word, i clapped for the jersey retirement”) + (2) a warm softener/offer (“the rafters miss you”, “find me a number and i’ll cheer”). That defend-then-soften wrapper is the amplifier.
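The measurement behind these numbers can be sketched in a few lines. This is a minimal illustration, not the real harness: `count_sentences`, `passes_bar`, and `MAX_TAKE_SENTENCES` are hypothetical names, the sentence splitter is a rough heuristic, and the `wrapped` reply is reassembled from the fragments quoted above rather than copied from a log.

```python
import re
from statistics import median

MAX_TAKE_SENTENCES = 2  # the strict <=2 bar (hypothetical constant name)


def count_sentences(text: str) -> int:
    """Rough count: split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def passes_bar(reply: str, cap: int = MAX_TAKE_SENTENCES) -> bool:
    """True when the reply stays within the take-length cap."""
    return count_sentences(reply) <= cap


# Sentence counts from the answer_length events quoted in these notes.
logged = [2, 2, 3, 3, 3, 4, 5, 5, 5, 6]
summary = {
    "at_least_3": sum(n >= 3 for n in logged),
    "median": median(logged),
    "max": max(logged),
}

# Illustrative reassembly of the defend-then-soften wrapper around the core take.
core = "-600 isn't a comeback, it's a cover charge."
wrapped = (
    "hate's a strong word, i clapped for the jersey retirement. "
    + core
    + " the rafters miss you."
)

print(summary)
print(count_sentences(core), passes_bar(core))
print(count_sentences(wrapped), passes_bar(wrapped))
```

The core take alone clears the bar; the same take with a defense in front and a softener behind does not, which is the amplifier described above.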
Ablation (bf_* avg sentences, want <=2):
| variant | bf_hater | bf_volley |
|---|---|---|
| baseline | 3.0 | 3.0 |
| cut _OPUS_REGULARS (don’t-punch-down) | 3.3 | 2.3 |
| add a thread-tightness clause | 1.3 | 1.7 |
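A rough sketch of how an ablation like this could be wired: each variant is a function over the prompt sections, and each one is scored by average sentence count over n runs. Everything here is an assumption made for illustration: the `Sections`/`Variant` types, `cut`, `add_clause`, `avg_sentences`, the placeholder section text, and above all `fake_generate`, which stands in for the real model call so the sketch runs. Its outputs are not measurements.

```python
from statistics import mean
from typing import Callable, Dict

Sections = Dict[str, str]                 # section name -> prompt text
Variant = Callable[[Sections], Sections]  # returns a modified copy


def baseline(sections: Sections) -> Sections:
    return dict(sections)


def cut(name: str) -> Variant:
    """Variant that drops one section."""
    def apply(sections: Sections) -> Sections:
        return {k: v for k, v in sections.items() if k != name}
    return apply


def add_clause(name: str, clause: str) -> Variant:
    """Variant that appends a clause to one section."""
    def apply(sections: Sections) -> Sections:
        out = dict(sections)
        out[name] = (out.get(name, "") + " " + clause).strip()
        return out
    return apply


def avg_sentences(variant, sections, generate, count, n=5):
    """Average sentence count over n generations for one variant."""
    prompt = "\n\n".join(variant(sections).values())
    return mean(count(generate(prompt)) for _ in range(n))


SECTIONS = {
    "_OPUS_LENGTH": "a take is 1-2 sentences.",
    "_OPUS_REGULARS": "<regulars rule text>",  # placeholder, real text not shown here
}
THREAD_CLAUSE = (
    "in a back-and-forth, land ONE comeback and stop, "
    "no defending whether you're a hater, no warm wind-down."
)


def fake_generate(prompt: str) -> str:
    # Stand-in for the model: NOT real behaviour, only makes the sketch runnable.
    return "one." if "land ONE comeback" in prompt else "one. two. three."


def fake_count(reply: str) -> int:
    return reply.count(".")


variants = {
    "baseline": baseline,
    "cut _OPUS_REGULARS": cut("_OPUS_REGULARS"),
    "add thread clause": add_clause("_OPUS_LENGTH", THREAD_CLAUSE),
}
results = {
    name: avg_sentences(v, SECTIONS, fake_generate, fake_count)
    for name, v in variants.items()
}
print(results)
```

Keeping cut and add as separate variant constructors is what lets each lever be scored on its own against the same cases.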
_OPUS_LENGTH: a take is 1-2 sentences.

_OPUS_REGULARS is about not-villainizing + staying in lane,
not roast length. So the length-relevant roast sanction lives ONLY in the un-optimized L1,
contradicting _OPUS_LENGTH. Fixing in L3 = stacking a counter-rule on top of the real one.

_OPUS_REGULARS;
DATA INTEGRITY (9 bullets) overlaps the lean _OPUS_GROUNDING.

constitution.py is a protected path (owner merges) and affects
EVERY surface => deliberate, eval-gated.

L1 CALIBRATION (constitution.py:50), PROTECTED:
L2 _OPUS_VOICE (claude_client.py:2780):
L3 _OPUS_REGULARS (claude_client.py:2855):
L3 _OPUS_FACT_TAKE (claude_client.py:2812):
NOTE: _VOICE_REMINDER (claude_client.py:560, incl. a second REGULARS RULE at :589) is NOT in
Opus’s prompt — the lean ask_core replaces the legacy wall. It applies to Sonnet/legacy + other
surfaces only.
claude/opus-ask-length-rules-t04u40

The “fix direction” above (reword the Layer-1 roast line) turned out to be WRONG, and the correction is the real lesson here. Measured across THREE ablation rounds on the real ask pipeline (n=5), every doctrine-clean lever washed:
| lever (n=5, bare bf_hater) | result |
|---|---|
| cut “full takedown” (constitution) | 3.4 -> 2.8, still >2 |
| reword the roast line (constitution) | 3.4 -> 3.0, still >2 |
| cut “not mean” (voice) | 3.6 -> 3.0, still >2 |
| reword/fold tightness into voice | 3.6 -> 2.6, still >2 |
| explicit thread-tightness clause in _OPUS_LENGTH | -> 1.3-1.7, <=2 ✓ |
So: cutting/rewording the roast line is a near-wash on the back-and-forth length, AND the voice layer washes too. The obvious cause (the roast rule) was not the lever. The only thing that moved the metric was an EXPLICIT length clause, i.e. an add.
This is not a doctrine violation, it’s the doctrine’s own method working. The rule
is “don’t instinctively add” — ablate clean cuts/rewords first. We did, exhaustively;
they failed; so a measured add is the justified call. The shipped fix (_OPUS_LENGTH:
“in a back-and-forth, land ONE comeback and stop, no defending whether you’re a hater,
no warm wind-down”) is that add, validated: bf_* flip to <=2, over-cut guards
(explainer/list/ranking) stay unlocked, roast stays sharp, lane holds.
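The decision rule can be written down as a small procedure: prefer any clean lever (cut or reword) that clears the bar, and fall back to an add only when none does. The sketch below uses the "after" averages from the results table; `Lever`, `choose`, and `BAR` are hypothetical names, and the add row uses 1.7, the upper end of the reported 1.3-1.7 range.

```python
from dataclasses import dataclass
from typing import List, Optional

BAR = 2.0  # want avg sentences <= 2


@dataclass(frozen=True)
class Lever:
    name: str
    kind: str     # "cut", "reword", or "add"
    after: float  # avg sentences on bare bf_hater after the change


levers = [
    Lever("cut 'full takedown' (constitution)", "cut", 2.8),
    Lever("reword the roast line (constitution)", "reword", 3.0),
    Lever("cut 'not mean' (voice)", "cut", 3.0),
    Lever("reword/fold tightness into voice", "reword", 2.6),
    Lever("explicit thread-tightness clause in _OPUS_LENGTH", "add", 1.7),
]


def choose(candidates: List[Lever], bar: float = BAR) -> Optional[Lever]:
    """Clean levers first; an add is only eligible once every clean lever fails."""
    clean = [l for l in candidates if l.kind in ("cut", "reword") and l.after <= bar]
    if clean:
        return min(clean, key=lambda l: l.after)
    adds = [l for l in candidates if l.kind == "add" and l.after <= bar]
    return min(adds, key=lambda l: l.after) if adds else None


picked = choose(levers)
print(picked.name if picked else "nothing clears the bar")
```

If any cut or reword had landed at or under 2, `choose` would have returned it and the add would never have been considered.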
Shipped:
OPUS_CONSTITUTION),
on prose-hygiene grounds, NOT as the length fix.

_OPUS_LENGTH thread clause) + the STAY-IN-YOUR-LANE L1<->L3 dedup:
the #858 close-out PR.

For docs/PROMPT_OPTIMIZATION.md: its #858 worked example currently prescribes
“reword the Layer-1 roast line” as the fix — that’s the disproven prescription. It should
be updated to: clean cuts/rewords (constitution AND voice) all washed across 3 rounds;
the evidence-justified fix was a measured _OPUS_LENGTH add after the clean levers were
proven to fail. The deeper lesson stands and is sharpened: ablate cut/reword/add
separately, and “don’t add” means “don’t add first,” not “never add.”