What is ultracode in Claude Code and how is it different from setting reasoning effort to xhigh?

ultracode is a session-level setting in Claude Code, not a separate model. You activate it with /effort ultracode, which sends xhigh reasoning effort to the model and additionally enables automatic workflow orchestration. The key difference from just setting effort to xhigh: for tasks too wide to fit one context window, ultracode doesn't only think harder, it lets Claude plan a multi-agent workflow, fan the work out across parallel subagents, and save progress so interrupted runs can resume.

When did ultracode launch and which Claude model does it use?

ultracode shipped on May 28, 2026, alongside Claude Opus 4.8. It isn't a model itself; it's a Claude Code setting that runs Opus 4.8 at xhigh reasoning effort with automatic workflow orchestration layered on top. Because it's a session-level setting, you turn it on for a heavy task with /effort ultracode and drop back to high once the task is done.

Should I use ultracode or just xhigh reasoning effort for a large analysis task?

Use plain xhigh effort when the task fits in one context window and you just want deeper reasoning. Reach for ultracode when the work is too wide for a single context, since only ultracode adds automatic workflow orchestration that fans the task out across parallel subagents and resumes after interruptions. In a personal experiment mining 30 days of chat transcripts (~354K tokens) to turn recurring corrections into CLAUDE.md rules, that fan-out was what made the job tractable in one pass.

Can Claude Code run parallel subagents in a multi-agent workflow?

Yes. Claude Code supports parallel subagents, agent teams, and workflow orchestration, and it can design the orchestration itself. In this experiment Claude planned a multi-agent workflow of 18 agents total: 17 extractor subagents running in parallel, a hard barrier that waits for all 17 to finish, then one synthesizer agent that clusters and ranks the combined findings. The full run used roughly 1.37M subagent tokens and finished in about 5.5 minutes of wall-clock time.

How do you process text that is larger than the LLM context window?

When the corpus is bigger than a single context window, no one agent can read all of it before hitting a wall. The fix is a map-reduce fan-out: split the text into context-sized chunks, have many agents each read one chunk in parallel, then hand their combined findings to a single agent to synthesize. In this case ~354K tokens of chat history could not fit in one context window, so the work was split across 17 parallel extractor agents reading ~24K-token chunks each.
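The split step can be sketched in a few lines. This is a minimal illustration, not how Claude Code chunks anything internally: the `estimate_tokens` and `chunk_messages` helpers are hypothetical, the 4-characters-per-token estimate is a rough assumption, and the 24,000-token default just mirrors the chunk size mentioned above.

```python
from typing import Iterable


def estimate_tokens(text: str) -> int:
    """Rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def chunk_messages(messages: Iterable[str], budget: int = 24_000) -> list[list[str]]:
    """Greedily pack messages into chunks that stay under a token budget.

    A single message larger than the budget ends up in a chunk of its own.
    """
    chunks: list[list[str]] = []
    current: list[str] = []
    used = 0
    for message in messages:
        cost = estimate_tokens(message)
        # Start a new chunk when adding this message would exceed the budget.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(message)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Packing whole messages rather than cutting at a fixed character offset keeps each correction intact inside one chunk, so no extractor sees half a sentence.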

Where does Claude Code store session history and how can I analyze my past conversations?

Claude Code saves every session as a JSONL transcript under ~/.claude/projects/, with one file per conversation organized into per-project directories. Because these JSONL transcripts are plain logs on disk, you can parse them to analyze your own past conversations. In one experiment, 30 days of history amounted to 264 MB across 1,012 files and 140 project directories, which distilled down to 257 real conversations and 2,089 of the author's own messages.
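A minimal sketch of that parsing step follows. The record shape is an assumption, not something the post documents: it assumes each line is a JSON object, that user turns carry `"type": "user"`, and that the text sits under `message.content` as either a string or a list of text blocks. Check a few lines of your own files before relying on it. The `iter_user_messages` name is hypothetical.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_user_messages(root: Path) -> Iterator[str]:
    """Yield the text of user messages from every *.jsonl file under root.

    Assumed record shape (verify against your own files):
    {"type": "user", "message": {"content": "<text>" or [{"type": "text", "text": "..."}]}}
    """
    for path in sorted(root.rglob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip truncated or corrupt lines
                if not isinstance(record, dict) or record.get("type") != "user":
                    continue
                message = record.get("message")
                content = message.get("content") if isinstance(message, dict) else None
                if isinstance(content, str):
                    text = content
                elif isinstance(content, list):
                    # Keep only plain text blocks; other block types are not typed messages.
                    text = "\n".join(
                        block.get("text", "")
                        for block in content
                        if isinstance(block, dict) and block.get("type") == "text"
                    )
                else:
                    text = ""
                if text.strip():
                    yield text
```

Calling `iter_user_messages(Path.home() / ".claude" / "projects")` would walk every project directory. Further filtering, such as dropping automated or command-generated turns, is left out here.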

What are Claude Code CLAUDE.md best practices for stopping repeated corrections?

The most effective practice is to consolidate the corrections you give every session into one place rather than re-teaching them ad hoc. The author mined 30 days of his own Claude Code transcripts and found his recurring corrections (be concise, don't touch unasked things, answer first) were never written down, so he paid the re-teaching cost every session. Writing them as permanent rules in the global ~/.claude/CLAUDE.md grew it from 2 lines to 23 high-confidence rules across 6 sections and eliminated the repetition.

How can I customize Claude Code behavior using my own past conversations?

Your past sessions are effectively free training data: the author had Claude read 30 days of his transcripts (257 conversations, ~354K tokens), rank his recurring patterns by frequency (conciseness alone showed up 28 times), and rate each finding high, medium, or low confidence. He kept only the high-confidence findings and turned them into rules to customize Claude Code behavior, ending with 23 global rules plus 4 repo-specific ones for one project. As he put it, the transcripts were the training data; the model just needed someone to read them.

How do I stop Claude Code from repeating the same corrections every session?

The friction usually isn't that Claude fails to learn within a session, but that your preferences were never written down in one place, so you re-teach them each time. The author solved this by mining his Claude Code session transcripts (2,089 of his own typed messages over 30 days) and moving the most frequent corrections into permanent instructions in CLAUDE.md. Once a rule lives there, Claude reads it every session and you stop re-correcting the same things.

I Had Claude Mine 30 Days of My Own Messages. What It Found Was Uncomfortably Accurate.

I stopped reading Claude's replies.

Not because they were wrong. Because with Opus 4.7 and 4.8, they were just too long. I run Claude at high autonomy — I give it a task, I walk away, I come back to results. That dynamic works beautifully, right up until Claude decides to explain everything it did in four paragraphs when I needed one sentence.

So I kept correcting it. Be concise. Stop touching things I didn't ask about. Answer the question first. Over and over, session after session, project after project.

At some point I realized: these aren't task bugs. They're behavioral mismatches. And I'd been re-teaching the same corrections on repeat.

The corpus I didn't know I had

Claude Code stores every session as a JSONL transcript under ~/.claude/projects/. I knew this in the abstract. I'd never actually looked.

When I did: 264 MB across 1,012 files in 140 project directories — all from the past 30 days. After filtering out noise, that was 257 real conversations. From those, 2,089 genuine messages I had typed. My own words. ~354K tokens of corrections, preferences, and frustrations accumulated across a month of daily, heavy use.
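Numbers like these are easy to reproduce for your own history. The sketch below is one way to do it, under two assumptions: that every transcript ends in `.jsonl`, and that a file's parent folder counts as its project directory. The `corpus_stats` helper and its `max_age_days` parameter are hypothetical names, and file modification time stands in for "the past 30 days".

```python
import time
from pathlib import Path
from typing import Optional


def corpus_stats(root: Path, max_age_days: Optional[float] = None) -> dict:
    """Count transcript files, the directories that hold them, and total bytes."""
    # Optional recency filter based on each file's modification time.
    cutoff = None if max_age_days is None else time.time() - max_age_days * 86_400
    files = [
        p
        for p in root.rglob("*.jsonl")
        if p.is_file() and (cutoff is None or p.stat().st_mtime >= cutoff)
    ]
    return {
        "files": len(files),
        "project_dirs": len({p.parent for p in files}),
        "bytes": sum(p.stat().st_size for p in files),
    }
```

For example, `corpus_stats(Path.home() / ".claude" / "projects", max_age_days=30)` returns the three counts for the last month of sessions.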

The problem was obvious: the corrections were already there, written hundreds of times. I'd just never consolidated them.

What ultracode is

Ultracode is a setting in Claude Code, not a model. You activate it with /effort ultracode. What it does: it sends xhigh reasoning effort to the model and enables automatic workflow orchestration — meaning Claude can decide on its own whether a task is large enough to warrant spinning up a multi-agent workflow. It shipped on May 28, 2026, alongside Opus 4.8.

The practical difference from just setting effort to xhigh: for tasks that are too wide for a single context window, Claude doesn't just try harder — it fans the work out. It plans a workflow, shows you the phases before executing, and runs parallel subagents that each handle a slice. Progress is saved throughout, so interrupted runs can resume.

It's a session-level setting. When the heavy task is done, you drop back to high.

What ultracode actually unlocked

354K tokens of material can't live in one context window. No matter how hard a single agent tries, it hits a wall before getting through the full corpus.

I set /effort ultracode, described what I wanted — read every past conversation, find the behavioral patterns, turn the durable ones into permanent instructions — and gave one constraint: no writes until we agree on findings.

Then Claude showed me a workflow plan before doing anything. Two phases: a parallel extraction stage across 17 chunks, then a single synthesis pass over all findings. I approved it. Claude ran it.

I didn't design the orchestration. Claude did.

What it built was a clean map-reduce: 17 extractor agents running in parallel, each reading one ~24K-token chunk of my transcript history, each returning structured findings in a uniform schema. Then a hard barrier: wait for all 17, flatten their 256 raw findings, and hand them to one synthesizer agent to cluster and rank by cross-repo frequency.
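The shape of that pipeline can be sketched with plain asyncio. This is an illustration of the pattern only, not the workflow Claude generated: `map_reduce`, the `extract` and `synthesize` callables, and the `Finding` schema are all hypothetical stand-ins for whatever the subagents actually did.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed uniform schema, e.g. {"pattern": str, "repo": str, "confidence": str}.
Finding = dict


async def map_reduce(
    chunks: list[str],
    extract: Callable[[str], Awaitable[list[Finding]]],
    synthesize: Callable[[list[Finding]], Awaitable[list[Finding]]],
) -> list[Finding]:
    """Run one extractor per chunk in parallel, then a single synthesizer."""
    # asyncio.gather acts as the hard barrier: it returns only after
    # every extractor has finished.
    per_chunk = await asyncio.gather(*(extract(chunk) for chunk in chunks))
    # Flatten the per-chunk lists into one list of raw findings.
    raw = [finding for findings in per_chunk for finding in findings]
    # The reduce step sees everything at once.
    return await synthesize(raw)
```

The barrier matters because ranking by cross-repo frequency only makes sense once every chunk has reported; a synthesizer that started early would undercount patterns from slower extractors.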

The terminal showed the pipeline as it ran: each agent's status, what it was processing, when it completed. Watching your own words get analyzed at that scale is a strange experience. 18 agents total. ~1.37M subagent tokens. 5.5 minutes wall-clock.

I walked away. Came back to results.

The mirror

The synthesis opened with a profile:

Terse, high-autonomy, skeptical, with strict git/PR discipline. Hates repeating himself, hates verbosity, hates unverified "done." Profanity is a reliable frustration signal.

Reading that felt like someone had been watching over my shoulder for a month and finally handed me the notes. Every word was accurate. Uncomfortable in the way that accurate things sometimes are.

The patterns came ranked by frequency across repos. Conciseness: 28 hits. Parallelize with sub-agents: 15. No AI attribution in commits: 13. Red-to-green tests only: 13. Plan first, wait for approval: 12.

Claude had also rated each finding by confidence (high, medium, or low) and suggested taking only the high ones. I agreed. That filter turned out to be exactly right: the low and medium findings were precisely the ones I wasn't sure about, while the 23 high-confidence rules were the ones I'd been teaching Claude on repeat, project after project, for months.
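The filter-then-rank step is simple enough to sketch. This assumes each finding is a dict with hypothetical `pattern` and `confidence` keys; how Claude actually clustered similar findings into one pattern is not shown here and is not something this sketch attempts.

```python
from collections import Counter


def rank_high_confidence(findings: list[dict]) -> list[tuple[str, int]]:
    """Count high-confidence findings per pattern, most frequent first."""
    counts = Counter(
        finding["pattern"]
        for finding in findings
        if finding.get("confidence") == "high"
    )
    return counts.most_common()
```

Filtering before counting means medium and low findings never inflate a pattern's frequency, so the ranking reflects only the corrections that were unambiguous.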

What came out the other side

My ~/.claude/CLAUDE.md went from 2 lines to 23 rules across six sections: Communication, Autonomy, Git & PRs, Engineering rigor, Scope, Persistence & safety.
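For a sense of what such a file might look like, here is a small sketch that renders sectioned rules as markdown. The section names come from the list above, but the wording of each rule and its placement under a section are guesses: the post names the patterns, not the file's actual contents. `render_rules` and `EXAMPLE_SECTIONS` are hypothetical.

```python
def render_rules(sections: dict[str, list[str]]) -> str:
    """Render {section: [rules]} as markdown headings with bullet lists."""
    parts = []
    for title, rules in sections.items():
        if not rules:
            continue  # skip sections with nothing to say
        bullets = "\n".join(f"- {rule}" for rule in rules)
        parts.append(f"## {title}\n{bullets}")
    return "\n\n".join(parts) + "\n"


# Illustrative only: rule wording and section placement are assumptions.
EXAMPLE_SECTIONS = {
    "Communication": ["Be concise.", "Answer the question first."],
    "Scope": ["Do not touch things that were not asked about."],
    "Git & PRs": ["No AI attribution in commits."],
}
```

Printing `render_rules(EXAMPLE_SECTIONS)` gives text you could review and paste into a CLAUDE.md by hand, which keeps the "no writes until we agree" constraint intact.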

One project got its own additions: four repo-specific rules that only surfaced in that codebase's history. One example is "[skip ci] on all but the last PR in a batch", a pattern I'd enforced repeatedly to avoid triggering a release on every intermediate commit.

The meta-point isn't the rules themselves. The friction was never Claude failing to learn my preferences; it was that I'd never written them down in one place, so I paid the cost of re-teaching them every session.

The transcripts were the training data. The model just needed someone to read them.

What this requires

This is exactly the kind of task ultracode makes easy. Not because of the reasoning level — because of the fan-out.

354K tokens can't live in one context window. But 17 parallel agents, each reading a ~24K slice, with a synthesizer waiting at a hard barrier for all of them? That's the right shape for this problem. And it's the shape Claude designed — without being asked to.

The effort level sets the ceiling. What Claude builds inside it is still up to Claude.
