Tricks to Save Tokens

1. CLAUDE.md — The Most Underrated File in Claude Code

Every Claude session starts fresh. Claude has no memory of what you built yesterday, what tech stack you use, or what your project even is. So every time you start a new session without a CLAUDE.md, you're either re-explaining your project from scratch (expensive) or watching Claude hallucinate your stack (more expensive).

What I do: I keep a CLAUDE.md file in my project root. Claude reads it automatically at the start of every session. It contains: what the project is, tech stack and architecture, key files and folder structure, my workflow preferences, and things Claude should never do.

That file loads once. Context is set. I don't waste 500 tokens re-explaining myself.

The insight: CLAUDE.md isn't a config file. It's you giving Claude a persistent identity in your project. Without it, every session starts with a stranger.
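
Here is a minimal sketch of what such a file might contain, written as a shell snippet so it can be pasted into a terminal. The project description, stack, paths, and rules are all placeholders invented for illustration, and the snippet writes into a throwaway demo-project directory rather than a real repo.

```shell
#!/usr/bin/env bash
# Hypothetical example: every project detail below is a placeholder.
project_dir="demo-project"   # assumption: swap in your real project root
mkdir -p "$project_dir"

# Quoted 'EOF' keeps the shell from expanding anything inside the file body.
cat > "$project_dir/CLAUDE.md" <<'EOF'
# Project
Invoice tracker for freelancers (placeholder description).

# Tech stack
- TypeScript, Node.js, PostgreSQL

# Key files
- src/api/      HTTP handlers
- src/services/ business logic

# Workflow preferences
- Run the test suite before proposing a commit.

# Never do
- Never edit files under migrations/ without asking.
EOF

echo "Wrote $project_dir/CLAUDE.md"
```

The five headings mirror the five things listed above; keeping the file short matters, since it is loaded into every session.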

2. /clear — Your Mid-Session Reset Button

When you're deep in a session and switch to a different task, all that previous conversation context is still in the window. Claude is still 'carrying' it on every single message. Every token in your conversation history gets re-sent to the model with each new message. A long history = expensive messages.

What I do: /clear wipes the conversation history. I use it when: I finish a task and move to something unrelated, a conversation goes sideways, or I've been debugging for 30 min and the context is polluted.

It's free. It takes 2 seconds. It's the fastest way to stop paying for context you don't need.
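
To see why a long history gets expensive, here is a rough back-of-the-envelope sketch. The numbers (20 messages, about 500 tokens added per message) are made up, and it ignores prompt caching and output tokens, so treat the totals as an illustration of the growth pattern rather than a bill.

```shell
#!/usr/bin/env bash
# Hypothetical numbers: 20 messages, each adding ~500 tokens to the history.
tokens_per_turn=500
turns=20

# Total input tokens when every message re-sends the full history so far.
cumulative_input() {
  local n=$1 per=$2 total=0 history=0 i
  for ((i = 1; i <= n; i++)); do
    history=$((history + per))   # history grows by one turn
    total=$((total + history))   # the whole history is sent again
  done
  echo "$total"
}

one_long=$(cumulative_input "$turns" "$tokens_per_turn")

# Same 20 messages, but with a /clear after message 10:
# two independent runs of 10 messages each.
half=$(cumulative_input $((turns / 2)) "$tokens_per_turn")
with_clear=$((half * 2))

echo "one long session:            $one_long input tokens"    # 105000
echo "with /clear at the midpoint: $with_clear input tokens"  # 55000
```

Under these assumptions, a single reset at the midpoint roughly halves the input tokens, because the cost grows with the square of the conversation length, not linearly.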

3. Start a New Session for Every New Task

Each Claude Code session is isolated. If you're fixing a bug in one repo, then ask Claude to help you write a LinkedIn post, you're dragging all that repo context into a completely unrelated task.

What I do: New task = new Claude session. Full stop. Your CLAUDE.md handles project context automatically. There's no reason to maintain a 3-hour session just because you're scared of 'losing context.'

4. /model — Match the Model to the Task

Not every task needs Opus. Using Opus to rename a variable is like hiring a surgeon to put on a Band-Aid.

My model stack:

Quick edits, single-file changes, renaming things → haiku
Writing features, debugging, explaining code → sonnet (default)
Architectural decisions, complex multi-file planning → opus

/model haiku
/model sonnet
/model opus

The real move: Start in Haiku or Sonnet. Only escalate to Opus when the task actually needs it. Most tasks don't.
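
If you want to make the habit mechanical, the mapping above can be sketched as a tiny shell helper. pick_model is a hypothetical function of my own, not part of Claude Code, and the task categories are arbitrary labels; the commented-out launch line assumes your version of the CLI accepts these aliases with --model.

```shell
#!/usr/bin/env bash
# pick_model is a hypothetical helper, not part of Claude Code.
# It maps a rough task category to the model aliases used above.
pick_model() {
  case "$1" in
    rename|quick-edit)      echo "haiku"  ;;
    feature|debug|explain)  echo "sonnet" ;;
    architecture|planning)  echo "opus"   ;;
    *)                      echo "sonnet" ;;  # default for anything unlisted
  esac
}

pick_model rename        # prints: haiku
pick_model architecture  # prints: opus

# Assumed usage: start a session on the chosen model.
# claude --model "$(pick_model feature)"
```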

5. /compact — Compress Without Losing Context

When you're in a long session and don't want to /clear (because you still need the context), use /compact instead. Claude summarises the conversation into a condensed version and replaces the history with that summary. You keep the important context. You drop the token weight.

When I use it: Mid-session, after completing a big chunk of work, before starting the next chunk. Think of it as defragging your context window.

6. Memory Files — Stop Re-Explaining Yourself

Beyond CLAUDE.md, I use dedicated memory files for things that span across sessions and projects. Mine live in ~/.claude/memory/. Examples: primer.md (where I left off), learnings/claude-cli.md (rules from past corrections), watson.md (my EA agent's persistent memory).

At the start of each session, Claude reads these files. I never re-explain my business context, my preferences, or lessons from past sessions.

The pattern: Anything you find yourself re-explaining more than twice belongs in a memory file.
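
A minimal sketch of scaffolding that layout follows. It writes into a demo directory by default so it never touches a real ~/.claude, the file contents are placeholders, and the step that actually feeds these files to Claude (a hook, or a reference from CLAUDE.md) depends on your setup and is left out.

```shell
#!/usr/bin/env bash
# Sketch only: defaults to a demo directory instead of the real ~/.claude/memory.
memory_dir="${MEMORY_DIR:-demo-home/.claude/memory}"
mkdir -p "$memory_dir/learnings"

# primer.md: where the last session left off (placeholder content).
printf '%s\n' "# Primer" "- Last session: (fill in)" "- Next step: (fill in)" \
  > "$memory_dir/primer.md"

# learnings/claude-cli.md: rules collected from past corrections.
printf '%s\n' "# Claude CLI learnings" "- (add a rule each time you correct Claude)" \
  > "$memory_dir/learnings/claude-cli.md"

# List what was created.
find "$memory_dir" -type f | sort
```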

7. Sub-Agents — Isolate Context, Don't Pollute It

When Claude explores your codebase, searches the web, or reads a bunch of files — all of that lands in your main context window. I use the Agent tool (sub-agents) to offload work that would otherwise bloat my context. The sub-agent does the research, returns a summary, and my main window stays clean.

What goes to a sub-agent:

'Explore this codebase and tell me how the auth system works'
'Search the web and summarise the top 3 approaches to X'
'Read these 8 files and identify all the places that touch payments'

8. Specific Prompts Over Vague Ones

Vague: 'Fix the bug in my app' → Claude asks 3 clarifying questions → 600 tokens burned before any work started.

Specific: 'In src/api/auth.ts line 47, the JWT expiry isn't being checked. Fix it to verify exp on every request.' → Claude does it in one shot.

Every unnecessary back-and-forth round is tokens burned doing nothing. The more context you front-load, the less conversation you need.

9. Plan Mode (/plan) — Think Before You Act

Every time Claude jumps straight into implementation without a plan, you risk it going 400 tokens in the wrong direction and needing to redo everything.

What I do: For anything non-trivial, I use plan mode. Claude thinks through the approach, lists the steps, then I say 'go.' It costs a few tokens upfront. It saves a lot more by not burning tokens on wrong-direction work.

10. --continue / --resume — Re-Use Sessions

When you close a Claude session and come back to it, you don't always need to start fresh.

claude --continue    # continues the most recent session
claude --resume      # lets you pick which session to resume

If you're mid-task and get interrupted, this picks up where you left off with the earlier conversation restored, so you don't have to re-explain what you were doing.

11. The Auto-Compact Setting

In your Claude Code settings, you can turn on automatic context compression. When your context window hits a certain threshold, Claude compresses it automatically. You don't have to remember to /compact. It just happens.

Find it in: Settings > Auto-compact conversations

12. Don't Ask Claude to Explore — Tell It Where to Look

'Explore the codebase and understand how X works' = Claude reads everything = expensive.

'Read src/services/payment.ts and explain how the charge function works' = Claude reads one file = cheap.

The habit: Before asking Claude to understand something, take 30 seconds to identify the relevant files yourself. Point Claude directly at them. You cut 80% of the exploration cost.
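
One way to spend those 30 seconds is a plain grep before you write the prompt. The sketch below builds a small demo tree (the file names and contents are hypothetical stand-ins for a real repo), finds the files that mention a symbol, and assembles a prompt that names them.

```shell
#!/usr/bin/env bash
# Demo tree with hypothetical files so the search below has something to find.
src_dir="demo-src"
mkdir -p "$src_dir/services" "$src_dir/api"
printf '%s\n' "export function charge(amount: number) { /* ... */ }" \
  > "$src_dir/services/payment.ts"
printf '%s\n' "export function login() { /* ... */ }" \
  > "$src_dir/api/auth.ts"

# Step 1: list the files that mention the symbol you care about.
relevant_files=$(grep -rl "charge" "$src_dir" | sort)

# Step 2: build a prompt that names those files explicitly.
prompt="Read ${relevant_files} and explain how the charge function works."
echo "$prompt"
```

In a real repo the search will often return several files; trimming that list by hand before pasting it into the prompt is still far cheaper than letting Claude read everything.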

13. Opusplan

Type this inside a Claude Code session (it's a slash command, not a shell command):

/model opusplan

This has Opus handle the planning and high-level architecture, while Sonnet carries out the actions and follows Opus's plan.

My Personal Stack — What I Do at the Start of Every Session

CLAUDE.md exists in every project root (loads automatically)
Session-start hook reads primer.md (where I left off)
I pick the right model before starting — usually Sonnet, Haiku for quick stuff
I start a fresh session per task (not one mega-session per day)
I /compact mid-session when switching between major tasks
I write specific prompts with file paths and line numbers
I use /plan before anything that touches multiple files
