Hacker News

Save Claude Code Tokens with Smart Routing

11 points by FrancescoMassa ago | 3 comments

nithiink |next [-]

How do you handle prompt caching? A lot of the cost savings in a single-model chat come from cache hits on the conversation context, and switching models invalidates that cache: the new model has to reprocess everything at full input price.
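To make the trade-off concrete, here is a rough back-of-the-envelope sketch. Every number in it is a made-up assumption (the prices, the 0.1 cache-read multiplier, the token counts), and it ignores cache-write surcharges and output tokens, so treat it as an illustration of the arithmetic, not as any provider's real pricing.

```python
from typing import Optional


def turn_input_cost(
    context_tokens: int,
    new_tokens: int,
    price_per_mtok: float,
    cache_read_multiplier: Optional[float] = None,
) -> float:
    """Input cost in dollars for one turn.

    cache_read_multiplier is the fraction of the full input price paid for
    cached context tokens (hypothetical). None means a cache miss, so the
    whole context is billed at full price.
    """
    if cache_read_multiplier is None:
        billed_tokens = context_tokens + new_tokens
    else:
        billed_tokens = context_tokens * cache_read_multiplier + new_tokens
    return billed_tokens * price_per_mtok / 1_000_000


# Hypothetical scenario: 100k tokens of conversation context, 1k new tokens.
CONTEXT, NEW = 100_000, 1_000

# Staying on the pricier model with a warm cache (assumed $3 per Mtok,
# cached reads billed at 10% of the input price).
stay = turn_input_cost(CONTEXT, NEW, price_per_mtok=3.0, cache_read_multiplier=0.1)

# Routing this turn to a cheaper model (assumed $1 per Mtok) with a cold cache.
switch = turn_input_cost(CONTEXT, NEW, price_per_mtok=1.0)

print(f"stay on big model, cache hit:    ${stay:.3f}")
print(f"switch to small model, cache miss: ${switch:.3f}")
```

Under these assumed numbers, the turn on the "cheaper" model costs roughly three times as much on input as staying put, because the full context is reprocessed at full price. The picture changes with shorter contexts or if the router stays on the new model long enough to amortize the miss.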

patch_dev |previous [-]

What does this solve that well-used subagents don't already solve?

FrancescoMassa |root |parent [-]

In our tests, subagents and well-used workflows are 20-30% more expensive in terms of context and token usage.