Hacker News
Show HN: ctx – Search the coding agent history already on your machine
But you do have months of full-fidelity agent transcripts stored on your machine.
A simple solution that goes a long way: ingest those transcripts and logs into a structured SQLite database, then search them with ranked text match. Everything is fully local and doesn't require anything fancy like a graph database or hosted memory service.
This is the idea behind ctx, a Rust CLI that handles the ingestion and searching.
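To make the approach concrete, here is a minimal sketch of "ingest into SQLite, then search with ranked text match". It is not ctx's actual implementation: the `messages` table, its columns, and the sample rows are hypothetical. It also assumes your Python's bundled SQLite was compiled with the FTS5 extension, which is common but not guaranteed.

```python
import sqlite3

def build_index(rows):
    """Load (session_id, role, text) tuples into an in-memory FTS5 table.

    The table name and columns are illustrative, not ctx's real schema.
    """
    db = sqlite3.connect(":memory:")
    # session_id and role are stored but not tokenized for search.
    db.execute(
        "CREATE VIRTUAL TABLE messages USING fts5("
        "session_id UNINDEXED, role UNINDEXED, text)"
    )
    db.executemany("INSERT INTO messages VALUES (?, ?, ?)", rows)
    return db

def search(db, query, limit=5):
    """Return matching rows, best match first.

    bm25() gives better matches a lower score, so ascending order
    puts the strongest hit at the top.
    """
    return db.execute(
        "SELECT session_id, role, text FROM messages "
        "WHERE messages MATCH ? ORDER BY bm25(messages) LIMIT ?",
        (query, limit),
    ).fetchall()

# Hypothetical transcript rows, loosely based on the disk-full story below.
SAMPLE = [
    ("s1", "assistant", "Test run failed: no space left on device on the runner."),
    ("s1", "user", "Run the cleanup runbook, the disk is full."),
    ("s2", "assistant", "Refactored the parser module."),
]

db = build_index(SAMPLE)
hits = search(db, "disk OR space")
for row in hits:
    print(row)
```

A real ingester would parse each agent's transcript files into rows first; that parsing step is left out here.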
We give our agents a skill that tells them to reference past sessions before working in an area. Usually we do this through an "Agent History Research Subagent" whose job is just to prepare a short brief covering any relevant history before the task begins.
A real example: sometimes our test suite runs would fail because the disk was full on the runner. The correct approach was to run the cleanup runbook, but the root cause of the failure was not clear to the agents, so they would think it was a test regression and go down the wrong rabbit hole debugging. When the agent searched history, it realized this failure had been encountered before and found the right workaround immediately. That got the agent onto the right cleanup path, and later we improved the log output so the same failure would be clearer next time. It's a boring story, but it's real agent productivity.
Another nice use case is quickly generating session transcripts for sharing. You can exclude the noisy intermediate messages, so the transcript shows the important parts of the session more cleanly. Try attaching a session transcript to your next PR so your teammate and their agent can review the provenance and prompting behind the change.
If you're up for an additional challenge, ask your agent to "exhaustively review all agent history in this repo and find where the SDLC is struggling or isn't agent-native". Using past sessions to recursively improve the agentic SDLC is a loop that we're using a lot today.
If you try it out, please let us know what you think!
wrs
luca-ctx
At first I thought the main improvement would be that the search would be faster, but rg is already pretty freakin fast when the fs cache is warm.
What really ended up being the big efficiency improvement is the token efficiency. When you structure all of the transcripts in a SQL table, the agent can retrieve exactly what is needed (such as "print me the lite transcript, without the intermediate messages").
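As a rough illustration of the "lite transcript" idea, the sketch below stores each message with a kind and filters out the tool traffic in SQL. The schema, the `kind` values, and the `lite_transcript` helper are assumptions made for the example, not ctx's real tables or commands.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical schema: one row per message, ordered by seq within a session.
db.execute(
    "CREATE TABLE messages ("
    "session_id TEXT, seq INTEGER, kind TEXT, text TEXT)"
)
db.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [
        ("s1", 1, "user", "Why is CI failing?"),
        ("s1", 2, "tool_call", "cat ci.log"),
        ("s1", 3, "tool_result", "...thousands of log lines..."),
        ("s1", 4, "assistant", "The runner disk is full; run the cleanup runbook."),
    ],
)

def lite_transcript(db, session_id):
    """Return only user and assistant messages for a session, in order."""
    return db.execute(
        "SELECT kind, text FROM messages "
        "WHERE session_id = ? AND kind IN ('user', 'assistant') "
        "ORDER BY seq",
        (session_id,),
    ).fetchall()

lite = lite_transcript(db, "s1")
for kind, text in lite:
    print(f"{kind}: {text}")
```

The token saving comes from the tool call and tool result rows never reaching the agent's context at all.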
beaugunderson
Terretta
Meanwhile, the other todo list take, one I've undertaken as well, is to cross sync all the Claude Codes across all its instances on all your machines.
There are multiple projects that claim to do this. None do it fully. (They particularly have blind spots to tools that embed a Claude Code, such as the Xcode 26.5 and Xcode 27 beta.)
So: roll one's own, and in doing that, realize that it has first class tools to make back referencing transcripts normal.
Given those tools, you don't really need an extra layer.
beaugunderson
luca-ctx
How have you enjoyed the semantic search?
beaugunderson
a couple of times I was certain that there was a session that contained some word but in reality it was in my personal claude.ai web account, so needed to add the import functionality there.
my favorite piece is the `corrections` command which surfaces all my frustrations/corrections in the last week for example... and I can then figure out if missing context would improve those scenarios going forward
luca-ctx
And yea on the import thing, there are quite a few instances when session records can live on other machines, like cloud agents, dev boxes, etc.
Do you have any interest in sharing some transcripts with team members? I'm trying to figure out the shape of this solution because people I work with often want to see what I did or fork one of my sessions, but I also don't necessarily want unlimited dumping, because I'm sure I have personal details in there too.
AM1010101
luca-ctx
Terretta
Sure they can. Just ask them. Some (like Claude Code) even have built in tools for it that work a treat. It'll happily rebuild an entire edit history diff by diff.
luca-ctx
The bigger point is that when they do go spelunking in the old session logs, it is extremely token inefficient, and you can often fill up an entire context window and force a compaction just by trying to put together a transcript or summary.
The goal here is less about doing something previously impossible and more about doing it so efficiently and cheaply that you can have agents do it very often, like before they start on every single task.
alex_hirner
However, I'm puzzled by pi support: https://github.com/ctxrs/ctx/issues/40
luca-ctx
scritty-dev
dang
Of course, it's impossible to know for sure what was LLM processed or not, but some of your posts (like this one) have been getting classified that way.
scritty-dev
meowface
dang
There's also a large fuzzy area these days where people are using tools to edit, "polish", etc., but do not think of it as using an LLM to write. This is particularly the case with non-native English speakers.
A few recent cases where this sort of thing came up:
https://news.ycombinator.com/item?id=48467726
scritty-dev
This is ridiculous if I use proper grammar and punctuation I get flagged as AI.
`if i talk like this and dont use proper syntax and convention then i come off as an unintelligent and fake version of myself i am watering down to appease whatever algorithm flagged me as ai`
https://preview.redd.it/ai-detector-flagged-a-passage-from-m...
...now I am going to speak like myself again -- AI sounding and all (oh, double dash is not em dash I've been using them for 2 decades)
This is just to highlight how ridiculous this all is and honestly off-putting to a new member. Forget non-native English speakers; all AI algorithms do is flag polished posts. I don't want to water myself down and act dumb to avoid an algorithm and tip-toe around every post I make.
scritty-dev
meowface
And for the record you were downvoted by other people long before I saw your reply.
scritty-dev
Again with these demeaning comments "you are certain" -- who exactly are you -- the arbiter of what constitutes human vs AI generated content? Yes, I am certain.
EDIT: After testing my own content on GPTZero, I am curious, is that specific platform utilized to determine if my text is AI generated?
dang
I think you may find that this "auto spell checker" is making many more changes to your text than just spelling corrections. We've encountered this sort of thing in many cases already. This is the sort of thing I was describing in my comment upthread: https://news.ycombinator.com/item?id=48779752.
meowface
It is possible you were not intentionally choosing to use an LLM to write/modify your posts, but they largely read like LLM output. The tool you're using may use an LLM and may be rewriting significant portions of your text.
malandin
luca-ctx
We considered this, but the main thing you gain from this tradeoff is some disk space and cleaner retention semantics from not having to duplicate all of the searchable text.
But you still have to do the parsing and ingestion work to build the index in the first place, so CPU time does not go away.
And you still have to store the indexes and enough metadata to map results back to the raw session files, which bounds the benefit of not duplicating the data.
The main downside is flexibility (you would lose the ability to do arbitrary SQL queries, semantic search on top of a structured corpus, etc.).
But I would love to see if I can be proven wrong on this!
sinisha_djukic
luca-ctx
Creating ground truth is an orthogonal problem - I try to work hard to put it into specs and docs and regularly update those.
Searching history is closer to "super git blame" or like looking through logs. We should expect a lot of stuff went wrong in there.
meowface
zaptheimpaler
indigodaddy
luca-ctx
The idea is that even with native recall from Shelley, ctx results are more accurate, ergonomic, and token efficient.
For example, search can retrieve a specific message and then a window of the N messages before and after it, in just a few hundred tokens.
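A sketch of what that windowing could look like over a plain SQLite table. The `messages` schema and the `window` helper are hypothetical, and the query assumes each session's messages carry a contiguous integer `seq`; this is not ctx's real query.

```python
import sqlite3

def window(db, session_id, seq, n):
    """Return the message at `seq` plus up to n messages on each side."""
    return db.execute(
        "SELECT seq, text FROM messages "
        "WHERE session_id = ? AND seq BETWEEN ? AND ? ORDER BY seq",
        (session_id, seq - n, seq + n),
    ).fetchall()

db = sqlite3.connect(":memory:")
# Hypothetical schema: seq is the message's position within its session.
db.execute("CREATE TABLE messages (session_id TEXT, seq INTEGER, text TEXT)")
db.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [("s1", i, f"message {i}") for i in range(1, 11)],
)

# Suppose a search hit landed on message 5; fetch two messages either side.
rows = window(db, "s1", seq=5, n=2)
print(rows)
```

Near the start or end of a session the window simply comes back shorter, since the out-of-range `seq` values match nothing.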