Hacker News
Show HN: Imagent – agentic image/video/speech generation
ankurchrungoo
unliftedq
1. Existing CLI solutions are provider-specific, like the minimax CLI, the chatgpt CLI, etc. For other providers, there is no built-in CLI support at all.
2. With local CLIs or scripts, the generation results and history are not tracked. When I want to generate a similar image, I have to keep the prompts in a notebook. Now, with imagent, I can simply remix any prompt from the library.
3. A CLI is the best fit for agent automation. I can use it to generate slides, blog illustrations, website assets, etc. With it, I can even generate videos with hyperframes/remotion, complete with illustrations and speech audio. It is all done by the agent; I don't need to create the images or audio myself.
4. An agent isn't aware of the differences and limitations of different models. With maintained catalogs, the agent can discover what is available, choose the best option for its needs, and call it through a unified interface.
So the cost of video generation is not my focus at this point; what I care about is automation. If we give the agent this ability, what can it create for us automatically? Not vibe coding, but vibe creation.
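
To make point 4 a bit more concrete, here is a rough Python sketch of the catalog idea. It is not imagent's actual code or API: `ModelEntry`, `CATALOG`, `discover`, `choose`, `generate`, and every provider/model name and number in it are made up for illustration. It only shows how an agent could look up what is available, pick an option that fits, and go through one call regardless of provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """One hypothetical catalog row describing a model and its limits."""
    provider: str
    name: str
    modality: str          # "image", "video", or "speech"
    max_prompt_chars: int  # illustrative limit, not a real provider value
    cost_per_call: float   # made-up units, only used for ranking


# Hypothetical catalog; a real one would be maintained per provider.
CATALOG = [
    ModelEntry("provider-a", "img-small", "image", 1000, 0.01),
    ModelEntry("provider-b", "img-large", "image", 4000, 0.05),
    ModelEntry("provider-a", "vid-basic", "video", 500, 0.50),
    ModelEntry("provider-c", "tts-basic", "speech", 5000, 0.02),
]


def discover(modality: str) -> list[ModelEntry]:
    """Return every catalog entry that supports the given modality."""
    return [m for m in CATALOG if m.modality == modality]


def choose(modality: str, prompt: str) -> ModelEntry:
    """Pick the cheapest model whose prompt limit fits the request."""
    candidates = [
        m for m in discover(modality) if len(prompt) <= m.max_prompt_chars
    ]
    if not candidates:
        raise LookupError(f"no {modality} model accepts this prompt")
    return min(candidates, key=lambda m: m.cost_per_call)


def generate(modality: str, prompt: str) -> dict:
    """Unified entry point: same call shape for every provider.

    A real tool would call the provider's API here; this sketch only
    returns the request it would send.
    """
    model = choose(modality, prompt)
    return {
        "provider": model.provider,
        "model": model.name,
        "modality": modality,
        "prompt": prompt,
    }


if __name__ == "__main__":
    print(generate("image", "a lighthouse at dusk, flat illustration"))
```

The agent never needs to know provider details up front. It asks the catalog what exists for a modality, and the limits stored there decide which model gets the request.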