Hacker News
Gemini API File Search is now multimodal
FrequentLurker
pants2
Everyone thought Google was pulling ahead with Gemini 3. For a minute there they had the best language model, image model, AND video model in the world. But it's like they decided to pull over for a nap while OpenAI and Anthropic flew by.
diegoperini
comboy
It feels like many Google products, really: they are capable of amazing things, it's just that nobody there seems to care. I would guess they are optimizing more for internal use than for their vast userbase.
wilj
So I put together a plan for refactoring it, step by step, with tests, etc. After literally 8 solid days of fighting with Gemini 3 Pro, I still couldn't pull it off.
I gave GPT 5.5 a chance with the same prompt, plans, and repo. I'm not sure how long it took, but when I checked in on it a few hours later it was done. All tests passed, everything exactly how I'd asked, and better (it made some improvements).
qingcharles
sega_sai
stingraycharles
You’d think this would be fairly obvious for Google to do, but it’s probably an organizational problem rather than a technical one.
FirstPoint
trilogic
How much would you pay to have this yours forever, running locally, GDPR and HIPAA compliant, without the headache of privacy or subscriptions?
That's what we offer with HugstonOne, and we did it before Google. Multimodal, lightning-fast RAG, terabytes, not just kilobytes :)
All you need is a laptop with 32 GB of RAM and HugstonOne; it's not rocket science.