Hacker News
Biohub releases a world model of protein biology
a_bonobo
Modeling protein-protein binding is still a massively unsolved problem, mainly because we don't really have the data. AlphaFold2 was great but didn't actually 'solve' protein folding, as all input data comes from single-'state' X-ray crystallography of the proteins, not 'really' how these proteins behave in the wild. So it's still very, very hard to predict what binds to what, which of course is a multi-billion-dollar industry.
I work in a pharma-field and I wish we could easily design molecular binders. We still spend millions every year finding targets that could 'smuggle' our drugs into cells.
Some other players in this field are Boltz Lab and Isomorphic Labs (the AlphaFold Google spinoff led by Hassabis). None of them can predict anything complex or 'big'; everything is peptide-level. OP's work is another step towards something better.
The most interesting part of the preprint is that they find no matches for their designed binders in the world-wide protein database. An open question with protein designers is whether they just regurgitate training material, which is far easier to test with English-language models.
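For a rough idea of what such a "no matches" check could look like, here is a toy sketch. It is not the preprint's method: real pipelines use alignment tools such as BLAST or MMseqs2 against full databases, while this only compares shared k-mers. The function names, the k=5 default, and the toy sequences are all made up for illustration.

```python
# Toy novelty check: how much does a designed sequence overlap with known ones?
# Assumption: k-mer Jaccard similarity stands in for a real alignment search.

def kmer_set(seq, k=5):
    """Return the set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_jaccard(a, b, k=5):
    """Jaccard similarity of the two k-mer sets (0.0 = disjoint, 1.0 = identical sets)."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)

def closest_match(design, database, k=5):
    """Return (name, score) of the most similar database entry, or (None, 0.0) if nothing overlaps."""
    best_name, best_score = None, 0.0
    for name, seq in database.items():
        score = kmer_jaccard(design, seq, k)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

# Hypothetical toy "database"; real searches run against millions of sequences.
database = {
    "known_a": "MKTAYIAKQRQISFVKSHFSRQ",
    "known_b": "GSHMLEDPVDAFKQLLNERGW",
}

print(closest_match("MKTAYIAKQRQISFVKSHFSRQ", database))  # a regurgitated sequence
print(closest_match("WWWWWWWWWWWW", database))            # shares no 5-mers with either entry
```

A design that scores near zero against everything is at least not a copy at the sequence level, though it could still resemble a known fold structurally.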
bonsai_spool
Do you need to predict when AP-MS is so cheap?
Mapping interaction interfaces is challenging and is where there is attention. I don’t think we’re going to get complexes as a commercial focus outside of receptors with known quaternary structure. The first issue, as you allude to, is the absence of training data, which itself highlights the relative commercial unimportance of such an endeavor.
Frannky
marcuskane2
It's not that HN readers lack intellectual curiosity or have some character flaw or narrow worldview, it's just that few people are reading and commenting between the late hours of Saturday and early morning of Sunday. It's 6 am Sunday in California as I post this.
ethanwillis
They might like to think they are, they might try to pretend they are, but when pushed they're simply not.
Look at all of the groupthink that is perpetuated nonstop while they also proclaim they're creating, investing in, etc. so many unique ideas. Yet year after year it's the same thing in a different color.
What they are actually interested in is money and prestige. So give it a little time and they'll learn enough about biology to try to get some validation from their peers with comments. If money actually pours into bio, that is.
a_bonobo
The HN/YC crowd generally has software brain: https://www.theverge.com/podcast/917029/software-brain-ai-ba..., "when you see the whole world as a series of databases that can be controlled with the structured language of software code". Biology doesn't work like that most of the time, it's squishy and weird and unpredictable, and the models we have of biology (including genomics!) are faulty at best, misleading at worst. I've supervised PhD-students and it takes some time for people's brains to be comfortable with that squishiness, that random behaviour, that 'putting A into the system only rarely produces B and we don't really know why but we do it anyway' view of the world. Software engineers struggle, even abhor that kind of world, which is why you rarely see them being interested in it; and if they work in it, outcomes are sometimes amazing and Nobel Prize worthy, more often nonsense that silently disappears.
swasheck
interesting. i came to tech from a molecular biology background and my impression was the opposite. biology is predictable most of the time, but sometimes random and squishy. the trick is that we’re trying to learn why things work predictably and what causes the variations, and that why/how unknown is what is most uncomfortable for people outside of the disciplines.
i’m not fully disagreeing with you because it sounds like you have experiences that inform your perspective. i find it interesting because my own experiences bring me in from the inverse perspective.
Gooblebrai
It seems to me a lot of the modern "tech-bro culture" is trying to control the future and reduce uncertainty: stop death, merge with the robotic superintelligence, colonize Mars to escape Earth's inevitable decay, etc.
I'm still waiting for the startups claiming to reduce entropy or solve the false vacuum decay
mettamage
But I dare to guess that most HN’ers did high school bio and that’s it, so it’s harder to even give a small thoughtful comment on it, so they refrain.
Case in point, I wouldn’t have commented either. But I feel at home here and notice some behavioral patterns. And compared to fellow devs, I am generally more attuned to behavioral patterns because I studied psychology.
But that’s just my take.
swyx
also 3 paper coauthors walked thru it with us: https://youtu.be/4g1bURdKN0Q
all this is part of the new AI for Science effort we are spinning up at Latent Space - all guidance and support would be greatly appreciated as this is a much harder domain to cover than software
tmoertel
Okay, now you have my attention.
What's the deal on the company behind it? “Biohub is a 501(c)(3) biomedical research organization...” Nonprofit. Nifty!
This all sounds great, but as we have recently seen with, say, OpenAI, there is nonprofit and then there is nonprofit. Anyone know which kind Biohub is?
trilogic
I had a bit of fun myself fine-tuning ESM2 on domain-specific bacteria (because it gives a better score) and comparing it to another model I built myself; my own model beat it with 25% higher accuracy. Then, for the 3D structure, I coded a 3D protein visualizer hypergraph with a file-upload option that visualizes the result instantly. A 2-day job :)
RobotToaster
Huh, appears to be actually open source, that's a pleasant surprise. Usually these academic models have some weird license attached to them.
ethanwillis
a scientific engine for prediction, design, and discovery that can map proteins across the tree of life, predict their structures, and design new protein binders that function in laboratory experiments.
So, my issue with this is that, just like in a lot of the other areas of bio, we're not able to explore outside the semantics of what is "known." Even the simpler task of just doing proper assembly is plagued by this. De novo assembly of an alien/novel organism mixed with samples from other alien organisms would be impossible with what we can do today. Even with things that we're familiar with, we struggle with metagenomic assembly.