Hacker News
Model Training as Code
delichon
https://youtu.be/20p5-kQXF_Q?is=72ImTNxkOEKmOXQ9
He predicts this kind of model factory will become central to organizational learning and operations. Updating and upgrading the model stack becomes the core staff function.
faangguyindia
But models did not become good at coding just because coding is replayable. It’s because there are countless repos, issues, Stack Overflow threads, and Reddit posts/comments/questions where a solution is clearly marked as “solved” or “that helped,” and AI can learn from that feedback.
Being replayable does play a role because a solution can be tested against a compiler, and the resulting errors or lack of errors/warnings can reveal whether it worked.
This becomes much harder in fields like fitness, where changes take much longer and cause and effect relationships are not straightforward to establish.
Your muscle gain increased, but was that because you increased your protein intake? Or was it because you started eating more carbs, which added more energy to the system?
Once protein needs are already met, calories may become the limiting factor. In that case, the additional gains may come primarily from increased calorie intake rather than the higher protein intake itself.
AI is bad at fitness, evidently.
Many people forget that conversations with a model also generate training data. This is how your problems, algorithms, and solutions end up in training data and then in your competitors' hands, without the competitor ever actively trying to steal your code.
I simply do not expose the core algorithms that improve my product to AI agents.
NitpickLawyer
That's a take that's at least 2 years old. Today's gains for SotA (either closed or open models) come from RLVR (reinforcement learning with verifiable rewards) 100%. The model rolls out many attempts, those attempts get verified with tests, known tests, or rubrics, and the model learns from that (GRPO or similar).
And what's cool about this (and why scale really matters now) is that you can mostly get this process automated (i.e. take a known-good repo, ask one agent to remove one feature, keep the tests, ask another model to add that feature back, verify that the old tests pass on the new implementation, repeat). This is why top labs are pulling away in the breadth of their capabilities, compared to open models. It's scale, pure and simple. And the better their models become, the larger the gap grows, because better models can automate better cases.
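To make the "roll out, verify, learn" loop concrete, here is a minimal sketch of the reward side only, in the spirit of GRPO's group-relative advantages. Everything in it is an illustrative assumption: the `verify` and `group_relative_advantages` helpers, the toy "rollouts" (plain Python functions standing in for model-generated code), and the test cases. It is not any lab's actual pipeline, and the policy update itself is left out.

```python
from statistics import mean, pstdev

def verify(candidate, tests):
    """Verifiable reward: 1.0 if the candidate passes every test, else 0.0."""
    try:
        return 1.0 if all(candidate(x) == y for x, y in tests) else 0.0
    except Exception:
        # A crashing candidate simply earns no reward.
        return 0.0

def group_relative_advantages(rewards, eps=1e-6):
    """Score each rollout relative to its own group (mean/std normalization)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical group of four rollouts for the task "implement abs()".
rollouts = [
    lambda x: x,                     # wrong for negatives
    lambda x: -x,                    # wrong for positives
    abs,                             # correct
    lambda x: x if x >= 0 else -x,   # correct
]
tests = [(3, 3), (-2, 2), (0, 0)]

rewards = [verify(c, tests) for c in rollouts]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Rollouts that pass the tests get a positive advantage and the failing ones a negative advantage, which is the signal a GRPO-style update would then push the policy toward or away from.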
mschwaig
Hermeticity always seems to mean isolation, but depending on who you ask it does not always mean computing some sort of hash over all build inputs as the 'identity' of a particular step in the pipeline, like Nix does.
If you do use hash-based identity, that hash is what you use to look up intermediate results and resume from there.
Does Savanah do that, or will it resume where it left off based on a less strict notion of identity?
I could see arguments for either approach.
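For readers unfamiliar with the hash-based approach, a rough sketch of the idea follows. It assumes a step's identity is a SHA-256 over its name, parameters, and the identities of its inputs. The `step_key` and `run_step` helpers, the in-memory `cache`, and the step names are all hypothetical; this is not how Savanah or Nix is actually implemented.

```python
import hashlib
import json

def step_key(name, params, input_keys):
    """Identity of a step: a hash over everything that can affect its output."""
    payload = json.dumps(
        {"step": name, "params": params, "inputs": list(input_keys)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

cache = {}   # hypothetical result store, keyed by step identity
calls = []   # records which steps actually executed

def run_step(name, params, input_keys, fn):
    """Run fn only if no result exists for this exact step identity."""
    key = step_key(name, params, input_keys)
    if key not in cache:
        calls.append(name)
        cache[key] = fn()
    return key, cache[key]

# First run: both steps execute.
k_tok, _ = run_step("tokenize", {"vocab": 32000}, [], lambda: "tokens")
k_train, _ = run_step("train", {"lr": 0.001}, [k_tok], lambda: "weights-a")

# Identical rerun: both steps are cache hits, so nothing executes.
k_tok2, _ = run_step("tokenize", {"vocab": 32000}, [], lambda: "tokens")
k_train2, _ = run_step("train", {"lr": 0.001}, [k_tok2], lambda: "weights-a")

# Changing one parameter changes the identity of "train" only.
k_train3, _ = run_step("train", {"lr": 0.01}, [k_tok2], lambda: "weights-b")
print(calls)
```

Because each key includes the keys of its inputs, a change upstream changes the identity of every downstream step, which is the strict notion of "resume" the question is about. A looser scheme might resume from the last completed step regardless of whether its inputs changed.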
random3
So they’ve built Savanah - a workflow engine because the existing zoo of hundreds of workflow engines didn’t cut it :)