Hacker News

Reinforcement Learning from Human Feedback

93 points by onurkanbkrc | 5 comments

dang

Related. Others?

RLHF Book - https://news.ycombinator.com/item?id=42902936 - Feb 2025 (37 comments)

verdverm

Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback. Check his socials.

leggerss

You could say he's also learning from human feedback

klelatti

Web version with links, etc:

https://rlhfbook.com/

dang

Thanks! We've switched to that above from https://arxiv.org/abs/2504.12501, and put the latter in the toptext.
