Improving Machine Learning with Human Feedback

A presentation at PyData Berlin in Berlin, Germany by Erin Mikail Staples

Internet-trained models bring with them internet-scaled biases.

Thanks to the power of Reinforcement Learning with Human Feedback (RLHF), we can now adjust for the problems that tend to come with large-scale foundational models.

In their talk at PyData Berlin 2023, Erin Mikail Staples and Nikolai Liubimov shared not only why RLHF is an excellent way to improve existing large models but also how it works.

RLHF is currently being used in the wild in projects like OpenAI and BloombergGPT to build specific use-case-driven adaptations of large foundational models.

Resources

The following resources were mentioned during the presentation or are useful additional information.