Improving Machine Learning with Human Feedback

A presentation at PyData Berlin in April 2023 in Berlin, Germany by Erin Mikail Staples

Improving Machine Learning from Human Feedback Erin Mikail Staples + Nikolai Liubimov PyData DE 2023

Erin Mikail Staples (she/her), Sr. Developer Community Advocate: empowers the open source community through education, collaboration, and content creation.
Nikolai Liubimov (he/him), CTO: helps customers debug and adopt Label Studio usage best practices.

Large Foundational Models have hit the cultural zeitgeist

We will not be creating Terminator here.

These large generative models are better with a human signal.

Internet-trained models bring with them internet-scaled biases.

biases - social problems - poor data quality - limited applications

Reinforcement Learning with Human Feedback helps to adjust for problems that tend to come with large-scale foundational models.

Reinforcement Learning: a goal-oriented approach in which a model seeks to identify the action or sequence of actions that maximizes future rewards. It is able to select the best output among a series of outputs.
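To make "maximize future rewards" concrete, here is a minimal sketch that scores a few candidate action sequences by their discounted return and picks the best one. The candidate names, reward values, and the discount factor `gamma` are hypothetical and chosen only for illustration; they do not come from the talk.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, with each later reward discounted by gamma per step."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (gamma ** step) * reward
    return total


def best_candidate(candidates, gamma=0.9):
    """Return the name of the candidate with the highest discounted return."""
    return max(candidates, key=lambda name: discounted_return(candidates[name], gamma))


# Hypothetical per-step rewards for three candidate action sequences.
candidates = {
    "greedy": [1.0, 0.0, 0.0],   # all reward up front
    "patient": [0.0, 0.0, 2.0],  # larger reward, but delayed
    "steady": [0.5, 0.5, 0.5],   # small reward at every step
}

choice = best_candidate(candidates)
print(choice)
```

With these made-up numbers the delayed but larger reward wins, which is the sense in which the model optimizes for future rather than immediate reward.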

Unsupervised Learning and Prompt Engineering focus on adapting to an existing model’s limitations.

Known limitations include:
- Harmful speech
- Overgeneralized data
- Out-of-date data
- Racial, gender, and religious biases
- Large computational resource requirements

Reinforcement Learning focuses on optimizing for the end goal by adapting the model itself to new and possibly uncertain information based on a human signal.
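As a toy illustration of "adapting the model itself" to a human signal, the sketch below treats the model as a softmax policy over three canned responses and nudges its logits toward the responses that humans scored higher. Everything here is an assumption made for illustration: the three-response setup, the `human_rewards` values, and the plain policy-gradient update. It is not the speakers' implementation, and production RLHF systems typically use more elaborate algorithms such as PPO together with a learned reward model.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def update_policy(logits, rewards, learning_rate=0.5):
    """One exact gradient-ascent step on expected reward for a softmax policy."""
    probs = softmax(logits)
    expected = sum(p * r for p, r in zip(probs, rewards))
    # For a softmax policy: d E[r] / d logit_i = p_i * (r_i - E[r])
    return [
        logit + learning_rate * p * (r - expected)
        for logit, p, r in zip(logits, probs, rewards)
    ]


# Hypothetical human scores for three canned responses (higher is better).
human_rewards = [1.0, 0.0, -1.0]

logits = [0.0, 0.0, 0.0]  # start with no preference between responses
for _ in range(50):
    logits = update_policy(logits, human_rewards)

final_probs = softmax(logits)
print(final_probs)
```

After the updates, the policy places most of its probability on the response that received the highest human score, while the underlying model parameters (here, just three logits) are what changed.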

With RLHF one can align model output with one’s specific needs while reducing bias at a fraction of the original training cost.

BLOOM - ChatAlpaca - OpenLlama - CarperAI/trlX - PyTorch - InstructGOOSE - Label Studio - Hugging Face

We’re already seeing RLHF used in the wild

The Importance of the Reward (Preference) Model
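A reward (preference) model is commonly trained on pairs of outputs where a human marked one as better. The sketch below shows one widely used objective for that setup, a pairwise logistic (Bradley-Terry style) loss. The function name `preference_loss` and the example scores are hypothetical, and this is a sketch of a common approach rather than the specific model shown in the talk.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Negative log-likelihood that the human-preferred output wins.

    Equals -log(sigmoid(reward_chosen - reward_rejected)), computed in a
    numerically stable way.
    """
    margin = reward_chosen - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical reward-model scores for one annotated pair.
agrees_with_human = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
undecided = preference_loss(reward_chosen=0.0, reward_rejected=0.0)
disagrees_with_human = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agrees_with_human, undecided, disagrees_with_human)
```

The loss is small when the reward model ranks the pair the same way the annotator did and grows when it ranks them the other way, so minimizing it pushes the model's scores toward the human preferences.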

We know what this looks like theoretically…

… now let’s demonstrate this in real time.

See it in action! https://github.com/heartexlabs/RLHF

RLHF relies on social engineering and data integrity as much as it does technical skill.

Keeping annotators well-informed and motivated

Try out RLHF for yourself. ➡ @erinmikail @liubimovnik @labelstudioHQ community@labelstud.io https://labelstud.io/pydata-berlin

Internet-trained models bring with them internet-scaled biases.

Thanks to Reinforcement Learning with Human Feedback (RLHF), we can now adjust for the problems that tend to come with large-scale foundational models.

In their talk at PyData Berlin 2023, Erin Mikail Staples and Nikolai Liubimov shared not only why RLHF is an excellent way to improve existing large models but also how it works.

RLHF is currently being used in the wild by organizations and projects such as OpenAI and BloombergGPT to build specific use-case-driven adaptations of large foundational models.

Video

Resources

The following resources were mentioned during the presentation or are useful additional information.

GitHub Repo and Demo
