Improving Machine Learning with Human Feedback

A presentation at PyData Berlin in April 2023 in Berlin, Germany by Erin Mikail Staples

Slide 1

Improving Machine Learning with Human Feedback — Erin Mikail Staples + Nikolai Liubimov, PyData DE 2023

Slide 2

Erin Mikail Staples (she/her), Sr. Developer Community Advocate. Empowers the open source community through education, collaboration, and content creation.

Nikolai Liubimov (he/him), CTO. Helps customers debug and adopt Label Studio usage best practices.

Slide 3

Large Foundational Models have hit the cultural zeitgeist

Slide 4

Slide 5

We will not be creating Terminator here.

Slide 6

These large generative models are better with a human signal.

Slide 7

Why does this matter?

Slide 8

Bigger ≠ Better

Slide 9

Internet-trained models bring with them internet-scaled biases.

Slide 10

  • Biases
  • Social problems
  • Poor data quality
  • Limited applications

Slide 11

Slide 12

Power of Reinforcement Learning

Slide 13

Slide 14

Reinforcement Learning with Human Feedback helps to adjust for problems that tend to come with large-scale foundational models.

Slide 15

Reinforcement Learning: a goal-oriented approach that seeks to identify the action or sequence of actions that maximizes future rewards, and that can select the best output among a series of candidate outputs.
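As an illustration of "selecting the best output among a series of outputs", here is a minimal best-of-n sketch. The generate_candidates and reward_model callables are hypothetical stand-ins, not code from the talk.

    # Best-of-n selection sketch: generate several candidate completions and
    # keep the one a (hypothetical) reward model scores highest.
    def pick_best_output(prompt, generate_candidates, reward_model, n=4):
        candidates = generate_candidates(prompt, n=n)            # e.g. sampled LLM outputs
        scores = [reward_model(prompt, c) for c in candidates]   # human-preference scores
        best_index = max(range(n), key=lambda i: scores[i])
        return candidates[best_index], scores[best_index]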

Slide 16

Unsupervised Learning and Prompt Engineering focus on working around an existing model’s limitations without changing the model itself.
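As a rough contrast (the model name and prompt text are placeholders), prompt engineering treats the model as a frozen black box and only changes the input text:

    # Prompt-engineering sketch with Hugging Face transformers: the weights stay
    # fixed; behaviour is steered only by the text prepended to the prompt.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    instruction = "Answer politely and avoid speculation.\n\n"
    question = "What is reinforcement learning?"

    output = generator(instruction + question, max_new_tokens=60)
    print(output[0]["generated_text"])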

Slide 17

Known limitations include:
  • Harmful speech
  • Overgeneralized data
  • Out-of-date data
  • Racial, gender, and religious biases
  • Large computational resource requirements

Slide 18

Reinforcement Learning focuses on optimizing for the end goal by adapting the model itself to new and possibly uncertain information based on a human signal.
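In practice the human signal is often collected as pairwise comparisons: an annotator sees two candidate outputs for the same prompt and marks the one they prefer. A minimal sketch of such a record (field names and example texts are illustrative, not a specific Label Studio schema):

    # Sketch of a pairwise human-preference record; field names are illustrative.
    from dataclasses import dataclass

    @dataclass
    class PreferencePair:
        prompt: str
        chosen: str    # the completion the annotator preferred
        rejected: str  # the completion the annotator passed over

    feedback = [
        PreferencePair(
            prompt="Explain RLHF in one sentence.",
            chosen="RLHF fine-tunes a model using human preference judgments as the reward signal.",
            rejected="RLHF is when robots learn stuff.",
        ),
    ]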

Slide 19

With RLHF, one can align model output with one’s specific needs and reduce bias, at a fraction of the original training cost.

Slide 20

  • BLOOM
  • ChatAlpaca
  • OpenLLaMA
  • CarperAI/trlx
  • PyTorch
  • InstructGOOSE
  • Label Studio
  • Hugging Face
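As a rough sketch of how these pieces fit together, the CarperAI/trlx entry point lets you fine-tune a Hugging Face model against a reward function. The exact trlx.train() signature differs between versions, so the argument names below are assumptions for illustration:

    # RLHF fine-tuning sketch with CarperAI/trlx; argument names are assumed,
    # and the reward is a placeholder where a trained preference model would go.
    import trlx

    def reward_fn(samples, **kwargs):
        # Placeholder reward: real RLHF would score each sample with a
        # preference model trained on human comparisons.
        return [float(len(sample)) for sample in samples]

    trainer = trlx.train(
        "gpt2",                           # base model to fine-tune
        reward_fn=reward_fn,              # stands in for the human-preference signal
        prompts=["Explain RLHF simply."],
    )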

Slide 21

We’re already seeing RLHF used in the wild

Slide 22

So how did they do it?

Slide 23

Slide 24

Slide 25

Slide 26

Slide 27

The Importance of the Reward (Preference) Models
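The reward (preference) model is typically trained on pairwise human comparisons: it should score the chosen response above the rejected one. A minimal PyTorch sketch of that pairwise loss, where reward_model is a hypothetical module mapping (prompt, response) to a scalar score:

    # Pairwise preference loss (Bradley-Terry style) commonly used to train
    # RLHF reward models. `reward_model` is a hypothetical scoring module.
    import torch.nn.functional as F

    def preference_loss(reward_model, prompt, chosen, rejected):
        score_chosen = reward_model(prompt, chosen)      # scalar tensor
        score_rejected = reward_model(prompt, rejected)  # scalar tensor
        # Push the preferred response's score above the rejected one's.
        return -F.logsigmoid(score_chosen - score_rejected).mean()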

Slide 28

Preventing Unwanted Model Drift
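One common guard against unwanted drift is a KL penalty: the fine-tuned policy is penalized for moving too far from the frozen reference model, so it keeps its general language ability while optimizing the reward. A sketch, where beta and the log-probability inputs are placeholder assumptions:

    # KL-penalty sketch, as commonly used in PPO-style RLHF: subtract a
    # penalty for drifting away from the frozen reference model.
    beta = 0.02  # drift-penalty strength (assumed value)

    def penalized_reward(reward, logprobs_policy, logprobs_ref):
        # Per-sequence KL estimate on the sampled tokens (placeholder inputs:
        # per-token log-probabilities under the policy and reference models).
        kl = sum(lp - lr for lp, lr in zip(logprobs_policy, logprobs_ref))
        return reward - beta * kl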

Slide 29

Final Stages of Model Development

Slide 30

Ready for Production

Slide 31

We know what this looks like theoretically…

Slide 32

… now let’s demonstrate this in real time.

Slide 33

See it in action! https://github.com/heartexlabs/RLHF

Slide 34

Problems with RLHF

Slide 35

Humans ruin everything.

Slide 36

RLHF relies on social engineering and data integrity as much as it does technical skill.

Slide 37

Keeping annotators well-informed and motivated

Slide 38

Try out RLHF for yourself. ➡ @erinmikail @liubimovnik @labelstudioHQ community@labelstud.io https://labelstud.io/pydata-berlin