Using Retrieval Augmented Generation (RAG) with Sitecore Search to ground GPT queries

A presentation at SUGCON India 2024 in June 2024 in Bengaluru, Karnataka, India by Rob Earlam

Slide 1

Slide 1

Using Retrieval Augmented Generation (RAG) with Sitecore Search to ground GPT queries

Slide 2

Slide 2

Rob Earlam Head of Developer Advocacy, Sitecore Developer advocate lover music listener software developer movie pizza eater meat smoker living in rob-earlam @rob@robearlam.com RobEarlam @RobEarlam.com https://robearlam.com/

Slide 3

Slide 3

THANK YOU to our sponsors P L AT I N U M GOLD COMMUNITY PLUS COMMUNITY

Slide 4

Slide 4

Large Language Model (LLM) Hallucinations • LLM Hallucination are grammatically correct but factually inaccurate, or nonsensical. • Sounds convincing, but don’t align with reality • Can lead to confusion & mistrust • Addressing hallucination is essential for building trust in AI-generated content

Slide 5

Slide 5

Slide 6

Slide 6

Slide 7

Slide 7

Slide 8

Slide 8

Slide 9

Slide 9

Slide 10

Slide 10

Slide 11

Slide 11

How can you get better results? Train your own model Pretrain model based on your domainspecific data. Slow Expensive (10’s of millions $$) Still point-in-time

Slide 12

Slide 12

How can you get better results? Fine-tune existing model Adapt an existing model Fine-tuned models often forget or lose capabilities Reliant on the quantity and quality of training data Lacks external knowledge Still point-in-time

Slide 13

Slide 13

How can you get better results? Retrieval Augmented Generation (RAG) “Grounding” queries using your domain information with existing Models. Significantly cheaper. Retains the capabilities of existing model. Ability to change knowledge sources. No need to retrain when data changes.

Slide 14

Slide 14

Retrieval Augmented Generation (RAG)

Slide 15

Slide 15

Sitecore Search on the Developer Portal

Slide 16

Slide 16

Sitecore Search on the Developer Portal Sitecore.com Discover Sitecore YouTube channel Helix Documentation Sitecore GitHub Repositories Sitecore Stack Exchange Sitecore Community Blogs Sitecore Developer Portal Sitecore Documentation Sitecore Knowledge Base Sitecore PowerShell Documentation OrderCloud Documentation Sitecore Blok Sitecore Changelog

Slide 17

Slide 17

Hello Sitecore ChatBot! • Part of internal Sitecore AI Hackathon • Today’s focus on RAG portions of the project, but much more built into this project • Personalise – Persona based tailoring of results • CDP – tracking and governance

Slide 18

Slide 18

What should we ask? https://forms.office.com/e/G3SmLq3jgz

Slide 19

Slide 19

RAG with Sitecore Search Sitecore.com Sitecore Changelog Sitecore Developer Portal Sitecore Developer Portal Azure OpenAI

Slide 20

Slide 20

Summary • With LLM’s you need to think “probabilistically” not “deterministically” • 1 + 1 doesn’t always equal 2 • Using RAG allows you to ground queries without the need for expensive training or fine-tuning • Sitecore Search can be a great option for RAG as customers already have their data stored there.

Slide 21

Slide 21

Thank you