Using Retrieval Augmented Generation (RAG) with Sitecore Search to ground GPT queries
Slide 2
Rob Earlam Head of Developer Advocacy, Sitecore
Developer advocate, music listener, software developer, movie lover, pizza eater, meat smoker
living in
rob-earlam
@rob@robearlam.com
RobEarlam
@RobEarlam.com
https://robearlam.com/
Slide 3
THANK YOU to our sponsors
PLATINUM
GOLD
COMMUNITY PLUS
COMMUNITY
Slide 4
Large Language Model (LLM) Hallucinations
• LLM hallucinations are outputs that are grammatically correct but factually inaccurate or nonsensical
• They sound convincing but don’t align with reality
• They can lead to confusion and mistrust
• Addressing hallucinations is essential for building trust in AI-generated content
Slide 5
Slide 6
Slide 7
Slide 8
Slide 9
Slide 10
Slide 11
How can you get better results?
Train your own model
Pretrain a model on your domain-specific data.
Slow
Expensive (tens of millions of dollars)
Still point-in-time
Slide 12
How can you get better results?
Fine-tune an existing model
Adapt an existing model
Fine-tuned models often forget or lose capabilities
Reliant on the quantity and quality of the training data
Lacks external knowledge
Still point-in-time
Slide 13
How can you get better results?
Retrieval Augmented Generation (RAG)
“Grounding” queries using your domain information with existing models.
Significantly cheaper.
Retains the capabilities of the existing model.
Ability to change knowledge sources.
No need to retrain when data changes.
Slide 14
Retrieval Augmented Generation (RAG)
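To make the flow concrete, here is a minimal sketch of the two RAG steps: retrieve relevant content, then ground the prompt with it. Everything in it is an assumption for illustration: the in-memory DOCUMENTS list stands in for a real search index, the term-overlap scoring is a toy replacement for a real search engine's ranking, and the function names are hypothetical. It is not the implementation from the talk, and it stops before the call to the model.

```python
import re
from collections import Counter

# Hypothetical in-memory "index"; a real system would query a search service instead.
DOCUMENTS = [
    {"title": "Changelog", "text": "The changelog lists product releases and updates."},
    {"title": "Search docs", "text": "Search indexes content from many sources for retrieval."},
    {"title": "Pizza", "text": "Pizza dough needs flour, water, yeast and salt."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, documents, k=2):
    """Return up to k documents ranked by question-term overlap (toy scoring)."""
    q_terms = set(tokenize(question))
    scored = []
    for doc in documents:
        counts = Counter(tokenize(doc["text"]))
        score = sum(counts[term] for term in q_terms)
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_grounded_prompt(question, context_docs):
    """Place the retrieved text in the prompt so the model answers from it."""
    context = "\n".join(
        f"[{i + 1}] {doc['title']}: {doc['text']}"
        for i, doc in enumerate(context_docs)
    )
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


question = "What does the changelog list?"
hits = retrieve(question, DOCUMENTS)
prompt = build_grounded_prompt(question, hits)
print(prompt)
```

The resulting prompt is what would be sent to an existing model. Swapping the knowledge source only means changing what retrieve() searches; the model itself is untouched.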
Slide 15
Sitecore Search on the Developer Portal
Slide 16
Sitecore Search on the Developer Portal
Sitecore.com
Discover Sitecore YouTube channel
Helix Documentation
Sitecore GitHub Repositories
Sitecore Stack Exchange
Sitecore Community Blogs
Sitecore Developer Portal
Sitecore Documentation
Sitecore Knowledge Base
Sitecore PowerShell Documentation
OrderCloud Documentation
Sitecore Blok
Sitecore Changelog
Slide 17
Hello Sitecore ChatBot!
• Part of an internal Sitecore AI Hackathon
• Today’s focus is on the RAG portions of the project, but much more is built into it
• Personalise: persona-based tailoring of results
• CDP: tracking and governance
Slide 18
What should we ask?
https://forms.office.com/e/G3SmLq3jgz
Summary
• With LLMs you need to think “probabilistically”, not “deterministically”
• 1 + 1 doesn’t always equal 2
• Using RAG allows you to ground queries without the need for expensive training or fine-tuning
• Sitecore Search can be a great option for RAG, as customers already have their data stored there.
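The “probabilistic, not deterministic” point in the summary can be illustrated with a small sampling sketch. The candidate tokens and their scores below are made-up numbers, not taken from any real model; the sketch only shows why sampling from a probability distribution can return different answers to the same prompt.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical next-token scores for the prompt "1 + 1 =".
candidates = ["2", "two", "11"]
logits = [4.0, 2.0, 0.5]

probs = softmax(logits)

# Sample the "next token" many times; seeded so the run is repeatable.
rng = random.Random(0)
samples = rng.choices(candidates, weights=probs, k=1000)

for token in candidates:
    print(token, samples.count(token))
```

The most likely answer dominates the counts, but the less likely ones still appear. Lowering the temperature argument concentrates probability on the top candidate, which is one reason the same question can be answered more or less consistently depending on settings.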