Harnessing RAG with Vector Databases

Slide 1

Harnessing RAG with Vector Databases: A Practical Guide

Slide 2

Agenda

1. Know Your Host
2. What Are Vector DBs?
3. Understanding RAG
4. Live Demo Time!

Slide 3

Know Your Host 👨‍💻

Hi there! I’m Manvendra!

• Cloud Captain @AWS Cloud Club JECRC
• Building @Rel-InDev
• 100k+ Impressions
• Technical Writer
• Developer Advocate
• Passionate about Cloud, Content, Community, and Empathy 🥑 💬 👀

I’m a DevOps and Cloud Advocate with a passion for open source and community building. I help people get into cloud. I have been involved in the DevOps and cloud landscape for over 2 years and have strong experience in public speaking, technical writing, content creation, and community management and leadership.

Slide 4

ICE BREAKER

Let’s get to know each other…

• Who’s your favorite artist?
• What’s your favorite film?
• What’s your favorite habit?
• What’s your favorite book?

Be Concise. Be Confident. Be Chill.

Slide 5

Traditional Databases: Explained by Kitty
Find me songs like the Cloud.

Slide 6

Traditional Databases: Explained by Kitty
Here’s the track ‘Clouds’ by XYZ Artist

Slide 7

I meant songs that feel like clouds … soft

Slide 8

Vector Database: Explained by Kitty
Find me songs like the Cloud.

Slide 9

Vector Database: Explained by Kitty
Here’s a playlist: lofi beats, chillhop,

Slide 10

Slide 11

What is a Vector DB?

A vector database stores and retrieves vector embeddings for fast similarity search and semantic analysis. Typical features include CRUD operations, metadata filtering, scalability, and serverless deployment options. These databases are essential for AI applications involving large language models and generative AI.

(Diagram: LLM and Vector Database)
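
To make "CRUD operations" and "metadata filtering" concrete, here is a minimal in-memory sketch. `TinyVectorStore` and its method names are hypothetical and are not the API of any real vector database. It also scans every row (brute force), where a real product would use an index.

```python
import math


class TinyVectorStore:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self):
        self._rows = {}  # item id -> (vector, metadata)

    def upsert(self, item_id, vector, metadata=None):
        # Create or update: a second upsert with the same id overwrites the row.
        self._rows[item_id] = (list(vector), dict(metadata or {}))

    def get(self, item_id):
        # Read: returns (vector, metadata) or None if the id is unknown.
        return self._rows.get(item_id)

    def delete(self, item_id):
        self._rows.pop(item_id, None)

    def query(self, vector, k=3, where=None):
        # Metadata filter first, then rank the survivors by Euclidean distance.
        where = where or {}
        scored = []
        for item_id, (vec, meta) in self._rows.items():
            if all(meta.get(key) == value for key, value in where.items()):
                scored.append((math.dist(vec, vector), item_id))
        return [item_id for _, item_id in sorted(scored)[:k]]


store = TinyVectorStore()
store.upsert("doc1", [0.0, 0.0], {"lang": "en"})
store.upsert("doc2", [1.0, 1.0], {"lang": "en"})
store.upsert("doc3", [0.1, 0.0], {"lang": "hi"})
store.upsert("doc2", [0.2, 0.0], {"lang": "en"})  # update an existing row
store.delete("doc1")

filtered = store.query([0.0, 0.0], k=2, where={"lang": "en"})
unfiltered = store.query([0.0, 0.0], k=2)
```

With the filter, only English rows are ranked. Without it, every remaining row competes on distance alone.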

Slide 12

Why do we need a Vector DB?

STATIC DATABASE
1. Stores structured data (numbers, text, rows, columns).
2. Retrieval = exact matches (SQL queries, key lookups).
3. Works great for: banking records, inventory, and user profiles.
4. Problem: can’t understand the semantics (meaning) that relate items to each other.

VECTOR DATABASE
1. Stores vector embeddings (numerical representations of meaning).
2. Retrieval = similarity search (find the “closest” vectors).
3. Works great for:
   • Semantic search (find docs with the same meaning).
   • Recommendations (similar products, songs, movies).
   • RAG for LLMs (find relevant context fast).
4. Strength: handles unstructured data (text, images, audio) by comparing meaning, not just exact words.
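
The contrast can be sketched with the song example from the earlier slides. The catalog, the three-dimensional "mood" vectors, and the query vector below are all made up for illustration. Real embeddings come from a model and have hundreds of dimensions or more.

```python
import math

# Hypothetical catalog: title -> hand-made vector (soft, energetic, dark).
songs = {
    "Clouds": [0.2, 0.9, 0.1],
    "Lofi Rain": [0.9, 0.1, 0.2],
    "Chillhop Sunrise": [0.8, 0.2, 0.1],
    "Midnight Metal": [0.1, 0.8, 0.9],
}


def exact_match(keyword):
    """Traditional lookup: titles that literally contain the keyword."""
    return [title for title in songs if keyword.lower() in title.lower()]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_vec, k=2):
    """Vector lookup: the k titles whose vectors point the same way as the query."""
    ranked = sorted(songs, key=lambda t: cosine(query_vec, songs[t]), reverse=True)
    return ranked[:k]


keyword_hits = exact_match("cloud")
soft_query = [1.0, 0.0, 0.1]  # assumed vector for "feels like clouds, soft"
semantic_hits = most_similar(soft_query)
```

The keyword lookup can only return the track whose title contains "cloud", while the similarity lookup returns the soft tracks regardless of their titles.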

Slide 13

ARCHITECTURE: HOW IT WORKS

1. Embed Data: Use an embedding model to convert content into vectors and store them in the vector DB, each linked to its original content.
2. Embed Query: Convert the user’s query into a vector using the same model.
3. Similarity Search: The database finds the vectors closest to the query vector and returns the associated original content.
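
The three steps above can be sketched end to end. `ToyEmbedder` is a hypothetical stand-in that counts words over a fixed vocabulary. It does not capture meaning the way a real embedding model does, but it shows why the data and the query must go through the same model: both vectors have to live in the same space.

```python
import math
from collections import Counter


def tokenize(text):
    return [word.strip(".,!?").lower() for word in text.split()]


class ToyEmbedder:
    """Stand-in for an embedding model: word counts over a fixed vocabulary."""

    def __init__(self, corpus):
        vocab = sorted({word for doc in corpus for word in tokenize(doc)})
        self.index = {word: i for i, word in enumerate(vocab)}

    def embed(self, text):
        vec = [0.0] * len(self.index)
        for word, count in Counter(tokenize(text)).items():
            if word in self.index:  # words outside the vocabulary are ignored
                vec[self.index[word]] = float(count)
        return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


docs = [
    "Vector databases store embeddings for similarity search.",
    "Amazon S3 is a durable object storage service.",
    "HNSW is a graph index for nearest neighbor search.",
]
embedder = ToyEmbedder(docs)

# Step 1: embed the data and keep each vector linked to its original content.
store = [(embedder.embed(doc), doc) for doc in docs]

# Step 2: embed the query with the same model.
query_vec = embedder.embed("durable storage service")

# Step 3: similarity search, returning the original content of the closest vector.
best_vec, best_doc = max(store, key=lambda item: cosine(query_vec, item[0]))
```

In a RAG setup, `best_doc` is the context that would be passed to the LLM along with the user's question.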

Slide 14

ALGORITHM: HOW DOES IT WORK?

HNSW (Hierarchical Navigable Small World) is an approximate nearest neighbor (ANN) algorithm.
• HNSW builds a hierarchical, multi-layer graph in which each node is a vector.
• Nodes connect to other nodes with similar vectors.
• The hierarchy allows both broad exploration at the sparse higher levels and fine search at the dense lower levels.
• During a query, the graph is traversed step by step toward the most relevant vectors.
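
The traversal can be illustrated with a simplified sketch. This is not a full HNSW implementation: the two layers below are wired by hand, whereas real HNSW assigns layers randomly at insert time and keeps a list of candidates during search instead of a single current node. The points, links, and function name are assumptions made for the example.

```python
import math

# Eight toy vectors on a line: node id -> coordinates.
points = {i: (float(i), 0.0) for i in range(8)}

# Hand-built layers, top first. The top layer is sparse with long-range links,
# the bottom layer contains every node with short-range links.
layers = [
    {0: [4], 4: [0, 7], 7: [4]},
    {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5, 7], 7: [6]},
]


def greedy_search(query, entry=0):
    """Walk each layer toward the query, then drop down and continue from there."""
    current = entry
    for graph in layers:
        improved = True
        while improved:
            improved = False
            for neighbor in graph[current]:
                if math.dist(points[neighbor], query) < math.dist(points[current], query):
                    current = neighbor  # move to the closer neighbor
                    improved = True
    return current


query = (5.2, 0.0)
found = greedy_search(query)
exact = min(points, key=lambda i: math.dist(points[i], query))
```

Here the top layer jumps from node 0 to node 4 in one hop, and the bottom layer refines that to node 5. On this toy graph the greedy result matches the exact nearest neighbor, but in general the search is approximate and can miss the true nearest vector.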

Slide 15

SERVERLESS VECTOR DB: HOW IT WORKS

• Serverless = Next Generation: First-generation vector DBs are fast and scalable but expensive; serverless models focus on cost-efficiency and elasticity.
• Separation of Storage & Compute: Uses partitioning to query only the relevant parts of the index, lowering cost and latency.
• Multitenancy: Supports multiple namespaces without letting rarely used data inflate costs.
• Freshness: Ensures newly inserted vectors become searchable within seconds, even at scale.

Slide 16

AWS S3 VECTORS: HOW IT WORKS

• Storage Backbone: Uses Amazon S3 as the main storage layer for vectors, making it cheap, durable, and highly scalable.
• Separation of Compute & Storage: Index data lives in S3, while compute nodes are spun up only when queries happen, which keeps it cost-efficient.
• Partitioned Search: Vectors in S3 are split into sub-indices/partitions, so queries only scan the most relevant sections.
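
The partitioned-search idea can be sketched generically. This is an illustration of the concept only, not a description of how the AWS service is implemented internally. The partition layout, the centroids, and the `n_probe` parameter are assumptions. In a storage-backed system each partition would be fetched from object storage on demand, where this sketch uses an in-memory dictionary.

```python
import math

# Hypothetical partitions: each has a centroid and the vectors assigned to it.
partitions = {
    "p0": {"centroid": (0.0, 0.0), "vectors": {"a": (0.1, 0.2), "b": (-0.3, 0.1)}},
    "p1": {"centroid": (5.0, 5.0), "vectors": {"c": (4.8, 5.1), "d": (5.5, 4.9)}},
    "p2": {"centroid": (10.0, 0.0), "vectors": {"e": (9.7, 0.4), "f": (10.2, -0.1)}},
}


def partitioned_search(query, n_probe=1, k=1):
    """Scan only the n_probe partitions whose centroids are closest to the query."""
    probed = sorted(
        partitions, key=lambda name: math.dist(partitions[name]["centroid"], query)
    )[:n_probe]
    candidates = sorted(
        (math.dist(vec, query), vec_id)
        for name in probed
        for vec_id, vec in partitions[name]["vectors"].items()
    )
    return probed, [vec_id for _, vec_id in candidates[:k]]


probed, hits = partitioned_search((5.2, 5.0))
```

Only two of the six vectors are compared against the query here. Raising `n_probe` scans more partitions, trading cost and latency for a lower chance of missing a close vector that sits in a neighboring partition.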