Scaling ML embedding models to serve a billion queries

A presentation at AI DevWorld 2022 in October 2022 in San Jose, CA, USA by Senthilkumar Gopal

Scaling embedding models to serve a billion queries Senthilkumar Gopal @sengopal

Journey of a Query @ eBay 2 © 2022 eBay. All rights reserved.

Search @ eBay: How can we discover items without describing them? This is a problem across many domains where search is a core functionality. A question to ponder: can we provide users with the ability to “discover” through visual cues instead? https://unsplash.com/photos/2oUiUu5QAys

Current Search Experience: Nice Kilim Pillow for my couch! Is this a kilim pillow? Or an orange kilim pillow? Perhaps an orange throw kilim pillow?

k nearest neighbours search - A thought experiment: Let’s represent an item TITLE as a 2-dimensional vector.
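The thought experiment can be sketched in a few lines of Python. This is a minimal illustration only: the item titles, their 2-dimensional coordinates, and the `k_nearest` helper are made up for this example and do not come from the talk.

```python
import math

# Hypothetical item titles mapped to made-up 2-dimensional vectors.
ITEMS = {
    "orange kilim pillow": (0.9, 0.8),
    "kilim throw pillow": (0.7, 0.9),
    "leather sofa": (0.1, 0.7),
    "running shoes": (0.2, 0.1),
}

def k_nearest(query, items, k):
    """Return the k titles whose vectors are closest to `query` (Euclidean distance)."""
    ranked = sorted(items, key=lambda title: math.dist(query, items[title]))
    return ranked[:k]

# A query vector that lands near the two pillow titles.
neighbours = k_nearest((0.85, 0.85), ITEMS, k=2)
print(neighbours)
```

Every query here is compared against every item, which is fine for four titles but is exactly the exhaustive search that stops scaling later in the talk.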

So what is an embedding then? Represents Semantic Similarity: Similarity (sofa, couch). A real word example [R^768]. https://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html
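One common way to score Similarity (sofa, couch) is cosine similarity between the two embedding vectors. The sketch below assumes cosine similarity as the measure, which the slide does not specify, and uses made-up 4-dimensional vectors in place of real 768-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors for illustration; real embeddings would have 768 dimensions.
sofa = [0.8, 0.1, 0.6, 0.2]
couch = [0.7, 0.2, 0.7, 0.1]
shoe = [0.1, 0.9, 0.0, 0.8]

sim_sofa_couch = cosine_similarity(sofa, couch)  # close to 1: similar meaning
sim_sofa_shoe = cosine_similarity(sofa, shoe)    # much lower: unrelated meaning
print(round(sim_sofa_couch, 3), round(sim_sofa_shoe, 3))
```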

So what is an embedding then? “You shall know a word by the company it keeps” - (Firth, J. R. 1957:11)
Large Language Models - GPT-3 [175 B parameters]
○ 45 TB text data - Wikipedia and books
Neural network learns word associations from a large corpus.
○ Detects synonymous words.
○ Suggests words for a partial sentence.
https://en.wikipedia.org/wiki/Word2vec https://ruder.io/word-embeddings-1/

So what is an embedding then? books, numbers, companies http://web.stanford.edu/class/cs224n/slides/cs224n-2022-lecture02-wordvecs2.pdf

So what is an embedding then? http://web.stanford.edu/class/cs224n/slides/cs224n-2022-lecture02-wordvecs2.pdf

What about an image? https://en.wikipedia.org/wiki/Convolutional_neural_network

Model Architecture: Multiple Modalities - Inspired by CLIP * https://openai.com/blog/clip/

How do we “learn” an embedding? Text Encoder. Image Encoder. R^768: that’s what an embedding looks like! Devlin, Jacob, et al. “BERT: Pre-training of deep bidirectional transformers for language understanding.” arXiv preprint arXiv:1810.04805 (2018). https://en.wikipedia.org/wiki/Convolutional_neural_network

Why do we need ANN? All problems start with SCALE. Exhaustive search, curse of dimensionality. ANN: Approximate Nearest Neighbours. https://en.wikipedia.org/wiki/Proximity_analysis#/media/File:Euclidean_Voronoi_diagram.svg http://ann-benchmarks.com/index.html#algorithms
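To make the trade-off concrete, here is a minimal sketch contrasting exhaustive search with one simple ANN technique, random-hyperplane hashing. The technique is chosen only for brevity; the talk does not say which ANN algorithm eBay uses, and the sizes, seed, and function names below are assumptions for illustration.

```python
import random
from collections import defaultdict

DIM, NUM_ITEMS, NUM_PLANES = 16, 2000, 6
rng = random.Random(0)

def random_vector():
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

items = [random_vector() for _ in range(NUM_ITEMS)]
planes = [random_vector() for _ in range(NUM_PLANES)]

def signature(vec):
    # One bit per hyperplane: which side of the plane the vector falls on.
    return tuple(dot(vec, plane) >= 0 for plane in planes)

# Index step: group item indices by their hash signature.
buckets = defaultdict(list)
for idx, vec in enumerate(items):
    buckets[signature(vec)].append(idx)

def exhaustive_search(query):
    # Compares the query against every item.
    return min(range(NUM_ITEMS), key=lambda i: squared_distance(query, items[i]))

def approximate_search(query):
    # Compares the query only against items sharing its signature.
    candidates = buckets.get(signature(query), [])
    if not candidates:
        return None, 0
    best = min(candidates, key=lambda i: squared_distance(query, items[i]))
    return best, len(candidates)

# A query that is a slightly perturbed copy of item 42.
query = [x + rng.gauss(0.0, 0.01) for x in items[42]]
exact = exhaustive_search(query)
approx, scanned = approximate_search(query)
print(exact, approx, scanned, NUM_ITEMS)
```

The approximate search scans one bucket instead of the whole collection, at the cost of possibly missing the true neighbour when the query falls on the other side of a hyperplane. Production ANN indexes manage that recall trade-off far more carefully than this sketch does.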

Combining the elements together

How does this function? Display all inventory matching my visual appeal. Quickly pivot to the entire inventory using a visual-first cue. Is this an orange throw kilim pillow?

Design: Trained model

Data Ingestion

Data Ingestion Challenges
Speed vs. resource trade-off
Storage
Download errors
Downstream dependencies
https://unsplash.com/photos/wrrgZwI7qOY
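For the download errors in particular, a common mitigation is to retry with exponential backoff. The sketch below is a generic pattern, not the pipeline described in the talk; `DownloadError`, `download_with_retries`, the fake fetcher, and the example URL are all hypothetical.

```python
import time

class DownloadError(Exception):
    """Raised by the hypothetical fetcher when a download fails."""

def download_with_retries(fetch, url, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff on DownloadError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except DownloadError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            sleep(base_delay * 2 ** (attempt - 1))

# Usage with a fake fetcher that fails twice before succeeding.
calls = {"count": 0}

def flaky_fetch(url):
    calls["count"] += 1
    if calls["count"] < 3:
        raise DownloadError(url)
    return b"image-bytes"

# Record the delays instead of sleeping so the example runs instantly.
delays = []
payload = download_with_retries(flaky_fetch, "https://example.com/item.jpg", sleep=delays.append)
print(payload, delays)
```

Longer delays trade ingestion speed for fewer wasted requests, which is one face of the speed vs. resource trade-off listed above.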

ML Platform and Inference

Cassini Indexing. Trotman, Andrew, Jon Degenhardt, and Surya Kallumadi. “The Architecture of eBay Search.” eCOM@SIGIR, 2017.

Orchestration https://unsplash.com/photos/yUJVHiYZCGQ

Workflow Orchestration using Apache Airflow
Processing modes
- BULK
- DELTA
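One plausible reading of the two processing modes is that BULK recomputes embeddings for the whole inventory, while DELTA only handles items changed since the previous run. The slide does not define them, so the sketch below rests on that assumption; it uses plain Python rather than any Airflow API, and `Item` and `select_items` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Item:
    item_id: str
    updated_at: datetime

def select_items(items, mode, last_run=None):
    """Pick the items whose embeddings should be (re)computed in this run."""
    if mode == "BULK":
        return list(items)
    if mode == "DELTA":
        if last_run is None:
            raise ValueError("DELTA mode needs the time of the previous run")
        return [item for item in items if item.updated_at > last_run]
    raise ValueError(f"unknown mode: {mode}")

inventory = [
    Item("a", datetime(2022, 10, 1)),
    Item("b", datetime(2022, 10, 20)),
    Item("c", datetime(2022, 10, 24)),
]
bulk = select_items(inventory, "BULK")
delta = select_items(inventory, "DELTA", last_run=datetime(2022, 10, 15))
print(len(bulk), [item.item_id for item in delta])
```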

Challenges with Apache Airflow (Challenge / Solution)
Multiple Spark versions / Define task level parameters
Multiple Docker image versions / Python virtual environment packages
Different platforms, zones, and network flakiness / Retries, system monitoring

A/B Testing: How do we test different models in production? Trained model
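A widely used way to split traffic between model variants is deterministic hashing of a user identifier, so that the same user always sees the same model. The talk does not describe how eBay assigns traffic; the sketch below is a generic illustration, and `assign_variant`, the experiment name, and the variant labels are hypothetical.

```python
import hashlib

def assign_variant(user_id, experiment, variants=("model_a", "model_b")):
    """Deterministically map a user to one model variant for an experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# The assignment is stable across calls and roughly balanced across users.
assignments = [assign_variant(f"user-{i}", "visual-search-v2") for i in range(1000)]
share_a = assignments.count("model_a") / len(assignments)
print(assign_variant("user-1", "visual-search-v2"), share_a)
```

Including the experiment name in the hash keeps assignments independent across experiments, so a user in one test's control group is not systematically in every other test's control group.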

Data Publishing for A/B Tests using Airflow

Evolution
Model Drift
• Seasonality
• Aging of the models
Actions
• Metrics monitoring
• Downstream evaluation
• Retraining
Data Drift
• Data Integrity
• Data pipelines
Actions
• Fault tolerance
• Monitoring of time, CPU, memory, disk
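Metrics monitoring for drift can start as a simple comparison of a recent window against a baseline. The sketch below is one naive rule under stated assumptions: the metric (a click-through rate), the sample values, the threshold, and the `drift_detected` helper are all made up for illustration and are not taken from the talk.

```python
import statistics

def drift_detected(baseline, recent, threshold=3.0):
    """Flag drift when the recent mean departs from the baseline mean
    by more than `threshold` baseline standard deviations."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        return statistics.mean(recent) != mean
    return abs(statistics.mean(recent) - mean) / stdev > threshold

# Hypothetical daily click-through rates for a search experience.
baseline_ctr = [0.120, 0.118, 0.122, 0.121, 0.119, 0.120, 0.123]
stable_week = [0.121, 0.119, 0.120]
drifted_week = [0.101, 0.099, 0.100]

print(drift_detected(baseline_ctr, stable_week))
print(drift_detected(baseline_ctr, drifted_week))
```

A rule like this only raises the flag; deciding whether the cause is seasonality, an aging model, or a broken data pipeline still needs the downstream evaluation listed above.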

Key takeaways: Similarity, Scalability, Monitoring

Questions? Slides are available at https://bit.ly/ebay-ml https://unsplash.com/photos/4V1dC_eoCwg

Senthilkumar Gopal
@sengopal

This talk provides a deeper insight into the scale, challenges, and solutions involved in powering embeddings-based visual search at eBay. It walks the audience through the model architecture, the application architecture for serving users, the workflow pipelines that build the embeddings used by Cassini, eBay’s search engine, and the unique challenges faced during this journey. It also offers key insights specific to embedding handling and how to scale systems to provide real-time, clustering-based solutions for users.

Resources

The following resources were mentioned during the presentation or are useful additional information.

Emamo

Buzz and feedback

Here’s what was said about this presentation on social media.

This is happening today ☺️ @AIDevWorld
Join me in our discussion about scaling embedding models. https://t.co/F4ViDUH8Qb #MachineLearning #embeddings #scale #Mlinproduction pic.twitter.com/dYk0yXhern
— Senthilkumar Gopal (@sengopal) October 25, 2022
