Implementing an ecommerce product search

A presentation at Search Meetup Hamburg in August 2019 in Hamburg, Germany by Alexander Reelsen

Slide 1

Finding e-commerce products using Elasticsearch
Alexander Reelsen | @spinscale | alex@elastic.co

Slide 2

search is hard!

Slide 3

search in ecommerce is harder

Slide 4

good data & good searches

Slide 5

bad data & smart searches

Slide 6

good data & worst searches

Slide 7

Agenda

Slide 8

Agenda facetted navigation

Slide 9

Agenda facetted navigation search bar

Slide 10

Agenda facetted navigation clean data search bar

Slide 11

Agenda facetted navigation clean data smart searches search bar

Slide 12

Agenda facetted navigation clean data synonyms smart searches search bar

Slide 13

Agenda clean data facetted navigation UOM smart searches search bar synonyms

Slide 14

Agenda facetted navigation clean data decompounding synonyms UOM smart searches search bar

Slide 15

Agenda facetted navigation clean data relevancy UOM smart searches synonyms decompounding search bar

Slide 16

Agenda facetted navigation clean data variants UOM relevancy smart searches synonyms decompounding search bar

Slide 17

Agenda facetted navigation clean data variants deduplication synonyms UOM relevancy smart searches decompounding search bar

Slide 18

Agenda facetted navigation deduplication clean data variants search as you type UOM relevancy smart searches synonyms decompounding search bar

Slide 19

Agenda facetted navigation deduplication UOM clean data variants analytics relevancy smart searches search as you type synonyms decompounding search bar

Slide 20

Agenda analytics facetted navigation deduplication UOM clean data variants data quality relevancy smart searches search as you type synonyms decompounding search bar

Slide 21

Agenda analytics facetted navigation deduplication UOM clean data variants mobile relevancy smart searches search as you type synonyms data quality decompounding search bar

Slide 22

Agenda analytics facetted navigation deduplication clean data variants product detail page data quality UOM mobile relevancy smart searches search as you type synonyms decompounding search bar

Slide 23

Agenda analytics facetted navigation deduplication UOM product detail page mobile clean data variants LTR relevancy smart searches search as you type synonyms data quality decompounding search bar

Slide 24

Agenda analytics facetted navigation deduplication product detail page clean data variants multi language UOM mobile relevancy smart searches search as you type synonyms data quality decompounding search bar LTR

Slide 25

Agenda analytics facetted navigation deduplication multi language UOM mobile product detail page clean data variants ETIME relevancy smart searches search as you type synonyms data quality decompounding search bar LTR

Slide 26

demo

Slide 27

search bar

Slide 28

search bar

Slide 29

smart searches

Slide 30

smart searches nike running hoodie xl

Slide 31

smart searches nike running hoodie xl

Slide 32

smart searches
nike running hoodie xl
» nike: brand
» xl: size
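
The annotation above (brand and size recognised inside the free-text query) can be sketched as a small query-parsing step in the middleware. This is only an illustration: the brand and size lists, the field names `brand`, `size` and `name`, and both function names are assumptions, not something shown in the talk.

```python
# Hypothetical lookup tables; in practice these would come from the product catalogue.
KNOWN_BRANDS = {"nike", "adidas", "puma"}
KNOWN_SIZES = {"xs", "s", "m", "l", "xl", "xxl"}


def parse_query(text):
    """Split a raw query into structured filters and the remaining free text."""
    filters = {}
    remaining = []
    for token in text.lower().split():
        if token in KNOWN_BRANDS and "brand" not in filters:
            filters["brand"] = token
        elif token in KNOWN_SIZES and "size" not in filters:
            filters["size"] = token
        else:
            remaining.append(token)
    return filters, " ".join(remaining)


def build_query(text):
    """Build an Elasticsearch bool query: free text is scored, attributes only filter."""
    filters, free_text = parse_query(text)
    must = [{"match": {"name": free_text}}] if free_text else [{"match_all": {}}]
    return {
        "query": {
            "bool": {
                "must": must,
                "filter": [{"term": {field: value}} for field, value in filters.items()],
            }
        }
    }


print(build_query("nike running hoodie xl"))
```

Putting the recognised attributes into `filter` clauses keeps them out of the scoring, so only "running hoodie" contributes to the text relevance.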

Slide 33

clean data

Slide 34

clean data
» Hardest thing to do ever
» Formats being accepted? JSON, XML, CSV, EDIFACT?
» How to train merchants?
» Another local cleansing step? Accountability on failure?
» If you fail here, stop optimising your search!
» indexing pipeline: applying synonyms?

Slide 35

synonyms

Slide 36

synonyms
» topf => kochtopf
» naik => nike
» portmonee => geldbörse
» who maintains this list?
» who keeps it updated?
» who matches this against your worst queries, the ones that return 0 hits?
» reloadable without closing the index (since ES 7.3)
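
A minimal sketch of index settings that use a reloadable synonym list, written as a Python dict that could be sent as the body of an index-creation request. The filter name `product_synonyms`, the analyzer name `product_search`, the field `name` and the file path are assumptions for illustration. The `updateable` flag only works for filters used in a search analyzer, which is why the synonyms are applied at query time here.

```python
import json


def synonym_index_body(synonyms_path="analysis/synonyms.txt"):
    """Return index settings and mappings with a search-time, reloadable synonym filter.

    The synonyms file is expected to hold one rule per line, e.g. "topf => kochtopf".
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "product_synonyms": {
                        "type": "synonym_graph",
                        "synonyms_path": synonyms_path,
                        "updateable": True,  # allows reloading without closing the index
                    }
                },
                "analyzer": {
                    "product_search": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "product_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "name": {
                    "type": "text",
                    "analyzer": "standard",
                    "search_analyzer": "product_search",
                }
            }
        },
    }


print(json.dumps(synonym_index_body(), indent=2))
```

After the file changes on disk, the search analyzers can be reloaded with `POST /<index>/_reload_search_analyzers`.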

Slide 37

UOM

Slide 38

UOM
» Unit of Measure (100cm vs. 1m)
» Requires normalization: part of data cleansing
» Dissecting into a base unit and a value in order to query
» Who is doing this already?
» JSR 385: Units of Measurement API 2.0
» Could be done in an Ingest Processor
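
The "base unit and a value" idea can be sketched as a small normalization function that runs during data cleansing. The conversion table and the function name are assumptions for illustration; a real setup would cover far more units, for example via a units library or JSR 385 on the JVM.

```python
import re
from decimal import Decimal

# Hypothetical conversion table: unit -> (base unit, factor to the base unit).
TO_BASE = {
    "mm": ("m", Decimal("0.001")),
    "cm": ("m", Decimal("0.01")),
    "m": ("m", Decimal("1")),
    "g": ("kg", Decimal("0.001")),
    "kg": ("kg", Decimal("1")),
}

# A number (dot or comma as decimal separator) followed by a unit, e.g. "100cm" or "1,5 kg".
MEASURE = re.compile(r"^\s*(\d+(?:[.,]\d+)?)\s*([a-zA-Z]+)\s*$")


def normalize(measure):
    """Return (value, base_unit) so that 100cm and 1m become comparable."""
    match = MEASURE.match(measure)
    if not match:
        raise ValueError(f"cannot parse measure: {measure!r}")
    value = Decimal(match.group(1).replace(",", "."))
    unit = match.group(2).lower()
    if unit not in TO_BASE:
        raise ValueError(f"unknown unit: {unit!r}")
    base_unit, factor = TO_BASE[unit]
    return value * factor, base_unit


print(normalize("100cm") == normalize("1m"))
```

Indexing the value and the base unit as separate fields then allows range queries such as "length between 0.5 and 2 m" regardless of how the merchant wrote the measure.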

Slide 39

decompounding

Slide 40

decompounding

Slide 41

decompounding

Slide 42

decompounding
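
The slides for this topic are images only, so the following is merely a sketch of what dictionary-based decompounding means, not the approach shown in the talk. The vocabulary and the function name are assumptions; Elasticsearch ships its own decompounder token filters for production use.

```python
def decompound(word, vocabulary, min_len=3):
    """Split a compound into known vocabulary words; return [] if no full split exists."""
    word = word.lower()
    if word in vocabulary:
        return [word]
    # Try every split point that leaves at least min_len characters on both sides.
    for i in range(min_len, len(word) - min_len + 1):
        head, tail = word[:i], word[i:]
        if head in vocabulary:
            rest = decompound(tail, vocabulary, min_len)
            if rest:
                return [head] + rest
    return []


# Hypothetical vocabulary for illustration.
VOCABULARY = {"koch", "topf", "geld", "börse"}

print(decompound("Kochtopf", VOCABULARY))
print(decompound("Geldbörse", VOCABULARY))
```

With the parts indexed alongside the full word, a search for "topf" can also find products named "Kochtopf".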

Slide 43

relevancy

Slide 44

relevancy
» relevancy needs to be defined by the business owners (who rarely understand it)
» BM25 is not the score you are looking for
» need to incorporate business/product metrics
» commission, item in stock, location, free shipping, last sale
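
One way to fold business signals into the text score is a `function_score` query. The sketch below builds such a request body; the field names `name`, `in_stock` and `free_shipping` and the weights are assumptions, and the right signals and weights have to come from the business side.

```python
def business_scored_query(text, in_stock_weight=2.0, free_shipping_weight=1.5):
    """Wrap a text match in a function_score that rewards business-relevant products."""
    return {
        "query": {
            "function_score": {
                # The text match still provides the BM25 base score.
                "query": {"match": {"name": text}},
                "functions": [
                    {"filter": {"term": {"in_stock": True}}, "weight": in_stock_weight},
                    {"filter": {"term": {"free_shipping": True}}, "weight": free_shipping_weight},
                ],
                "score_mode": "sum",       # add up the weights of all matching functions
                "boost_mode": "multiply",  # multiply the result with the text score
            }
        }
    }


print(business_scored_query("bicycle"))
```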

Slide 45

relevancy
» Search for ‘bicycle’
» Are 20 different bikes relevant results?
» What about locks, lights, clothes? Maybe go with 10 bikes, 3 accessories?
» User bought a bike three months ago, maybe they are searching for equipment? Or a replacement tire?

Slide 46

relevancy
» are there certain products you always want to score higher?

Slide 47

relevancy

Slide 48

variants

Slide 49

variants

Slide 50

variants
» how to model variants and their differences?
» just attributes? and price? product title and description?
» search: across all variants or the main products?
» display: variants as separate results or group them?
» display: what happens when one product is out of stock?
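
One possible answer to the "group them?" question, assuming every variant is indexed as its own document carrying the id of its main product: search across all variants and let Elasticsearch collapse the hits per product. The field names `name` and `product_id` and the inner-hits name are assumptions; the collapse field has to be a keyword or numeric field.

```python
def variant_grouped_search(text, variants_per_product=3):
    """Search all variants, but return one hit per product with a few variants attached."""
    return {
        "query": {"match": {"name": text}},
        "collapse": {
            "field": "product_id",  # one result per main product
            "inner_hits": {
                "name": "variants",  # the best-matching variants of that product
                "size": variants_per_product,
            },
        },
    }


print(variant_grouped_search("running hoodie"))
```

This is only one modelling option; storing variants inside a single product document is another, with different trade-offs for updates such as stock changes.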

Slide 51

variants

Slide 52

deduplication

Slide 53

deduplication
» Safe: ISBN, ASIN
» Unsafe: Product images, description, name, release date, size
» query time or index time?
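
A minimal sketch of index-time deduplication on a "safe" identifier, here the ISBN. The product structure (dicts with an optional `isbn` key) and the function names are assumptions; products without an identifier are kept as they are, since the unsafe attributes are not used for matching.

```python
def normalize_isbn(raw):
    """Strip hyphens and spaces so that differently formatted ISBNs compare equal."""
    return "".join(ch for ch in raw.upper() if ch.isdigit() or ch == "X")


def deduplicate(products):
    """Keep the first product per ISBN; products without an ISBN are never dropped."""
    seen = set()
    unique = []
    for product in products:
        isbn = product.get("isbn")
        key = normalize_isbn(isbn) if isbn else None
        if key and key in seen:
            continue  # duplicate of a product that was already kept
        if key:
            seen.add(key)
        unique.append(product)
    return unique


catalogue = [
    {"name": "Some Book", "isbn": "978-3-16-148410-0"},
    {"name": "Some Book (other merchant)", "isbn": "9783161484100"},
    {"name": "Hoodie"},
]
print([p["name"] for p in deduplicate(catalogue)])
```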

Slide 54

deduplication

Slide 55

search as you type

Slide 56

search as you type
» “The importance of search-as-you-type cannot be overstated”
» Hint: make a user test first. There are users who do not look up when typing!
» Speed is key
» Rank your suggestions on your own criteria!
» Ensure exact hits are scored up (brown fox vs. brown foxes)
» Steer the user without showing any search results
» Possibly a separate index with a reduced result set
» Analyze searches and adapt to follow trends
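
Two of the points above, ranking suggestions on your own criteria and scoring exact hits up, can be sketched in a few lines. The in-memory dictionary of suggestions with a popularity number stands in for a dedicated suggestion index; both the data and the function name are assumptions.

```python
def suggest(prefix, candidates, limit=5):
    """Return suggestions starting with the prefix: exact match first, then by popularity."""
    prefix = prefix.lower().strip()
    matches = [(text, popularity) for text, popularity in candidates.items()
               if text.startswith(prefix)]
    # Sort key: exact matches first (False sorts before True), then most popular, then A-Z.
    matches.sort(key=lambda m: (m[0] != prefix, -m[1], m[0]))
    return [text for text, _ in matches[:limit]]


# Hypothetical suggestions with a popularity score, e.g. derived from the query log.
SUGGESTIONS = {"brown fox": 10, "brown foxes": 50, "brown bear": 30}

print(suggest("brown fox", SUGGESTIONS))
print(suggest("brown", SUGGESTIONS))
```

Even though "brown foxes" is searched more often, the exact hit "brown fox" stays on top once the user has typed it in full.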

Slide 57

search as you type

Slide 58

analytics

Slide 59

analytics
» conversion rate
» search results with zero hits
» “one search and out”
» busiest hours (planning downtime)
» recommendations
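
The zero-hit metric is straightforward to compute once the middleware logs every query with its hit count. The log format below (dicts with `query` and `hits`) and the function name are assumptions for illustration.

```python
from collections import Counter


def zero_hit_report(log, top_n=3):
    """Return the share of searches without results and the most frequent such queries."""
    zero_hit_queries = [entry["query"].strip().lower() for entry in log if entry["hits"] == 0]
    rate = len(zero_hit_queries) / len(log) if log else 0.0
    return rate, Counter(zero_hit_queries).most_common(top_n)


# Hypothetical query log entries.
query_log = [
    {"query": "naik", "hits": 0},
    {"query": "nike", "hits": 120},
    {"query": "Naik", "hits": 0},
    {"query": "portmonee", "hits": 0},
]

rate, worst = zero_hit_report(query_log)
print(rate, worst)
```

The most frequent zero-hit queries are natural candidates for new synonym rules.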

Slide 60

product detail page

Slide 61

product detail page
» crucial to make a sale
» what to display if the product is out of stock?
» what to display if the product is EOL?
» dynamic price calculation

Slide 62

LTR

Slide 63

LTR

Slide 64

summary

Slide 65

summary
» ecommerce search is complex
» so many things to take into account…
» untold: index strategies, updates, management
» always have a middleware (UI, query injection, A/B testing, landing pages, redirects, query logging, business owner endpoint)

Slide 66

search ui https://github.com/elastic/search-ui

Slide 67

Elastic App Search

Slide 68

Elastic App Search

Slide 69

Elastic App Search https://www.elastic.co/blog/elastic-app-search-7-2-0-released

Slide 70

books

Slide 71

books

Slide 72

books

Slide 73

links

Slide 74

links
» https://project-a.github.io/on-site-search-design-patterns-for-e-commerce/

Slide 75

Thank you for listening!
Alexander Reelsen | @spinscale | alex@elastic.co