The Cost of AI and Solving it Locally

A presentation at THAT Conference, WI 2023 in July 2023 in Wisconsin Dells, WI, USA by Matt Williams

Slide 1

The Cost of AI and how to solve it locally

Slide 2


Slide 3

  • Worked for many companies as an information worker
  • How we work has been slowly evolving
  • But recently, the way we work…

Slide 4

…is experiencing…

Slide 5

…is experiencing… …a revolution
  • Everything is different
  • New people are excited, foaming at the mouth for the next thing
  • It did not happen overnight; it took a long time to get here
  • There are lots of places to start
  • Up until recently, it was about…

Slide 6

First there was better search
  • Search keeps getting better and better
  • We have gotten better at asking questions

Slide 7

We could use:
And we have had a lot of options to use, some better and some worse.

Slide 8

We could use: Google
The first to get folks really excited was Google. Sure, there were others, like Yahoo, but they all sucked.

Slide 9

We could use: Google, Bing
Then there was Bing. Lots of money… and… well…

Slide 10

We could use: Google, Bing, DuckDuckGo
And for the privacy focused there was DuckDuckGo.
  • Many other options
  • My favorite: Neeva (rank domain importance)

Slide 11

To get better answers for every question
And search engines offered us help to get better answers for every question… as long as we asked a good question.

Slide 12

But search engines can only go so far
But they aren’t perfect. They try to find the answer to what we ask, so they are only as good as the question we ask.

Slide 13

Sometimes our question shows some bias
  • What bias?

Slide 14

and the answer may reflect that bias back to us
In this case the bias is our own and we control it. Interestingly, some have seen that the more educated someone is, the more bias shows up in their questions.

Slide 15

Search engines also focus on finding the source of information
The other big problem with search engines is that they are built to find the authoritative source of information on what you searched for.

Slide 16

and not the answer to the specific question
They do not answer the question, but rather point you the way to a place where you can find the answer.

Slide 17

but now there is AI

Slide 18

AI tools answer the question

Slide 19

Will AI replace the role of the search engine?

Slide 20


Slide 21


Slide 22

ChatGPT was the first to get everyone excited about AI

Slide 23

Can it answer everyone’s questions?

Slide 24

A lot of folks think so

Slide 25


Slide 26

Matt Williams
evangelist @ infra
@technovangelist
Welcome to the session. My name is Matt Williams and I am an evangelist for InfraHQ. You can find me on all the socials, like Twitter, Threads, and GitHub, as technovangelist.
In this session, I want to talk about how exciting all of these brand new AI services are, but I also want to look at some of the problems they introduce. At the end we will look at a solution to these problems: running these models locally. It turns out there are a few different ways to do that, and I work on a team focused on building the best option. It’s a free and open source solution called Ollama. But before we get there, let’s talk about the problems.

Slide 27

AI is not new
My first intro to AI was an Intro to AI Programming course at FSU. The current wave of AI started with the transformers paper in 2017. This is a long road that we are on.

Slide 28

And as with any long road, that road is twisty and you can’t always see what’s around the corner, but it’s getting exciting. And as with any long, twisty road…

Slide 29

Sometimes a rock falls right in front of you, or even crushes your car.

Slide 30

Users are asking and trusting without verification
Users are asking questions without context and getting crushed. I have heard from so many who think AI is a fad because it gets the answers wrong so often. But AI is like any source: you should always verify the answers you get. Not everyone wants to do that.

Slide 31

a lawyer looked for precedent & believed what he saw
At some point recently, an airline passenger was hurt by the food cart on an Avianca flight, so he sued. His lawyer researched the issue and asked ChatGPT to write an affidavit with examples that showed precedent for his client. ChatGPT found 6 examples, and the lawyer submitted it. The only problem is that none of the examples existed; ChatGPT made them up. It answered the question it was given.
Lawyer Used ChatGPT In Court—And Cited Fake Cases. A Judge Is Considering Sanctions

Slide 32

they are even submitting their data
So if context is what is needed, folks are coming up with solutions that provide context. Obsidian has a plugin called Copilot, as well as others. They submit your markdown documents to help find more connections in your notes.

Slide 33

personal data means personalized results
This is good, right? How could anything go wrong?

Slide 34

but that can also have bad outcomes
Well, it did. Two engineers working at Samsung submitted information about a project they were working on internally. Someone later asked a question and learned about an internal project name at Samsung. Since then Samsung has banned the use of ChatGPT, because the data in your prompts and the answers goes back into helping make the model better.

Slide 35

and it can potentially skew future results
But sometimes data is injected purposefully to direct users to misinformation. This is data poisoning.

Slide 36

scroll down…
Another issue is about who owns what you generate in AI services. One of the really great online tools is Midjourney. In fact, apart from website screenshots, every image in this deck was created in Midjourney. But if you aren’t paying, then you don’t own the assets. Read this.

Slide 37


Slide 38

I mentioned there may be a solution

Slide 39

it’s easier and harder than you think

Slide 40

the solution is local

Slide 41

If you are the type to think about going local

Slide 42

you have probably seen this…

Slide 43

this is a “simple” server example … with 1300 lines of code
Llama.cpp was one of the first libraries created to make running Large Language Models locally possible. Most of the tools out there leverage it. But it would take more than our hour to go through getting started with it.

Slide 44

We are building another tool

Slide 45

and we are learning from every other tool out there

Slide 46

Let’s look at some of the alternatives

Slide 47

some projects are not long lived

Slide 48

but first…

Slide 49

some definitions

Slide 50

LLM
Define it: large language model.

Slide 51

tokens
Tokens are the word parts.
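
To make "word parts" concrete, here is a minimal sketch of splitting a word into pieces from a known vocabulary. Everything in it is an assumption for illustration: the `tokenize` helper and `TOY_VOCAB` are hypothetical, and real models learn much larger vocabularies with methods such as byte pair encoding rather than this greedy matching.

```python
def tokenize(word, vocab):
    """Greedily split a word into the longest pieces found in vocab."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until a known piece matches.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No known piece starts here: fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces


# Hypothetical toy vocabulary; not taken from any real model.
TOY_VOCAB = {"un", "believ", "able", "token", "s"}

print(tokenize("unbelievable", TOY_VOCAB))  # ['un', 'believ', 'able']
print(tokenize("tokens", TOY_VOCAB))        # ['token', 's']
```

The point is only that a model sees pieces like "un" and "able", not whole words, which is why prompt and response sizes are counted in tokens.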

Slide 52

weights
Weights determine relationships in a multi-dimensional space.
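
One way to picture a "relationship in a multi-dimensional space" is to treat each word as a point and measure how closely two points line up. The sketch below assumes three-dimensional vectors with invented values (`TOY_VECTORS` is hypothetical); real models learn vectors with hundreds or thousands of dimensions during training.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented values for illustration only; a real model learns these numbers.
TOY_VECTORS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(TOY_VECTORS["cat"], TOY_VECTORS["dog"])
cat_car = cosine_similarity(TOY_VECTORS["cat"], TOY_VECTORS["car"])
print(f"cat vs dog: {cat_dog:.2f}")
print(f"cat vs car: {cat_car:.2f}")
```

With these made-up numbers, "cat" sits much closer to "dog" than to "car", which is the kind of relationship the trained weights end up encoding.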

Slide 53

inference, quantization, batching, attention, and more

Slide 54


Slide 55

How to use with Python
  • If you are already using Python, go to the next slide
  • If you have done anything with Python in the past but are not an active Python dev, your environment is probably broken
  • Good luck in fixing it

Slide 56

How to use with Python – slide 2
  • If you already have Jupyter set up, skip to slide 3
  • If you don’t, good luck
  • Choose an environment: conda, anaconda, miniconda, miniforge
  • Find your favorite conflicting guide to install it
  • Find another because the first few won’t work

Slide 57

How to use Python – slide 3
  • Find a random Python notebook that you can run through and hope there are no errors
  • There will be errors and yes, they are cryptic

Slide 58

How to use Python – slide 4
  • If you like this process…
Congratulations
You are a Python developer

Slide 59

An easier method
  • One popular solution is openplayground
  • Make sure you have a good Python environment
  • pip install openplayground
  • Fix the errors

Slide 60


Slide 61

Now what?

Slide 62

Oobabooga text-generation-webui
Started in December

Slide 63

you got your Python environment working, right?

Slide 64


Slide 65

  • Show finding models
  • Editing the prompt
  • Editing parameters

Slide 66

DEMO

Slide 67


Slide 68

DEMO

Slide 69

Introducing Ollama

Slide 70

Goals of the project
  • Easy for new users
  • Simple run command
  • Simple UI (coming soon)
  • Powerful for developers
  • Not just for Python
  • Easy install (Apple Silicon today; Intel, Windows, and Linux soon)
  • The easiest distribution model
  • Free and open source forever

Slide 71

Using Ollama
  • ollama run orca
  • ollama run llama2
  • ollama run vicuna
  • ollama run orca "why is the sky blue"
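
The one-shot form of the command (`ollama run orca "why is the sky blue"`) is easy to call from a script. The sketch below assumes Ollama is installed and on your PATH, and that the one-shot form prints the answer to standard output and exits. The helper names `build_run_command` and `ask` are hypothetical, not part of Ollama.

```python
import subprocess


def build_run_command(model, prompt=None):
    """Build the argument list for `ollama run <model> [prompt]`."""
    command = ["ollama", "run", model]
    if prompt is not None:
        # Passing the prompt as its own list item avoids shell quoting problems.
        command.append(prompt)
    return command


def ask(model, prompt):
    """Run a one-shot prompt and return whatever the CLI printed (assumed to be the answer)."""
    result = subprocess.run(
        build_run_command(model, prompt),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command exits non-zero
    )
    return result.stdout.strip()


print(build_run_command("orca", "why is the sky blue"))
```

With Ollama installed and the model available, `ask("orca", "why is the sky blue")` would return the model's answer as a string. Nothing in the block calls `ask` itself, so it runs even on a machine without Ollama.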

Slide 72

Based on Docker layers
  • Layer for the model weights
  • Layer for parameters
  • Layer for prompts
  • Layer for LoRA
  • Layer for other functionality
  • Layers are reused as needed
  • Updating a base layer doesn’t break the model
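
The reuse idea in the list above can be sketched with a content-addressed store, where each layer is saved under a hash of its bytes so identical layers are kept only once. This is an illustration of the general Docker-style concept under assumed names (`LayerStore`, `build_manifest`); it is not Ollama's actual storage format.

```python
import hashlib


class LayerStore:
    """Toy content-addressed store: identical layer contents are stored once."""

    def __init__(self):
        self.blobs = {}

    def add(self, content: bytes) -> str:
        # The digest of the bytes is the layer's identity.
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)
        return digest


def build_manifest(store, layers):
    """Map each layer kind (weights, parameters, ...) to the digest of its content."""
    return {kind: store.add(content) for kind, content in layers.items()}


store = LayerStore()
weights = b"pretend these bytes are several gigabytes of model weights"

base = build_manifest(store, {
    "weights": weights,
    "parameters": b"temperature 0.7",
})
custom = build_manifest(store, {
    "weights": weights,  # same bytes, so the same stored layer is reused
    "parameters": b"temperature 0.9",
    "system": b"You are an artist with a way with words.",
})

print(base["weights"] == custom["weights"])  # True
print(len(store.blobs))                      # 4
```

Two models reference five layers in total, but only four are stored because the weights are shared. That is why a customized model costs little extra disk space.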

Slide 73

Customize with Modelfile
FROM llama2

Slide 74

Create and run the model
ollama create amazingimages -f ./amazingimages
ollama run amazingimages

Slide 75

Customize with Modelfile
FROM llama2
SYSTEM """
You are an artist with a way with words. Every prompt will be a simple idea. You will transform that simple idea into a prompt that an image generation tool like MidJourney can use to visualize the idea. You will expand on the idea, provide new and interesting details and add excitement to the description. Turn every simple idea into an explosion of amazing inspiration. Never just provide a definition of the phrase. Always produce a visual description of the idea, and prefer to describe an analogy rather than a literal description. Just output the text. Never include emojis in the output. Also include the colors of the image, and the style. Try to describe the most visually appealing image.
"""

Slide 76

Customize with Modelfile
FROM llama2
PARAMETER temperature 0.9
SYSTEM """
You are an artist with a way with words. Every prompt will be a simple idea. You will transform that simple idea into a prompt that an image generation tool like MidJourney can use to visualize the idea. You will expand on the idea, provide new and interesting details and add excitement to the description. Turn every simple idea into an explosion of amazing inspiration. Never just provide a definition of the phrase. Always produce a visual description of the idea, and prefer to describe an analogy rather than a literal description. Just output the text. Never include emojis in the output. Also include the colors of the image, and the style. Try to describe the most visually appealing image.
"""
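
If you want to produce several variations of a Modelfile like the one above, a short script can write them for you. This is a minimal sketch that uses only the three instructions shown on the slides (FROM, PARAMETER, SYSTEM). The `render_modelfile` helper is hypothetical, and the system prompt is shortened for the example.

```python
def render_modelfile(base, system=None, parameters=None):
    """Render Modelfile text from a base model, optional parameters, and a system prompt."""
    lines = [f"FROM {base}"]
    for name, value in (parameters or {}).items():
        lines.append(f"PARAMETER {name} {value}")
    if system:
        # The slides wrap the multi-line system prompt in triple double quotes.
        lines.append(f'SYSTEM """\n{system.strip()}\n"""')
    return "\n".join(lines) + "\n"


text = render_modelfile(
    "llama2",
    system="You are an artist with a way with words.",
    parameters={"temperature": 0.9},
)
print(text)
```

Saving `text` to a file and passing that file to `ollama create` with `-f`, as on the earlier slide, would build the customized model.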

Slide 77

DEMO

Slide 78

Matt Williams
evangelist @ infra
@technovangelist
https://ollama.ai
YouTube.com/technovangelist

Slide 79

that.land/3NOV9zA