Big Game Theory Hunting: The Peculiarities of Human Behavior in the InfoSec Game

A presentation at Black Hat USA in July 2017 in Las Vegas, NV, USA by Kelly Shortridge

Slide 1

BIG GAME THEORY HUNTING: THE PECULIARITIES OF HUMAN BEHAVIOR IN THE INFOSEC GAME
Kelly Shortridge (@swagitda_), Black Hat 2017

Slide 2

I’m Kelly

Slide 3

This is game theory

Slide 4

It’s time for hunting some game theory

Slide 5

Do you believe bug-free software is a reasonable assumption?

Slide 6

Do you believe wetware is more complex than software?

Slide 7

Traditional Game Theory relies on the assumption of bug-free wetware

Slide 8

Behavioral Game Theory assumes there’s no such thing as bug-free

Slide 9

“Think how hard physics would be if particles could think” —Murray Gell-Mann

Slide 10

“Amateurs study cryptography, professionals study economics” —Dan Geer quoting Allan Schiffman

Slide 11

This is what you’ll learn:

Slide 12

1. Why traditional game theory isn’t even a theory and is unfit for strategy-making
2. A new framework for modeling the infosec game based on behavioral insights
3. New defensive strategies that exploit your adversaries’ “thinking” and “learning”

Slide 13

Let’s go hunting to find out why

Slide 14

I. What is Game Theory?

Slide 15

tl;dr – game theory is a mathematical language used to describe scenarios of conflict and cooperation

Slide 16

Game theory is more about language than theory
Use it as an engendering tool, not as something that dictates optimal strategies

Slide 17

GT applies whenever the actions of players are interdependent
Strategic scenarios include many types of games, with different “solutions” for each

Slide 18

Zero Sum Games

Slide 19

Non-Zero Sum Games

Slide 20

Negative Sum Games

Slide 21

Positive Sum Games

Slide 22

Complete vs. Incomplete Information

Slide 23

Perfect vs. Imperfect Information

Slide 24

Information Symmetry vs. Asymmetry

Slide 25

Defender-Attacker-Defender Games

Slide 26

Sequential games in which the sets of players are attackers and defenders
Assumes people are risk-neutral & attackers want to be maximally harmful

Slide 27

First move = defenders choosing a defensive investment plan
Second move = attackers observe the defensive preparations & choose an attack plan

Slide 28

Nash Equilibrium is often used to “solve” games. This is bad.

Slide 29

Nash Equilibrium = optimal outcome of a noncooperative game
Players are making the best decisions for themselves while taking their opponents’ decisions into account

Slide 30

Prisoner’s Dilemma (payoffs listed as Player 1, Player 2)

                     Player 2: Confess | Player 2: Refuse
Player 1: Confess         -2, -2      |      0, -4
Player 1: Refuse          -4,  0      |     -1, -1
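
As a concrete illustration (mine, not from the original deck), here is a minimal Python sketch that recovers the game’s pure-strategy Nash equilibrium by brute-force best-response checking, using exactly the payoffs above:

```python
# Minimal sketch: find pure-strategy Nash equilibria of the Prisoner's Dilemma
# by checking best responses. Payoffs taken from the table above.

ACTIONS = ["Confess", "Refuse"]
# payoffs[(p1_action, p2_action)] = (p1_payoff, p2_payoff)
payoffs = {
    ("Confess", "Confess"): (-2, -2),
    ("Confess", "Refuse"): (0, -4),
    ("Refuse", "Confess"): (-4, 0),
    ("Refuse", "Refuse"): (-1, -1),
}

def is_nash(a1: str, a2: str) -> bool:
    """Neither player can gain by unilaterally deviating."""
    u1, u2 = payoffs[(a1, a2)]
    best1 = all(payoffs[(alt, a2)][0] <= u1 for alt in ACTIONS)
    best2 = all(payoffs[(a1, alt)][1] <= u2 for alt in ACTIONS)
    return best1 and best2

equilibria = [(a1, a2) for a1 in ACTIONS for a2 in ACTIONS if is_nash(a1, a2)]
print(equilibria)  # [('Confess', 'Confess')] -- even though (-1, -1) is better for both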

Slide 31

Nash Equilibrium is based on a priori reasoning
Assumes rational, all-knowing players
Assumes others’ decisions don’t affect you

Slide 32

People have applied Nash Equilibrium to infosec over the years…

Slide 33

Defender should play extremely fast so the attacker drops out of the game
Better to invest in security than not invest, regardless of attacker strategy (wow!)
Just apply tons of mathematical equations!

Slide 34

II. New defensive framework

Slide 35

Use GT for its expressive power in describing a framework for the infosec game
Look at data outside GT, e.g. from experiments in domains similar to infosec, to select correct assumptions

Slide 36

What game is infosec?

Slide 37

DAD game (continuous defense & attack)
Non-zero-sum
Incomplete, imperfect, asymmetrical information
Sequential / dynamic

Slide 38

This is a (uniquely?) tricky game

Slide 39

Have you heard infosec described as a “cat and mouse” game before?

Slide 40

Traditional Game Theory doesn’t allow for those…
…or most of the characteristics of the “infosec game”

Slide 41

Assumes people are rational (they aren’t)
Assumes static environments rather than dynamic ones
Can’t ever be “one step ahead” of your adversary
Deviations from Nash Equilibrium are common

Slide 42

“I feel, personally, that the study of experimental games is the proper route of travel for finding ‘the ultimate truth’ in relation to games as played by human players” —John Nash

Slide 43

Behavior-based framework

Slide 44

Experimental – how do people actually behave?
People predict their opponent’s moves by either “thinking” or “learning”

Slide 45

Thinking = modeling how opponents are likely to respond

Slide 46

Our brains work like volatile memory

Slide 47

Working memory is a hard constraint for human thinking
Enumerating steps past the next round is hard
Humans kinda suck at recursion

Slide 48

Learning = predicting how players will act based on prior games / rounds

Slide 49

Humans learn through “error-reinforcement learning” (trial & error)
People have “learning rates” that govern how much experiences factor into decision-making
Dopamine neurons encode errors

Slide 50

Veksler & Buchler studied 200 consecutive “security games” across 4 strategies
Different learning rates for attackers
Tested # of prevented attacks for each strategy

Slide 51

Fixed strategy = prevented 10%–25% of attacks
Game Theory strategy = prevented 50% of attacks
Random strategy = prevented 49.6% of attacks
Cognitive Modeling strategy = prevented 61%–77% of attacks
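
To make the intuition behind these results testable, here is a toy Python sketch of such a game — my construction under stated assumptions, not the authors’ code, so the exact percentages won’t match the study. A learning attacker picks among attack vectors via error-reinforcement; the defender covers one vector per round, either always the same one (fixed) or uniformly at random:

```python
# Toy simulation (payoffs, learning rate, and round count are illustrative
# assumptions): a learning attacker vs. a fixed or random defender. An attack
# is "prevented" when the attacker hits the vector the defender is covering.
import random

N_VECTORS, N_ROUNDS, ALPHA = 4, 200, 0.2

def simulate(defender_strategy: str, seed: int = 0) -> float:
    rng = random.Random(seed)
    utility = [0.0] * N_VECTORS          # attacker's learned utility per vector
    prevented = 0
    for _ in range(N_ROUNDS):
        # Attacker: epsilon-greedy over learned utilities
        if rng.random() < 0.1:
            attack = rng.randrange(N_VECTORS)
        else:
            attack = max(range(N_VECTORS), key=lambda v: utility[v])
        # Defender: "fixed" always covers vector 0; "random" covers any vector
        cover = 0 if defender_strategy == "fixed" else rng.randrange(N_VECTORS)
        blocked = (attack == cover)
        prevented += blocked
        # Error-reinforcement update: dU = alpha * (R - U), R = +1 win / -1 loss
        reward = -1.0 if blocked else 1.0
        utility[attack] += ALPHA * (reward - utility[attack])
    return prevented / N_ROUNDS

print("fixed: ", simulate("fixed"))   # the learner routes around it -> low prevention
print("random:", simulate("random"))  # unpredictability keeps prevention near 1/N_VECTORS
```

Even this crude model reproduces the qualitative finding: a predictable defender gets learned around, while an unpredictable one cannot be exploited.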

Slide 52

Don’t be replaced by a random SecurityStrategy™ algorithm

Slide 53

III. Implementation

Slide 54

1. SWOT Analysis
2. Thinking Exploitation
3. Learning Exploitation
4. Minimax
5. Looking Ahead

Slide 55

SWOT Analysis

Slide 56

101: Traditional SWOT
Strengths, Weaknesses, Opportunities, Threats

Slide 57

Model SWOT for yourself in relation to your adversary
Model SWOT for your adversary in relation to you

Slide 58

“The primary insight of GT is the importance of focusing on others – of putting yourself in the shoes of other players and trying to play out all the reactions…as far ahead as possible” – Adam Brandenburger

Slide 59

Strengths:
▪ Understanding of target environment
▪ Motivation to not be breached

Weaknesses:
▪ Inadequate budget
▪ Limited employee training
▪ Lack of personnel

Opportunities:
▪ Leverage new tech to allow for tear up/down
▪ Increased board attention to get budget

Threats:
▪ Attackers can use new tech for scalability
▪ Hard to keep up with pace of new attack surface

Slide 60

201: Perceptual SWOT

Slide 61

For you and your adversary, consider:
How can the strengths be weaknesses?
How can the weaknesses be strengths?

Slide 62

[Diagram: a 2×2 perception-vs-reality matrix, drawn for self and for other, crossing real strengths/weaknesses with perceived strengths/weaknesses; the mismatched cells mark opportunities and threats.]

Slide 63

“Core rigidities” = deeply embedded knowledge sets that create problems
Compliance, fixed security guidelines
Top management can be the wrong people for an evolving environment

Slide 64

Attacker strength = having time to craft an attack
Leverage that “strength” with strategies leading down rabbit holes and wasting their time

Slide 65

Attacker strength = access to known vulns
Confuse them with fake architecture so they can’t be certain what systems you’re running

Slide 66

Thinking Exploitation

Slide 67

Thinking strategy: belief prompting
Increase players’ thinking by one step

Slide 68

“Prompt” the player to consider who their opponents are & how their opponents will react
Model assumptions around capital, time, tools, risk aversion

Slide 69

Your goal is to ask, “if I do X, how will that change my opponent’s strategy?”

Slide 70

A generic belief prompting guide:

Slide 71

How would attackers pre-emptively bypass the defensive move?
What will the opponent do next in response?
Costs of the opponent’s offensive move?
Probability the opponent will conduct the move?

Slide 72

Example: A skiddie lands on one of our servers, what do they do next?

Slide 73

Perform local recon, escalate to whatever privs they can get
Counter: priv separation, don’t hardcode creds
Leads to: attacker must exploit server, risk = server crashes

Slide 74

Decision Tree Modelling

Slide 75

Model decision trees both for offense & defense
Theorize probabilities of each branch’s outcome
Creates tangible metrics to deter self-justification

Slide 76

“Attackers will take the least cost path through an attack graph from their start node to their goal node” – Dino Dai Zovi, “Attacker Math”

Slide 77

[Diagram: example decision tree. Attacker-type branches (Skiddies / Random, Criminal Group, Nation State) with rough probabilities, running through exploit tiers (known exploit, 1day, elite 0day) and countermeasures (priv separation, role separation, tokenization & segmentation, anomaly detection, seccomp, GRSec) to outcomes such as “absorb into botnet” or “win.”]

Slide 78

1. Which of your assets do attackers want?
2. What’s the easiest way attackers get to those assets?
3. What countermeasures are on that path?
4. What new path will the attacker take given #3?
5. Repeat 1–4 until it’s “0day all the way down”
6. Assign rough probabilities (see the sketch below)
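
Here is a minimal Python sketch of that exercise as data: an attack graph with rough, made-up edge probabilities (node names and numbers are mine, purely illustrative) and a helper that finds the attacker’s most likely root-to-goal path:

```python
# Minimal sketch of the steps above: encode an attack graph with rough,
# made-up edge probabilities, then find the attacker's most likely path to
# the goal -- the "least cost" path in Dai Zovi's sense.

# (src, dst): estimated probability the attacker takes/succeeds at this step
EDGES = {
    ("landed_on_server", "local_recon"): 0.9,
    ("local_recon", "known_exploit"): 0.5,
    ("local_recon", "use_db_on_box"): 0.3,
    ("known_exploit", "goal_data"): 0.4,
    ("use_db_on_box", "goal_data"): 0.6,
}

def most_likely_path(start: str, goal: str):
    """Exhaustive DFS; fine for whiteboard-sized trees."""
    best = (0.0, None)
    def walk(node, prob, path):
        nonlocal best
        if node == goal:
            if prob > best[0]:
                best = (prob, path)
            return
        for (src, dst), p in EDGES.items():
            if src == node and dst not in path:
                walk(dst, prob * p, path + [dst])
    walk(start, 1.0, [start])
    return best

prob, path = most_likely_path("landed_on_server", "goal_data")
print(f"{' -> '.join(path)}  (p = {prob:.3f})")
# landed_on_server -> local_recon -> known_exploit -> goal_data  (p = 0.180)
```

Rerunning this after adding a countermeasure (lowering an edge probability) shows which path the attacker shifts to next — exactly the repeat loop in steps 4–5.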

Slide 79

Whiteboards + camera snaps (or “DO NOT ERASE!!!!”)
Draw.io, Gliffy (plugs into Confluence)
Google Docs (> insert drawing)
PowerPoint (what I used)
Visio (last resort)

Slide 80

Decision trees help create a feedback loop to refine strategy

Slide 81

Decision trees help with auditing after an incident & easy updating
Serve as a historical record to refine the decision-making process
Mitigate the “doubling down” effect by showing where strategy failed

Slide 82

Defender’s advantage = knowing the home turf
Visualize the hardest path for attackers – how can you force them onto that path?
Commonalities across trees = which strategies mitigate the most risk across various attacks

Slide 83

Make decision trees the new “nice report”

Slide 84

A new request for your pen-testers / red team
The ask: outline which paths they did or didn’t take, and why (a decision tree w/ explanations)
Helps you see the attacker perspective of your defenses & where to improve

Slide 85

Learning Exploitation

Slide 86

Information asymmetry exploitation – disrupt the attacker’s learning process
Learning rate exploitation – introduce unreliability and pre-empt moves

Slide 87

Exploit the fact that you understand the local environment better than attackers

Slide 88

Falsifying Data

Slide 89

Defenders have info adversaries need to intercept
Dynamic envs = frequently in learning phase
Hide or falsify data on the legitimate system side

Slide 90

Macron Case Study

Slide 91

Allegedly used phishing tarpitting
Signed onto phishing pages & planted bogus creds and info
Obvious fakes in dumped documents

Slide 92

#wastehistime2016…but for hackers

Slide 93

Goal is to remove the attacker’s scientific method so they can’t test hypotheses
(Pretend like hashtags are a thing and tweet #wastehackertime2017 with your own ideas)

Slide 94

Create custom email rejection messages
Create a honeydoc on the “Avallach Policy”
Have the response to suspicious emails be, “This violates the Avallach policy”
Track when the doc is accessed (one way to do this is sketched below)
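
One way to track access is a unique “canary” URL embedded in the honeydoc. Below is a hedged minimal sketch using Flask — my choice of framework; the route, token, and log format are all illustrative, and a hosted canarytoken service works just as well:

```python
# Minimal sketch (one of many ways to track honeydoc access): serve a
# hard-to-guess "canary" URL embedded in the fake policy doc (e.g. as a
# remote image) and log every hit. Flask is an illustrative choice.
import logging
from flask import Flask, request

app = Flask(__name__)
logging.basicConfig(filename="canary_hits.log", level=logging.INFO)

# One unique token per honeydoc (values here are made up)
TOKENS = {"7f3c9a": "Avallach Policy honeydoc"}

@app.route("/assets/<token>.png")
def canary(token):
    if token in TOKENS:
        logging.info("CANARY HIT doc=%s from=%s ua=%s",
                     TOKENS[token], request.remote_addr,
                     request.headers.get("User-Agent", "?"))
    # Return an innocuous empty response either way
    return ("", 204)

if __name__ == "__main__":
    app.run(port=8080)
```

A hit on the token tells you both that the doc was exfiltrated and roughly when and from where it was opened.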

Slide 95

General strategy: create honeytokens that appear to describe legitimate policies or technologies that would be useful in attacker recon

Slide 96

Non-Determinism

Slide 97

Different behaviors at different times
Can’t expect the same result every time

Slide 98

ASLR is a non-deterministic feature, but highly deterministic in that it always works the same way
I want to amplify and extend it to higher levels

Slide 99

Raise costs at the very first step of the attack: recon
Make the attacker uncertain of your defensive profile and environment

Slide 100

Attackers now design malware to be VM-aware

Slide 101

Good: make everything look like a malware analyst’s sandbox
Better: make everything look like a different malware analyst’s sandbox each time

Slide 102

Put wolfskins on the sheep

Slide 103

Mix & match hollow but sketchy-looking artifacts on normal, physical systems

Slide 104

RocProtect-v1 – https://github.com/fr0gger/RocProtect-V1
Emulates virtual artifacts on a physical machine (see the Unprotect Project as well)

Slide 105

Processes: VMwareServices.exe, VBoxService.exe, Vmwaretray.exe, VMSrvc.exe, vboxtray.exe, ollydbg.exe, wireshark.exe, fiddler.exe
Sandbox artifacts: \\.\pipe\cuckoo, cuckoomon.dll, dbghelp.dll
MAC address prefixes: “00:0C:29”, “00:1C:14”, “00:50:56”, “00:05:69”

Slide 106

system32\drivers\VBoxGuest.sys
system32\drivers\VBoxMouse.sys
HKLM\SOFTWARE\Oracle\VirtualBox Guest Additions
C:\cuckoo, C:\IDA, Program Files\Vmware
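
As a hedged illustration of planting such “wolfskin” artifacts (file and registry decoys only — RocProtect-v1 goes much further), here is a minimal Python sketch; the artifact paths come from the slides above, while the script itself is my own construction and needs admin rights on a Windows host:

```python
# Illustrative sketch (not RocProtect-v1 itself): plant hollow VM/sandbox
# artifacts from the slides onto a physical Windows host so naive VM-aware
# malware misidentifies it. Requires admin; Windows-only (uses winreg).
import os
import winreg

FAKE_DIRS = [r"C:\cuckoo", r"C:\IDA", r"C:\Program Files\Vmware"]
FAKE_FILES = [
    r"C:\Windows\System32\drivers\VBoxGuest.sys",
    r"C:\Windows\System32\drivers\VBoxMouse.sys",
]
FAKE_KEY = r"SOFTWARE\Oracle\VirtualBox Guest Additions"

def plant():
    for d in FAKE_DIRS:
        os.makedirs(d, exist_ok=True)   # empty decoy directories
    for f in FAKE_FILES:
        with open(f, "ab"):             # zero-byte decoy driver files
            pass
    # Decoy registry key checked by common VM-detection routines
    winreg.CloseKey(winreg.CreateKey(winreg.HKEY_LOCAL_MACHINE, FAKE_KEY))

if __name__ == "__main__":
    plant()
```

Rotating which subset of decoys is present (per the “different sandbox each time” slide) keeps the attacker from fingerprinting your fakes.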

Slide 107

Make the IsDebuggerPresent function call always return non-zero
Create fake versions of driver objects like \\.\NTICE and \\.\SyserDbgMsg
Set KdDebuggerEnabled to 0x03

Slide 108

Load DLLs from AV engines using a Windows loader with a forwarder DLL
ex64.sys (Symantec), McAVSCV.DLL (McAfee), SAUConfigDLL.dll (Sophos), cbk7.sys (Carbon Black), cymemdef.dll (Cylance), CSAgent.sys (Crowdstrike)

Slide 109

Deploy the lightest-weight hypervisor possible for added “wolfskin”
https://github.com/asamy/ksm
https://github.com/ionescu007/SimpleVisor
https://github.com/Bareflank/hypervisor

Slide 110

Minimax

Slide 111

Minimax / maximin = minimize the possible loss in a worst-case (maximum-loss) scenario

Slide 112

Want to find the minimum of the sum of the expected cost of protection and the expected cost of non-protection
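
A minimal numeric sketch of that objective, with all figures invented for illustration: for each candidate protection level, total expected cost = cost of protection + (residual breach probability × expected loss), and we pick the minimum:

```python
# Minimal sketch of the objective above: choose the protection level that
# minimizes expected cost of protection + expected cost of non-protection.
# All figures are illustrative assumptions, not real estimates.

BREACH_LOSS = 1_000_000  # expected loss if an attack succeeds

# protection level -> (annual cost, residual probability of successful attack)
OPTIONS = {
    "none":     (0,       0.30),
    "baseline": (50_000,  0.10),
    "heavy":    (400_000, 0.02),
}

def total_cost(cost: float, p_breach: float) -> float:
    return cost + p_breach * BREACH_LOSS

best = min(OPTIONS, key=lambda k: total_cost(*OPTIONS[k]))
for name, (cost, p) in OPTIONS.items():
    print(f"{name:9s} total expected cost = {total_cost(cost, p):>10,.0f}")
print("choose:", best)  # 'baseline': 150,000 beats 300,000 (none) and 420,000 (heavy)
```

Note how over-protecting loses here too: the point is minimizing total expected cost, not driving breach probability to zero.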

Slide 113

Don’t have a monoculture – diversity is strongly beneficial for protection
Stochastic decisions may be better than deterministic ones
From The Imitation Game: should only act on Enigma info some of the time, not all

Slide 114

Looking Ahead

Slide 115

Fluctuating infrastructure using emerging tech in “Infrastructure 3.0”
Netflix’s Chaos Monkey: https://github.com/Netflix/SimianArmy/wiki/Chaos-Monkey

Slide 116

Modelling attacker cognition via model tracing
Prerequisite: how to begin observing attacker cognition

Slide 117

Preferences change based on experience
Models incorporate the “post-decision state”
The higher the attacker’s learning rate, the easier it is to predict their decisions

Slide 118

ΔU_A = α(R − U_A), where:
▪ U_A = expected utility of an offensive action
▪ α = learning rate
▪ R = feedback (success / failure)

Slide 119

If α = 0.2 and R = 1 for a win, −1 for a loss, then starting from U_A = 0:
▪ ΔU_A = 0.2 × (1 − 0) = 0.2
The attacker is 20% more likely to do this again
From here, you can adjust the learning rate based on data you see

Slide 120

Track utility values for each attacker action
For detected / blocked actions, the attacker’s action & outcome are known variables (so utility is calculable)
Highest “U” = the action the attacker will pursue
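
Here is a hedged sketch of the tracking loop implied by the last three slides: maintain a per-action utility, update it with ΔU_A = α(R − U_A) whenever you observe an outcome, and predict the highest-utility action next. α = 0.2 is the slide’s example value, and the action names are hypothetical:

```python
# Sketch of the model-tracing loop from the preceding slides: track U per
# observed attacker action, update via dU = alpha * (R - U), and predict the
# argmax-utility action. alpha is a guess to refine against observed behavior.
from collections import defaultdict

class AttackerModel:
    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha
        self.utility = defaultdict(float)   # action -> learned utility U_A

    def observe(self, action: str, succeeded: bool):
        """Update after a detected/blocked action whose outcome is known."""
        reward = 1.0 if succeeded else -1.0
        self.utility[action] += self.alpha * (reward - self.utility[action])

    def predict_next(self) -> str:
        """Highest-U action = the one the attacker is most likely to repeat."""
        return max(self.utility, key=self.utility.get)

model = AttackerModel()
model.observe("phish_finance_team", succeeded=True)      # U = 0.2, as on slide 119
model.observe("exploit_vpn_appliance", succeeded=False)  # U = -0.2
print(model.predict_next())  # phish_finance_team
```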

Slide 121

IV. Conclusion

Slide 122

It is no longer time for some Game Theory

Slide 123

In fact, we’ve learned that GT is a language, not even a theory

Slide 124

Start with a SWOT analysis to gain perspective

Slide 125

Use thinking exploitation to improve threat modelling

Slide 126

Use learning exploitation to beleaguer your adversaries

Slide 127

Let’s work together to build strategies based on this behavioral framework

Slide 128

Next step – how to begin model-tracing attackers
After that – predict attacker behavior

Slide 129

Try these at home – make your blue team empirical
Worst case, random strategies beat fixed ones & are just as good as GT

Slide 130

“Good enough is good enough. Good enough always beats perfect.” —Dan Geer

Slide 131

Suggested reading
▪ David Laibson’s Behavioral Game Theory lectures @ Harvard
▪ “Game Theory: A Language of Competition and Cooperation,” Adam Brandenburger
▪ “Advances in Understanding Strategic Behavior,” Camerer, Ho, Chong
▪ “Know Your Enemy: Applying Cognitive Modeling in the Security Domain,” Veksler, Buchler
▪ “Know Your Adversary: Insights for a Better Adversarial Behavioral Model,” Abbasi, et al.
▪ “Deterrence and Risk Preferences in Sequential Attacker–Defender Games with Continuous Efforts,” Payappalli, Zhuang, Jose
▪ “Improving Learning and Adaptation in Security Games by Exploiting Information Asymmetry,” He, Dai, Ning
▪ “Behavioral theories and the neurophysiology of reward,” Schultz
▪ “Evolutionary Security” and “Measuring Security,” Dan Geer

Slide 132

@swagitda_
/in/kellyshortridge
kelly@greywire.net