GDPR Compliance for your Datastore

A presentation at London Web Performance Meetup, May 2020, in London, UK, by Emanuil Tolev

Slide 1

GDPR Compliance for Your Datastore Emanuil Tolev @emanuil_tolev

Slide 2

Also known as “not a GDPR expert”. I’m good at making moussaka and bean soup though. I am an EU citizen and had a small business. @emanuil_tolev

Slide 3

It’s a cultural artefact now @emanuil_tolev Can you recommend a GDPR expert? Yes! Great, can you give me their email address so I can contact them? No.

Slide 4

Joke credit @wardrox https://twitter.com/wardrox/status/988363811479572483 @emanuil_tolev

Slide 5

Questions in chat :) @emanuil_tolev

Slide 6

General Data Protection Regulation Adopted 2016/04/14 Enforceable 2018/05/25 @emanuil_tolev Behind those bulky terms there is an EU regulation that is not a paper tiger. We will dive into where and to whom this applies, what is covered, and how you can work with this regulation. Fines: whichever is greater

Slide 7

Where & Who? EU organizations Services or goods for / monitoring of EU citizens @emanuil_tolev

Slide 8

Fines 2 tiers: up to 10m EUR or 2% of turnover up to 20m EUR or 4% of turnover @emanuil_tolev meaningful fines
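The two tiers combine a fixed amount with a share of turnover, and the cap is whichever of the two is greater. A minimal sketch of that arithmetic follows; the function name `max_fine_eur` and the tier numbering (1 for the lower tier, 2 for the upper) are illustrative assumptions, and the result is only the upper bound, not what a regulator would actually impose.

```python
# Upper bounds of the two GDPR fine tiers: (fixed cap in EUR, percent of turnover).
# The tier numbers 1 and 2 are labels chosen for this sketch.
FINE_TIERS = {
    1: (10_000_000, 2),
    2: (20_000_000, 4),
}


def max_fine_eur(annual_turnover_eur: float, tier: int) -> float:
    """Return the maximum possible fine: the greater of the two caps."""
    fixed_cap, percent = FINE_TIERS[tier]
    turnover_cap = annual_turnover_eur * percent / 100
    return max(fixed_cap, turnover_cap)


if __name__ == "__main__":
    # Small company: the fixed amount dominates.
    print(max_fine_eur(100_000_000, tier=1))
    # Large company: the turnover share dominates.
    print(max_fine_eur(1_000_000_000, tier=2))
```

For a company with 100m EUR turnover the lower tier is capped at the fixed 10m EUR, while at 1bn EUR turnover the upper tier is capped at 40m EUR (4% of turnover).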

Slide 9

What’s up in the UK? Equifax (500k, under the old DPA law), BA (Jul 2019), Marriott (Jul 2019), Crown Prosecution Service. Guess the amount Facebook was fined for Cambridge Analytica! @emanuil_tolev The CPS lost unencrypted DVDs with police recordings. Facebook was fined 500k, because the case fell under the DPA and that was the maximum under the old legislation.

Slide 10

What? Personal Data Any information relating to an identified or identifiable natural person @emanuil_tolev Personal data means any information relating to an identified or identifiable natural person — name, contact details, IP,… There is also sensitive personal data which includes race, sex, trade union membership where your protections should be stricter. So this will include your server logs up to your marketing campaigns. But what are the actual rights natural persons get?

Slide 11

Rights? to be informed access rectification @emanuil_tolev 8 data protection rights Right to be informed: You must tell individuals how and why youʼre collecting and processing their data Right of access: You must let people know how youʼre using their data and allow them to check youʼre doing it legally Right to rectification: If youʼve made a mistake in someoneʼs data, you must correct it

Slide 12

Rights? erasure (to be forgotten) restrict processing data portability @emanuil_tolev Right to erasure: Also known as the right to be forgotten, in some circumstances an individual can request that you delete data about them Right to restrict processing: You can still store the data but an individual can ask you to stop using it Right to data portability: People must be able to get hold of the data you hold on them and then use it elsewhere

Slide 13

Rights? object automatic decision making @emanuil_tolev Right to object: If youʼre using someoneʼs data for marketing or research purposes, they can ask you to stop Rights relating to automatic decision making: This covers automated profiling, machine learning and so on (unless explicitly agreed or required)

Slide 14

Lawful use of data? Informed consent Contractual obligation Legitimate interest @emanuil_tolev 6 ways for lawful use of data Informed consent: The individual explicitly opts in to the precise way you say that youʼll use their data Contractual obligation: You need to use the data in order to deliver a service the person has asked for, or that theyʼve told you theyʼre considering, and youʼre using only the data needed to fulfil that contract Legitimate interest: Perhaps the vaguest of the lawful bases, this allows you to use data if the legitimate interests of your company require it and you can show that this balances with the rights of the individual

Slide 15

Lawful use of data? Legal obligation Vital interests Public task @emanuil_tolev Legal obligation: this allows you to use data where the law requires you to Vital interests: You need to use the data to save someoneʼs life Public task: This applies most to public authorities and allows for the use of personal data if itʼs in the public interest

Slide 16

Proof Required Right to collect and legally use @emanuil_tolev One of the game changers: You need to prove that you are legally using the data. Rights: established when you collect the data. Use: stay within those rights. This applies to every data process that you have.
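One way to keep that proof at hand is a small record per data process that states the lawful basis, the agreed purposes and the fields covered. The sketch below is an assumption about how such a register could look, not a prescribed format; `ProcessingRecord`, `check_use` and the example values are all hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class LawfulBasis(Enum):
    """The six lawful bases named in the talk."""
    CONSENT = "informed consent"
    CONTRACT = "contractual obligation"
    LEGITIMATE_INTEREST = "legitimate interest"
    LEGAL_OBLIGATION = "legal obligation"
    VITAL_INTERESTS = "vital interests"
    PUBLIC_TASK = "public task"


@dataclass(frozen=True)
class ProcessingRecord:
    """Hypothetical register entry for one data process."""
    name: str
    basis: LawfulBasis
    purposes: frozenset  # what the data may be used for
    fields: frozenset    # which personal data fields are covered
    evidence: str        # where the proof lives, e.g. a consent log


def check_use(record: ProcessingRecord, purpose: str, fields: set) -> list:
    """Return a list of problems if a planned use goes beyond the record."""
    problems = []
    if purpose not in record.purposes:
        problems.append(f"purpose not covered: {purpose}")
    extra = set(fields) - record.fields
    if extra:
        problems.append(f"fields not covered: {sorted(extra)}")
    return problems


if __name__ == "__main__":
    newsletter = ProcessingRecord(
        name="newsletter",
        basis=LawfulBasis.CONSENT,
        purposes=frozenset({"send newsletter"}),
        fields=frozenset({"email"}),
        evidence="consent log (illustrative)",
    )
    print(check_use(newsletter, "send newsletter", {"email"}))
    print(check_use(newsletter, "ad targeting", {"email", "ip"}))
```

An empty list means the planned use stays within what was recorded at collection time; anything else is a prompt to stop and check.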

Slide 17

Disclosure Within 72 hours to a member state’s “supervisory body” @emanuil_tolev
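The 72-hour window is easy to lose track of during an incident, so it can help to compute the deadline the moment a breach is noticed. This is a minimal sketch; the function names are illustrative, and it assumes the clock starts when you become aware of the breach and that timestamps are timezone-aware.

```python
from datetime import datetime, timedelta, timezone

# Notification window for reporting a breach to the supervisory body.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest time to notify, counted from when the breach became known."""
    return aware_at + NOTIFICATION_WINDOW


def time_left(aware_at: datetime, now: datetime) -> timedelta:
    """Remaining time until the deadline; negative once it has passed."""
    return notification_deadline(aware_at) - now


if __name__ == "__main__":
    aware_at = datetime(2020, 5, 1, 9, 0, tzinfo=timezone.utc)
    now = datetime(2020, 5, 2, 9, 0, tzinfo=timezone.utc)
    print(notification_deadline(aware_at).isoformat())
    print(time_left(aware_at, now))
```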

Slide 18

Legacy Data Stop, Check, Delete @emanuil_tolev If you find you have data that was collected in a way that doesnʼt comply with the GDPR, destroy it. Similarly, if youʼre using data in a noncompliant way, stop doing so.

Slide 19

What if no legal grounds? @emanuil_tolev Somebody just visits your site. How do you collect any information from them? They didn’t even have a chance to give you their consent, but you also don’t want to burn your monitoring to the ground and be blind.

Slide 20

  1. Stop Your Service @emanuil_tolev One solution is locking out EU users, which isn’t really a good solution and won’t work for European companies.

Slide 21

Can be a site just for reading or an entire service

Slide 22

unroll.me is doing this for example

Slide 23

  2. Drown them in forms @emanuil_tolev You could make everybody consent before doing anything on your site. But is this really a great idea? The cookie permissions are already a major annoyance. Who is always clicking “accept” when something pops up? IMO this is a training problem and important stuff gets lost in all the irrelevant consent.

Slide 24

https://twitter.com/rianjohnson/status/999730569641525248 This might then look like this: Before you can start the film / website, you need to go through this. And you would actually watch this one

Slide 25

  3. Pseudonymization @emanuil_tolev

Slide 26

Anonymous No information that could potentially identify an individual Not considered Personal Data by GDPR @emanuil_tolev

Slide 27

Pseudonymous Re-identification possible if combined with additional information Without this information, reidentification practically impossible @emanuil_tolev

Slide 28

When? Ingestion time Search time @emanuil_tolev When do you change your data? Let’s assume we want to do it at ingestion time, because it saves us a lot of hassle later on

Slide 29

Slide 30

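# Hash the IP with a secret key taken from the environment.
fingerprint {
  method => "SHA256"
  source => ["ip"]
  key => "${FINGERPRINT_KEY}"
}
# Keep the pseudonym-to-IP mapping in a separate identities field.
mutate {
  add_field => {
    '[identities][0][key]' => "%{fingerprint}"
    '[identities][0][value]' => "%{ip}"
  }
}
# Overwrite the original IP with its fingerprint.
mutate {
  replace => { "ip" => "%{fingerprint}" }
}
@emanuil_tolev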
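The same ingestion-time idea can be sketched outside Logstash: replace the IP with a keyed hash and keep the mapping back to the real value as a separate identity record. This is a rough Python illustration under stated assumptions, not a port of the pipeline above; `fingerprint`, `pseudonymize` and the placeholder key are hypothetical, and in practice the key would come from a secret store.

```python
import hashlib
import hmac


def fingerprint(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 (HMAC) hex digest of the value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize(event: dict, key: bytes, field: str = "ip") -> tuple[dict, dict]:
    """Split an event into a pseudonymized copy and an identity record.

    The identity record is the additional information that makes
    re-identification possible, so it belongs in a separate,
    access-controlled store.
    """
    pseudonym = fingerprint(event[field], key)
    identity = {"key": pseudonym, "value": event[field]}
    safe_event = {**event, field: pseudonym}
    return safe_event, identity


if __name__ == "__main__":
    secret = b"placeholder-key"  # assumption: load the real key from a secret store
    event = {"ip": "203.0.113.7", "path": "/pricing"}
    safe_event, identity = pseudonymize(event, secret)
    print(safe_event)
    print(identity)
```

Because the hash is keyed, the same IP always maps to the same pseudonym, so counting and grouping still work, while someone without the key or the identity record cannot simply hash candidate IPs to reverse it.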

Slide 31

The service can even enrich data with other known records. This does not offer enough protection for pseudonymization (in my opinion). You need to implement this properly.

Slide 32

Access Control & Encryption @emanuil_tolev

Slide 33

Slide 34

Deletion @emanuil_tolev

Slide 35

“Interesting #GDPR solution for the “right to erasure” : Encrypt all user’s data and when you have to delete it you just get rid of the private key. Will this become the norm?” https://twitter.com/Stephan007/status/985103374118014976 @emanuil_tolev One of the more clever approaches for personal data.
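The approach in the quoted tweet can be sketched in a few lines: every user gets their own key, only ciphertext is persisted, and erasure means throwing the key away. Everything here is a hypothetical illustration (`UserVault`, `store`, `read`, `forget`). In particular, the XOR keystream is a toy stand-in so the sketch runs with the standard library only; a real system would use a vetted authenticated cipher and keep keys in a separate key-management service.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


class UserVault:
    """Persists only ciphertext; each user has their own key."""

    def __init__(self) -> None:
        self._keys = {}     # user_id -> key (would live in a key service)
        self._records = {}  # user_id -> list of (nonce, ciphertext)

    def store(self, user_id: str, plaintext: bytes) -> None:
        key = self._keys.setdefault(user_id, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)
        ciphertext = _xor(plaintext, _keystream(key, nonce, len(plaintext)))
        self._records.setdefault(user_id, []).append((nonce, ciphertext))

    def read(self, user_id: str) -> list[bytes]:
        key = self._keys[user_id]  # raises KeyError once the key is gone
        return [
            _xor(ct, _keystream(key, nonce, len(ct)))
            for nonce, ct in self._records.get(user_id, [])
        ]

    def forget(self, user_id: str) -> None:
        """Erase by deleting the key; leftover ciphertext becomes unreadable."""
        self._keys.pop(user_id, None)


if __name__ == "__main__":
    vault = UserVault()
    vault.store("alice", b"alice@example.com")
    print(vault.read("alice"))
    vault.forget("alice")
```

The appeal of this design is that copies of the ciphertext in backups or downstream stores do not need to be hunted down individually, as long as the key really is gone everywhere.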

Slide 36

“[…] personal data of our users can only be persisted when it is encrypted. Each user has their own set of keys […] it reduces the impact of leaking a dataset, since the dataset by itself is useless — attackers also need the decryption keys. […] it allows us to control the lifecycle of data for individual users centrally.” https://labs.spotify.com/2018/09/18/scalable-user-privacy/ @emanuil_tolev This is exactly what Spotify is doing. Though this is more of an application feature, so we are not covering it in detail. It helped keep their microservice architecture simple, since deleting data everywhere becomes a major hassle otherwise. Another option they considered was a central datastore, with everything else basically only caching data. Though with various access patterns (email or profile picture) this was deemed too complicated. The article goes into a lot of detail about Padlock, a global key-management system.

Slide 37

Conclusion @emanuil_tolev

Slide 38

Data Protection The new standard and norm of approaching personal data @emanuil_tolev Even if it sounds difficult for some, this is by design the new standard and way to approach personal data. It’s not an afterthought any more

Slide 39

Special category: racial, ethnic, religious, political, biometric,…

Slide 40

I am not a lawyer @emanuil_tolev I am not a lawyer, sorry.

Slide 41

As a dev agency / consultancy @emanuil_tolev Generally we determined clients were data controllers and we were data processors. But when we wanted to run a SaaS service we became data controllers, even though in practice our (university) clients told us their requirements. I’d err on the side of more responsibility.

Slide 42

Heather Burns https://www.smashingmagazine.com/2018/02/gdpr-for-web-developers/ @emanuil_tolev She’s far better than me and I only read this post today. You should read it.

Slide 43

❤ GDPR and carry on @emanuil_tolev Regulations are everywhere, so don’t panic. Even a coffee cart comes with legal implications: food safety laws, commercial operation laws, municipal laws, administrative laws, employment law,… Generally: Do the right thing and you will be fine

Slide 44

@emanuil_tolev And don’t handle it like zoom.us — yes or yes is not an appropriate way to do this.

Slide 45

Why care? Stick Carrot Godwin @emanuil_tolev I want to be a good person and an upstanding citizen. Why is that so boring sometimes?! Well, it’s all about framing. We all get the stuff about being fined. But why should you spend your limited brain cycles on this? Europe is densely populated and we cannot help but stick our noses in each other’s business. It’s kinda silly to think your local hair salon could expose your email address in a breach, but data protection law comes from a long and sombre line of privacy violations and data gathering in Europe. You’ve probably all heard of the use of census data by the Nazi regime in Germany in the late 1930s. It was processed for storage earlier by IBM, who almost certainly didn’t intend it to be used as it was. Personal opinion time! Sometimes, to make a business decision we must invoke emotion as well as fact. If you want to invoke something other than boredom when thinking of data protection, then invoke a sense of duty towards people’s privacy and build businesses and systems which respect that privacy. Only collect the data you need. Question whether you need pieces of it at the product design stage. Only use data as you need it. Only store data as long as you need it. If you collected something earlier and want to retire the functionality that uses it, drop that data, or archive it with encryption far from your live systems. This, in my opinion as a web dev, is the spirit of the GDPR.
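The "only store data as long as you need it" advice can be turned into a routine job that separates records still within their retention period from those that are due for deletion or encrypted archiving. This is a minimal sketch; the record shape (a dict with a `collected_at` timestamp), the function name `split_expired` and the 90-day period are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def split_expired(records: list, retention: timedelta, now: datetime) -> tuple:
    """Split records into (keep, expired) based on their collection time."""
    keep, expired = [], []
    for record in records:
        age = now - record["collected_at"]
        (expired if age > retention else keep).append(record)
    return keep, expired


if __name__ == "__main__":
    now = datetime(2020, 5, 26, tzinfo=timezone.utc)
    records = [
        {"id": 1, "collected_at": datetime(2020, 5, 1, tzinfo=timezone.utc)},
        {"id": 2, "collected_at": datetime(2019, 1, 1, tzinfo=timezone.utc)},
    ]
    keep, expired = split_expired(records, timedelta(days=90), now)
    print([r["id"] for r in keep])
    print([r["id"] for r in expired])
```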

Slide 46

Questions? Emanuil Tolev @emanuil_tolev