Three Core Challenges in Data Encryption
Part 2 of 4 in the series "The Road to Privacy in Personal AI".
When talking about confidential and privacy-preserving technologies, a key principle is data encryption. Using cryptography, we can encrypt data with secret keys, and only the parties that hold the key can decrypt the data.
When working with encrypted data, there are three primary challenges to cover:
Encryption at rest - Securing stored data
Encryption in transit - Securing data on the move
Encryption in use - Working on your data securely
1. Encryption at rest - Securing stored data
The term “at rest” in data encryption means that the data is neither being moved nor in use; it is simply stored in some form, for example on a disk. Encrypting data at rest is relatively straightforward with a good encryption scheme, meaning an algorithm that is widely recognized and vetted for its ability to secure data effectively. AES (Advanced Encryption Standard) is the usual choice for encrypting the data itself, while an asymmetric algorithm such as RSA (Rivest-Shamir-Adleman) is typically used to protect the encryption keys rather than the bulk data.
An example of stored data could be your patient records stored on a hospital server. This data must be encrypted to prevent unauthorized access.
Luckily, this is largely a solved problem, so we won’t dig deeper into it.
2. Encryption in transit - Securing data on the move
Sharing a document over the cloud, doing an online credit card transaction, or submitting a contact form are all examples of how data is moved from one point to another.
When transferring data between parties, we want to make sure that only the parties involved can read the data, for example when you make that credit card transaction online.
Many protocols exist to solve this problem, including:
HTTPS (HyperText Transfer Protocol Secure): HTTP running over TLS, safeguarding the integrity and confidentiality of data between the user's computer and the site.
SSL (Secure Sockets Layer) and TLS (Transport Layer Security): Protocols that secure connections. TLS is the successor to SSL, which is now deprecated and should no longer be used.
SSH (Secure Shell): Ensures secure remote login and other network services.
IPsec (Internet Protocol Security): Protects data in transit via a framework of protocols and encryption standards.
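To make the protocol list a little more concrete, here is a minimal sketch of how a client might set up a TLS connection using Python's standard `ssl` module. The helper names `make_client_context` and `negotiated_tls_version` are made up for this illustration, and the choice of TLS 1.2 as the minimum version is an assumption, not a recommendation taken from this article.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a TLS client context with certificate and hostname checks enabled."""
    # create_default_context() turns on certificate verification and
    # hostname checking, and loads the system's trusted CA certificates.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse SSL and early TLS versions.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_tls_version(host: str, port: int = 443):
    """Connect to host and return the agreed TLS version (needs network access)."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=5) as raw_sock:
        # The handshake happens inside wrap_socket; it raises an
        # ssl.SSLCertVerificationError if the certificate cannot be trusted.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()


# No network needed for this part: inspect the safety checks on the context.
context = make_client_context()
print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
print("hostname checked:", context.check_hostname)
```

Calling `negotiated_tls_version("example.com")` on a machine with internet access would typically return a string such as `TLSv1.3`, depending on what the server supports.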
A notable hero in this narrative, however, is asymmetric encryption (also known as “public key cryptography”), which has given us many good solutions such as digital signatures and the ability to securely exchange emails. Even blockchain technologies like Bitcoin and Ethereum rely on it, using digital signatures to create and verify transactions.
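Asymmetric encryption uses pairs of private and public keys. Data encrypted with a public key can only be decrypted with the matching private key, and a signature created with a private key can be verified by anyone who holds the public key. The “Public Key Infrastructure” (PKI) plays the role of a trusted third party: its certificate authorities vouch for the identities involved in the data exchange.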
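The idea of a key pair can be sketched with textbook RSA in a few lines of plain Python. This is a toy under loud assumptions: the primes are tiny, there is no padding, and the function names are invented for this example. Real systems use vetted libraries and keys of 2048 bits or more.

```python
import hashlib

# Toy key pair built from textbook-sized primes (trivially breakable).
P, Q = 61, 53
N = P * Q                      # public modulus, shared by both keys
PHI = (P - 1) * (Q - 1)
E = 17                         # public exponent: (E, N) is the public key
D = pow(E, -1, PHI)            # private exponent: (D, N) is the private key


def encrypt(message: int) -> int:
    """Anyone can encrypt with the public key."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key can decrypt."""
    return pow(ciphertext, D, N)


def digest(message: bytes) -> int:
    """Hash the message and reduce it so it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sign with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone can check a signature with the public key."""
    return pow(signature, E, N) == digest(message)


secret = 65
assert decrypt(encrypt(secret)) == secret

order = b"pay Alice 5 coins"
signature = sign(order)
print("signature valid:", verify(order, signature))
print("forged signature valid:", verify(order, (signature + 1) % N))
```

The same pattern, with far larger keys and proper padding or elliptic curves, is what sits behind signed emails and blockchain transactions.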
3. Encryption in use - Working on your data securely
This is the active battlefield in the digital world and this is where you should worry.
Your data is constantly being processed by services, for example when you get personalized recommendations to improve your sleep, find an optimal travel route to your destination, or scroll through your Instagram feed.
These computations are done to tailor experiences to you, but so far it has not been practical to run them on encrypted data.
Traditionally, computations on data require the data in cleartext (unencrypted). This means that in (almost) all scenarios, your data gets decrypted before any processing. Thus, the entity processing your data, for example Instagram, becomes a party you must trust.
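To get a feel for what computing on encrypted data would even look like, here is a toy sketch of the Paillier scheme, an additively homomorphic cryptosystem. It is not something this article refers to; it is picked here as an assumption because it fits in a few lines. It supports only addition and uses tiny, insecure parameters, which hints at why general-purpose services still decrypt your data first.

```python
import math
import secrets

# Toy Paillier key (tiny primes, for illustration only).
P, Q = 293, 433
N = P * Q                      # public key
N2 = N * N
G = N + 1                      # standard simple choice of generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # private: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)           # private: inverse of LAM modulo N


def encrypt(message: int) -> int:
    """Encrypt an integer 0 <= message < N with fresh randomness."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, message, N2) * pow(r, N, N2)) % N2


def decrypt(ciphertext: int) -> int:
    """Recover the plaintext using the private values LAM and MU."""
    x = pow(ciphertext, LAM, N2)
    return ((x - 1) // N) * MU % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the hidden plaintexts (modulo N)."""
    return (c1 * c2) % N2


# A service could add these two values without ever seeing them.
hours_slept_monday = encrypt(7)
hours_slept_tuesday = encrypt(6)
encrypted_total = add_encrypted(hours_slept_monday, hours_slept_tuesday)

print("decrypted total:", decrypt(encrypted_total))
```

Only the key holder can read the total, yet the party doing the addition never needed the cleartext. Getting from this narrow trick to the rich processing that today's services perform is the hard part.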
Who do you trust with your data?
It’s an age-old dilemma of having to choose between utility and security.
In other words, you have to trust the party doing computation on your data. Now, data is not very useful if we don’t do anything with it, so we have accepted the tradeoff.
You share your entire resume, profile information, and ideas with LinkedIn so they can help you connect, share, and discover.
We now enter a delicate dance in this relationship:
You somehow try to minimize the data you share in order to get what you need. No reason to write about your family picnic on LinkedIn, right?
LinkedIn, in its endeavors to earn your trust, does its best to create written legal policies and similar assurances so that you will be more comfortable sharing your data with them.
The more data you provide to LinkedIn, the more personalized an experience they will give you. They will keep trying to motivate you to share more information with them, for example through incentives such as better visibility to recruiters in their search results. This often leads to you over-sharing. Maybe sharing that family picnic in a post to show you’re a family man and a good colleague?
Regulators have entered the scene, trying to regulate this relationship and give more rights to you, the user. It’s a step in the right direction, but the way most companies have dealt with GDPR and CCPA shows that what companies say to their users and the public is one thing, and what they actually do internally is another.
The stark reality? There's always someone in these organizations who can access your data. You can always find written proof in the company's “Privacy policy”, as we found on LinkedIn:
“It is possible that we will need to disclose information about you when required by law, subpoena, or other legal process or if we have a good faith belief that disclosure is reasonably necessary”
LinkedIn is no exception here; most other services you interact with today work the same way.
Think back to the future where everyone will have personal AI companions filled with personal information. Imagine 10 years of your personal data, your behaviour, and your deepest thoughts hosted by a company with the same lack of “encryption in use” and the same privacy policy as LinkedIn, a company that can invoke the sentence “if we have a good faith belief that disclosure is reasonably necessary”.
Ouch…
Your personal data will be the gold of the future, and mastering the art and science of encryption in use is crucial to safeguarding your right to privacy and building global trust.