Three Core Challenges in Data Encryption
Part 2 of 4 in the series "The Road to Privacy in Personal AI".
When talking about confidential and privacy-preserving technologies, a key principle is data encryption. Using well-vetted cryptographic algorithms, we can encrypt sensitive data with cryptographic keys so that only the parties holding the decryption key can read it. This protects confidentiality against unauthorized access. Combined with authentication, for example an authenticated encryption mode or a digital signature, it also protects data integrity.
When working with encrypted data, there are three primary challenges to cover:
Encryption at rest - Securing stored data
Encryption in transit - Securing data on the move
Encryption in use - Working on your data securely
1. Encryption at rest - Securing stored data
The term "at rest" means that the data is neither being moved nor being used; it is stored in some form, for example on a disk. Implementing encryption for data at rest is relatively straightforward with a good encryption standard, that is, an algorithm that is widely recognized and vetted for its ability to secure data. The most common choice is AES (Advanced Encryption Standard), a symmetric cipher suited to bulk data. RSA (Rivest-Shamir-Adleman), an asymmetric algorithm, is typically used alongside it to protect the AES keys rather than the stored data itself. Together they serve as robust security measures against data breaches.
An example of sensitive information could be your patient records stored on a healthcare server. This data must be encrypted and protected by proper access controls to prevent unauthorized access and cyber threats.
Luckily, given proper key management and a sound encryption strategy, this is largely a solved problem, so we won't dig deeper into it.
2. Encryption in transit - Securing data on the move
Sharing a document over the cloud, processing credit card transactions, and submitting a contact form are all examples of data in transit: data moving from one point to another.
When transferring sensitive data between parties, we want to ensure data security so that only the authorized parties involved can read the data, especially for customer data like credit card information.
Many protocols exist to solve this problem, including:
HTTPS (HyperText Transfer Protocol Secure): HTTP layered over TLS, safeguarding the integrity and confidentiality of data between the user's computer and the site.
SSL (Secure Sockets Layer) and TLS (Transport Layer Security): These protocols secure connections. TLS is the successor to SSL; every SSL version is now deprecated as insecure, so current systems should use TLS 1.2 or 1.3.
SSH (Secure Shell): Ensures secure remote login and other network services.
IPsec (Internet Protocol Security): Protects data in transit via a framework of protocols and encryption standards.
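To make the protocol list a little more concrete, here is a minimal sketch of how a client might set up TLS using Python's standard `ssl` module. The helper names `make_client_context` and `open_tls_connection` are our own, chosen for illustration, and the choice of TLS 1.2 as the minimum version is an assumption, not a requirement from any particular service.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context (hypothetical helper for illustration)."""
    # The default context loads the system's trusted CA certificates and
    # turns on certificate verification and hostname checking.
    context = ssl.create_default_context()
    # Assumption: refuse anything older than TLS 1.2, which also rules out SSL.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection and wrap it in TLS (not called in this sketch)."""
    raw_socket = socket.create_connection((host, port))
    # server_hostname is used for SNI and for checking the server certificate.
    return make_client_context().wrap_socket(raw_socket, server_hostname=host)


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

The point of the sketch is that the secure defaults do most of the work: the server must present a certificate that chains to a trusted authority and matches the hostname, otherwise the handshake fails before any data is sent.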
A notable hero in this narrative, however, is asymmetric encryption, also known as public-key cryptography, which has given us many good solutions such as digital signatures and authentication. Even blockchain technologies like Bitcoin and Ethereum rely on public and private key pairs: a transaction is signed with a private key, and anyone can verify it with the matching public key.
Asymmetric encryption uses a key pair: anyone can turn plaintext into ciphertext with the public key, but only the holder of the matching private key can decrypt it. A Public Key Infrastructure (PKI) adds the trusted third party, with certificate authorities vouching for the identities behind the public keys involved in the data exchange.
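The key-pair idea can be sketched with textbook RSA in a few lines of Python. This is a toy under loud assumptions: the primes are tiny, there is no padding, and the function names are our own. It is here only to show which key does what, and real systems should use a vetted library instead. It needs Python 3.8 or newer for the modular inverse via `pow`.

```python
# Toy RSA key pair built from tiny primes. Illustration only: real keys use
# primes that are over a thousand bits long plus padding schemes such as
# OAEP (encryption) or PSS (signatures).
P, Q = 61, 53
N = P * Q                     # modulus shared by both keys
PHI = (P - 1) * (Q - 1)       # Euler's totient of N
E = 17                        # public exponent, coprime with PHI
D = pow(E, -1, PHI)           # private exponent: modular inverse of E

PUBLIC_KEY = (E, N)           # safe to publish
PRIVATE_KEY = (D, N)          # must stay secret


def encrypt(message: int, public_key: tuple) -> int:
    """Anyone holding the public key can encrypt."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key: tuple) -> int:
    """Only the private key holder can decrypt."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


def sign(digest: int, private_key: tuple) -> int:
    """Signing applies the private key to a (toy) message digest."""
    exponent, modulus = private_key
    return pow(digest, exponent, modulus)


def verify(digest: int, signature: int, public_key: tuple) -> bool:
    """Anyone can check a signature with the public key."""
    exponent, modulus = public_key
    return pow(signature, exponent, modulus) == digest


ciphertext = encrypt(65, PUBLIC_KEY)
print(decrypt(ciphertext, PRIVATE_KEY) == 65)

signature = sign(1234, PRIVATE_KEY)   # 1234 stands in for a message digest
print(verify(1234, signature, PUBLIC_KEY))
print(verify(1235, signature, PUBLIC_KEY))
```

Encryption and signing are the same arithmetic run with opposite keys, which is why one key pair can provide both confidentiality and proof of origin.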
3. Encryption in use - Working on your data securely
This is the active battlefield of the digital world, and this is where you should worry.
Your sensitive data is constantly being processed by services, for example when you get personalized recommendations to improve your sleep, find an optimal travel route to your destination, or scroll through your Instagram feed.
These computations tailor experiences to you, but so far it has not been practical for mainstream services to compute on encrypted data while maintaining data privacy.
Traditionally, computations on data require the data in plaintext (unencrypted). This means that in (almost) all scenarios, your data gets decrypted before any processing. Thus the entity processing your data, for example Instagram, becomes a party you must trust with your data security.
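The decrypt-then-process pattern described above can be sketched as follows. Everything here is hypothetical: the sleep data is made up, the function names are ours, and a one-time pad (XOR with a random key of the same length) stands in for whatever cipher a real service would use. The only point is where the plaintext appears.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """One-time pad: XOR each data byte with the matching key byte."""
    return bytes(a ^ b for a, b in zip(data, key))


# Hypothetical sample: hours slept on seven nights, one byte per night.
sleep_hours = bytes([7, 6, 8, 5, 7, 9, 6])

# The pad is random, as long as the data, and must never be reused.
key = secrets.token_bytes(len(sleep_hours))
ciphertext = xor_bytes(sleep_hours, key)


def average_hours(service_key: bytes, encrypted: bytes) -> float:
    """The service decrypts first, then computes on the plaintext."""
    plaintext = xor_bytes(encrypted, service_key)  # data is now exposed
    return sum(plaintext) / len(plaintext)


# With the key, the service gets a useful answer...
print(average_hours(key, ciphertext))
# ...but averaging the raw ciphertext bytes gives a meaningless number.
print(sum(ciphertext) / len(ciphertext))
```

To do anything useful, the service needs the key, and once it has the key it can read everything. That is the trust problem in miniature.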
Who do you trust with your data?
It's the age-old dilemma of having to choose between utility and security.
In other words, you have to trust the party doing computation on your data. Now, data is not very useful if we don't do anything with it, so we have accepted the tradeoff.
You share your entire resume, profile information, and ideas with LinkedIn so they can help you connect, share, and discover.
We now enter a delicate dance in this relationship:
You somehow try to minimize the data you share in order to get what you need. No reason to write about your family picnic on LinkedIn, right?
LinkedIn, in its endeavors to earn your trust, does its best to create written legal policies, including GDPR compliance, so you will be more comfortable with sharing your data with them.
The more data you give LinkedIn, the more personalized the experience it gives you. It will keep trying to motivate you (e.g. through incentives such as better visibility in its search results for job recruiters) to share more information. This often leads to over-sharing. Maybe sharing that family picnic in a post to show you're a family man and a good colleague?
Regulators have entered the scene, trying to regulate this relationship and give more rights to you, the user. It's a step in the right direction, but experience with how most companies have handled GDPR and PCI DSS shows that what companies tell their users and the public is one thing, and what they actually do internally is another.
The stark reality? There's always someone in these organizations who can access your data. The written proof is usually right there in the company's privacy policy, as we found with LinkedIn:
“It is possible that we will need to disclose information about you when required by law, subpoena, or other legal process or if we have a good faith belief that disclosure is reasonably necessary”
Our LinkedIn example is no different from most other services you interact with today.
Think back to the future where everyone will have personal AI companions filled with personal information. Imagine that 10 years of your personal data, your behaviour, and your deepest thoughts are hosted by a company with the same lack of encryption in use and the same privacy policy as LinkedIn, and that the company invokes this clause: "if we have a good faith belief that disclosure is reasonably necessary".
Ouch.
Your personal data will be the gold of the future. Mastering the art and science of encryption in use, through advanced encryption solutions and proper data protection, is crucial to safeguarding your right to privacy and to building global trust.