I'm dealing with some sensitive data that I need to process using a language model (LLM). After processing, I want to encrypt this data before storing it in a bucket. It's important that I don't use the default Key Management Service (KMS) for the encryption. Additionally, I need to ensure the data can be safely decrypted client-side, ideally using something like WebCrypto. The main concern here is that I want to prevent this sensitive information from being exposed to the cloud infrastructure at any point. Can anyone validate my approach or provide suggestions?
5 Answers
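If you're not hosting your own LLM, that could be a significant security risk: the provider receives your data in plaintext, no matter how you encrypt it afterwards. Make sure you assess the security and data-retention terms of the LLM you're using.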
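Using a customer managed key (CMK) in KMS for your S3 objects is a solid approach if controlling the key policy is enough for you. You decide who can use the key and you get an audit trail, and the data is encrypted at rest. Keep in mind that this is still server-side encryption: S3 encrypts and decrypts on your behalf, so it doesn't keep the plaintext away from AWS.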
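Consider implementing application-layer (layer 7) encryption right within your application: encrypt the data yourself before uploading it, so the bucket only ever stores ciphertext. This gives you complete control over the keys and may suit your needs if you're wary about trusting AWS.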
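To make the application-layer option concrete, here is a minimal sketch of an AES-256-GCM round trip using WebCrypto. It assumes a runtime that exposes the global `crypto.subtle` (current browsers, or a recent Node.js release). The helper names `generateKey`, `encryptText`, and `decryptText` are made up for illustration and aren't part of any SDK. How you get the key to the decrypting client is a separate problem this doesn't cover.

```typescript
// Sketch only: AES-256-GCM via the WebCrypto API.
// Helper names are hypothetical; key distribution is out of scope.

// Generate a 256-bit key locally; it is never sent to the storage provider.
async function generateKey(): Promise<CryptoKey> {
  return crypto.subtle.generateKey(
    { name: "AES-GCM", length: 256 },
    true, // extractable, so it can be exported for the decrypting client
    ["encrypt", "decrypt"],
  );
}

// Encrypt text. Output layout: 12-byte IV, then ciphertext with the GCM tag.
async function encryptText(key: CryptoKey, plaintext: string): Promise<Uint8Array> {
  const iv = crypto.getRandomValues(new Uint8Array(12)); // fresh IV per object
  const data = new TextEncoder().encode(plaintext);
  const ciphertext = await crypto.subtle.encrypt({ name: "AES-GCM", iv }, key, data);
  const blob = new Uint8Array(iv.length + ciphertext.byteLength);
  blob.set(iv, 0);
  blob.set(new Uint8Array(ciphertext), iv.length);
  return blob; // this is what gets uploaded to the bucket
}

// Decrypt a blob produced by encryptText. Rejects if the blob was modified.
async function decryptText(key: CryptoKey, blob: Uint8Array): Promise<string> {
  const iv = blob.slice(0, 12);
  const ciphertext = blob.slice(12);
  const plaintext = await crypto.subtle.decrypt({ name: "AES-GCM", iv }, key, ciphertext);
  return new TextDecoder().decode(plaintext);
}
```

You'd call `encryptText` on the LLM output before the upload and `decryptText` in the browser after the download. Because GCM is authenticated, a tampered object fails to decrypt instead of returning garbage.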
If you're comfortable trusting AWS, KMS encryption with a CMK gives you control over who can use the key, but S3 still encrypts and decrypts on the server side, so the plaintext is handled inside AWS. Server-Side Encryption with Customer-Provided Keys (SSE-C) is also server-side: you send the key with each request, and S3 uses it in memory without storing it, so AWS still sees the key briefly. If AWS must never have access to the encryption keys, use client-side encryption: encrypt the data before uploading it, with a key you manage yourself.
One thing to consider: how do you plan to ensure that the data remains encrypted while it's being processed in memory by the LLM? That's usually a tricky spot.
Encrypting data at rest and in transit is usually straightforward, but keeping it secure while it's actively being processed can be more challenging.
Great point! I’m definitely considering how to handle that aspect, as I want to make sure the data is protected throughout its lifecycle.