There’s one more option I’ve recently discovered that might be a new spot on the spectrum of tradeoffs. GPU providers like modal.com let you spin up GPU environments and only pay for the seconds or minutes they’re running. So you could create a container that boots up with a freshly generated key; incoming messages get decrypted, run through the model, and the outputs get encrypted and sent back to the user, so the plaintext content never leaves RAM or VRAM. The cloud provider should theoretically have a harder time spying on you, even though they still could. And because this is on demand, you can use big open source models.
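To make the flow concrete, here’s a minimal stdlib-only sketch of the idea: the container generates an ephemeral key at boot, decrypts the user’s message, runs it through the model, and re-encrypts the reply before it leaves. The XOR/SHA-256 "cipher" and the `run_model` stub are toy stand-ins of my own for illustration; a real deployment would use an AEAD cipher (e.g. AES-GCM) with an authenticated key exchange, and an actual inference call.

```python
# Toy sketch of the ephemeral-container flow (NOT real crypto:
# the XOR keystream here is only a demo; use an AEAD + key
# exchange in practice).
import os
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    # Expand the ephemeral key into a keystream via SHA-256 in counter mode.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_crypt(data: bytes, key: bytes) -> bytes:
    # Symmetric: the same call encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def run_model(prompt: bytes) -> bytes:
    # Hypothetical placeholder for the actual LLM call in the container.
    return b"echo: " + prompt


# 1. Container boots and generates a fresh key that never leaves it.
ephemeral_key = os.urandom(32)

# 2. User encrypts a message under that key (in practice negotiated
#    via a key exchange; reused directly here for the demo).
ciphertext = xor_crypt(b"my private prompt", ephemeral_key)

# 3. Inside the container: decrypt, run the model, re-encrypt.
plaintext = xor_crypt(ciphertext, ephemeral_key)
encrypted_reply = xor_crypt(run_model(plaintext), ephemeral_key)

# 4. User decrypts the reply; plaintext only ever existed inside
#    the container's RAM/VRAM.
print(xor_crypt(encrypted_reply, ephemeral_key).decode())
```

The point of the sketch is just the lifecycle: the key is born and dies with the container, so spying would require snapshotting the live instance rather than reading anything at rest.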
