**RT @garrytan:**
Cedana (YC S23) lets you save, load, and migrate any compute workload. Meta and Alphabet have had this capability internally for years; now anyone has access to it.
For AI inference this is a godsend: Cedana enables ChatGPT-fast latency while boosting GPU utilization 5x.
https://www.ycombinator.com/launches/JAP-cedana-real-time-compute-migration
https://nitter.moomoo.me/garrytan/status/1687199533275725824#m