It's paid per call and limited to the models they offer, but depending on your use case this might do the job. Renting GPUs is extremely expensive, AFAIK.
I run GPU stuff mostly on a workstation.
If you want to run existing open source models, https://replicate.com/ worked well when I tested it.
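If it helps, here's a minimal sketch of calling a hosted model through their Python client. The model identifier and input below are placeholders, not a specific recommendation; grab the real "owner/name:version" string from the model's page.

```python
# pip install replicate
# Needs REPLICATE_API_TOKEN set in the environment.
import replicate

# Placeholder model identifier; replace with an actual model from replicate.com.
output = replicate.run(
    "owner/some-model:version-hash",
    input={"prompt": "example input"},
)
print(output)
```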
Yeah, it is really expensive to rent a full GPU server 😅
Do you fully utilize your workstation's GPU memory, or do you still have some left over?
Thanks for the Replicate suggestion. Yes, the price is quite cheap if requests are processed quickly. But it seems I would need a "private model" set up on their service.
Based on your experience, do they bill purely by how long each request takes to process? Let's say we send a request every second but each request only needs 100 ms of processing; do they bill only for those 100 ms?
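To put rough numbers on why that distinction matters (the $/s price here is made up, just to show the gap):

```python
# Back-of-the-envelope comparison: billing by GPU processing time vs. wall clock.
price_per_gpu_second = 0.0007          # hypothetical $/s for the GPU
requests_per_day = 24 * 60 * 60        # one request per second, all day
processing_seconds_per_request = 0.1   # each request needs ~100 ms of GPU time

cost_processing_only = requests_per_day * processing_seconds_per_request * price_per_gpu_second
cost_wall_clock = requests_per_day * 1.0 * price_per_gpu_second

print(f"billed on processing time only: ${cost_processing_only:.2f}/day")   # ~$6/day
print(f"billed on wall-clock time:      ${cost_wall_clock:.2f}/day")        # ~$60/day
```

If they really only count the 100 ms of processing, that's roughly a 10x difference for this kind of workload.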