yeah, i just looked at the cost of VPS+GPU and it's pretty nuts. i know roughly how much performance you are gonna get out of one 16GB GPU, and it's probably enough to answer like 5-6 queries per minute. that's only worth it if you can find a way to make more than $1000 a month from the service while paying $350/month for the box. otherwise, no.
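to put rough numbers on that, here's a quick back-of-envelope sketch. the $350/month, $1000/month and 5-6 queries per minute are the figures from above; the 30-day month and the `utilization` knob (what fraction of the time the GPU is actually busy with paid queries) are my own assumptions, not measurements.

```python
# back-of-envelope break-even sketch; nothing here is measured.
# assumption: a 30-day month, GPU available around the clock.
MINUTES_PER_MONTH = 60 * 24 * 30


def queries_per_month(queries_per_minute, utilization=1.0):
    """Queries served in a month; utilization is a hypothetical 0..1 busy fraction."""
    return queries_per_minute * MINUTES_PER_MONTH * utilization


def dollars_per_query(monthly_dollars, queries_per_minute, utilization=1.0):
    """Spread a monthly dollar figure (cost or revenue target) over the queries served."""
    return monthly_dollars / queries_per_month(queries_per_minute, utilization)


if __name__ == "__main__":
    for qpm in (5, 6):
        for util in (1.0, 0.1):  # 0.1 is an arbitrary "mostly idle" example
            cost = dollars_per_query(350, qpm, util)     # what each query costs you
            needed = dollars_per_query(1000, qpm, util)  # what each query must earn
            print(f"{qpm} q/min at {util:.0%} busy: "
                  f"cost ${cost:.4f}/query, need ${needed:.4f}/query")
```

at 5 queries per minute and the GPU busy 100% of the time that's 216,000 queries a month, so each one costs about $0.0016 and has to bring in about $0.0046 to hit the $1000 mark. the catch is the utilization: if the box sits idle 90% of the time, both numbers go up tenfold.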
i'd say if you already have a network services business and you write an agent to process data on your system in a way that adds value, and it lets you bump your margin higher, then it might be useful. but honestly, at this point, economies of scale are not in favor of renting single GPUs, compared to highly optimized, expensive data centers full of them.
i expect this to change though, since the incentive to manufacture fast GPU memory and processors is definitely there. really, the bottleneck in the technology is not so much raw compute (the matrix math) as the speed of the memory, and that can be increased just by parallelism: think 512bit wide buses, or SANs on the devices so they can divide up the work efficiently.
but i think we are still really in the beta stages with this tech. that kind of parallelisation is probably still some ways in the future.
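to show why bus width matters so much, here's a minimal sketch of the usual rule of thumb: when generating one token at a time, the GPU has to read roughly all of the model weights from memory per token, so memory bandwidth divided by model size gives a rough ceiling on tokens per second. the 16 Gbit/s per pin and the 8 GB model are made-up example values, and the function names are just mine for illustration.

```python
# rough memory-bandwidth ceiling for single-stream token generation.
# all example numbers below are hypothetical, picked only for easy arithmetic.


def bandwidth_gb_per_s(bus_width_bits, gbit_per_s_per_pin):
    """Peak memory bandwidth: bus width (bits) x per-pin data rate (Gbit/s) / 8 bits per byte."""
    return bus_width_bits * gbit_per_s_per_pin / 8


def max_tokens_per_s(bandwidth_gb_s, model_size_gb):
    """Upper bound, assuming every token requires streaming all weights from memory once."""
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    model_gb = 8   # hypothetical model size in memory
    pin_rate = 16  # hypothetical Gbit/s per pin
    for bus_bits in (256, 512):
        bw = bandwidth_gb_per_s(bus_bits, pin_rate)
        print(f"{bus_bits}-bit bus: {bw:.0f} GB/s, "
              f"ceiling of about {max_tokens_per_s(bw, model_gb):.0f} tokens/s")
```

with those example numbers, doubling the bus from 256 to 512 bits doubles the bandwidth (512 GB/s to 1024 GB/s) and so doubles the ceiling (64 to 128 tokens/s), without touching the compute side at all. real throughput will land below the ceiling.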
however, it's already good enough that i don't see any sane reason why a programmer would not have an LLM and coding agent on their workstation. running it locally is far cheaper than any VPS service. and i honestly doubt that the prices of cloud services fully reflect the production cost yet: there is so much VC money flooding into it that they are subsidising it in order to create loss leaders.