I partly agree. It really depends on what one is using the models to accomplish. With LocalAI and vLLM, it's pretty easy to use an open-source model as a drop-in replacement for the OpenAI API. Companies like OpenAI are working towards AGI, while many challenges can be solved with a mixture-of-agents (experts) approach without the need for expensive hardware, with MemGPT, AutoGen, and many others leading the way toward autonomous agents.

On context length: NousResearch beat OpenAI to announcing a 128k context window using YaRN. A year from now, I'd say it will be a non-issue, or we'll have context windows of 3M+ tokens. The Law of Accelerating Returns rings very true in the LLM and generative AI space.

If one is looking for unaligned (uncensored) models, Dolphin is most likely the best in terms of size versus performance at 7B parameters. Many 7B models are now matching the performance of much larger 70B models like LLaMA 2. We can already overcome a lot of hardware limitations by quantizing models (see GGUF and AWQ). At the current exponential rate of progress, in a year we'll more than likely have AGI.
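To make the drop-in point concrete, here's a minimal sketch of what that looks like in practice. Both LocalAI and vLLM expose an OpenAI-compatible HTTP API, so the official `openai` client just needs a different base URL. The port, model name, and prompt below are assumptions for illustration, not values from any specific deployment:

```python
# Sketch: an OpenAI-style /v1/chat/completions payload works unchanged
# against a local OpenAI-compatible server (LocalAI or vLLM).
# Model name and prompt are placeholders -- adjust to your deployment.
import json

def build_chat_request(prompt: str, model: str = "mistral-7b-instruct") -> dict:
    """Build a request body in the OpenAI chat-completions format."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

payload = build_chat_request("Explain YaRN context extension in one sentence.")

# With a local server running, the official client points at it instead of
# api.openai.com (the port is whatever your server listens on):
#
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
#   resp = client.chat.completions.create(**payload)

print(json.dumps(payload, indent=2))
```

Because the request and response formats are identical, existing code written against OpenAI usually only needs the one-line `base_url` change to switch to a local open-source model.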
My company, nostr:npub14hujhn3cp20ky0laq93e4txkaws2laxp80mfk3rv08mh35qnngxsg5ljyg (proudly on nostr!), is releasing some very cool stuff in the near future related to AI agents and domain-specific, fine-tuned, lightweight, efficient models that will run on edge, mobile, and IoT devices. There are a lot of non-OpenAI projects out there that are open source and transparent, with more appearing by the day.