Surely they will eventually combine the various types of models into a single cohesive product, but instead of waiting for that, maybe you should just move over to a diffusion model to make your desired imagery.

GPT is primarily a language model. OpenAI is working on diffusion models, most notably its video generation model, Sora, but language models will never do imagery as well as diffusion models. Until they have something that combines both types of models, I wouldn’t expect anything close to perfection, or even “good”.
