Yeah, currently an NVIDIA GPU with CUDA is the best way to run Stable Diffusion, or else Google Colab.
But PyTorch 2 should also have better CPU support, and it could improve things on AMD Radeon.
Here are some Stable Diffusion benchmarks. I didn't find a direct comparison with the M1 or M2, but those chips are still very new.
https://www.tomshardware.com/news/stable-diffusion-gpu-benchmarks
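
If you want to see which backend PyTorch would pick on your own machine, here is a rough sketch. The helper names `pick_device` and `detect_device` are made up for illustration, and the preference order (CUDA, then Apple's MPS backend, then CPU) is just my assumption about what is usually fastest, not something taken from the benchmark article. As far as I know, ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` calls, so a Radeon card would show up as "cuda" here.

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Hypothetical helper: prefer CUDA, then Apple's MPS backend, then CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch what is available; fall back to CPU if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed, nothing to query

    cuda_ok = torch.cuda.is_available()
    # torch.backends.mps only exists in newer PyTorch releases, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(cuda_ok, mps_ok)


if __name__ == "__main__":
    print(detect_device())
```

The returned string can be passed to `.to(...)` on a model or pipeline. Whether "mps" or "cpu" is actually usable for Stable Diffusion at a reasonable speed is a separate question that only a benchmark on your hardware can answer.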