It's a totally different approach, but maybe you're right: good LLMs are relatively new and could arguably be considered to supersede fuzzy hashing. But the main problem of reverse-engineering the compression algorithm (which is one way to think about LLMs) still exists. If you're thinking of working on this, I'm happy to see what I can do to help.
Right. But FYI, you don't need Microsoft's service; you can roll your own with open source models. A lot of those models are totally open source (see https://huggingface.co/docs/transformers/en/tasks/image_classification). They are just classification models which return a confidence score between 0 and 1. And they're pretty fast and efficient, since Google and others have been fighting this issue for 20+ years and have developed very good models. (They work 99.5% of the time. I think it's impossible to get to 100%.)
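To make the "score between 0 and 1" part concrete, here is a rough sketch of how you might wrap such a model. It assumes the Hugging Face `transformers` image-classification pipeline, which returns a list of `{"label", "score"}` dicts per image. The model name, the label name, and the 0.8 threshold are placeholders I made up for illustration, not recommendations.

```python
from typing import Callable, Dict, List

# A classifier takes an image path and returns [{"label": str, "score": float}, ...],
# which is the output shape of the transformers image-classification pipeline.
Classifier = Callable[[str], List[Dict[str, object]]]


def load_classifier(model_name: str) -> Classifier:
    """Build an image-classification pipeline. model_name is a placeholder you supply."""
    # Imported lazily so the scoring helpers below work without transformers installed.
    from transformers import pipeline
    return pipeline("image-classification", model=model_name)


def flag_score(classifier: Classifier, image_path: str, flagged_label: str) -> float:
    """Return the score (0..1) assigned to flagged_label, or 0.0 if it is not returned."""
    for prediction in classifier(image_path):
        if prediction["label"] == flagged_label:
            return float(prediction["score"])
    return 0.0


def should_review(score: float, threshold: float = 0.8) -> bool:
    """Decide whether a score is high enough to send the image to review (assumed threshold)."""
    return score >= threshold
```

Usage would look roughly like `should_review(flag_score(load_classifier("your-org/your-model"), "upload.jpg", "flagged"))`. Where you put the threshold is the real design decision, since no model is right 100% of the time and you are trading false positives against misses.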
Discussion
We tried Cloudflare’s integrated service and Microsoft’s PhotoDNA. They are OK, but they only compare against existing hashes and only support images, not videos. AI models scan it all: they search existing hashes and also recognize unreported patterns.
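For anyone wondering what "search existing hashes and recognize unreported patterns" could look like in practice, here is a minimal sketch of a two-stage check. It is not how PhotoDNA or Cloudflare's service works internally. It assumes a generic 64-bit perceptual hash compared by Hamming distance, and the distance limit, threshold, and function names are all hypothetical.

```python
from typing import Callable, Iterable


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two integer hashes."""
    return bin(a ^ b).count("1")


def matches_known_hash(image_hash: int, known_hashes: Iterable[int], max_distance: int = 4) -> bool:
    """Fuzzy match: true if any known hash is within max_distance bits (assumed limit)."""
    return any(hamming_distance(image_hash, known) <= max_distance for known in known_hashes)


def check_image(
    image_hash: int,
    known_hashes: Iterable[int],
    classifier_score: Callable[[], float],
    threshold: float = 0.8,
) -> str:
    """Stage 1: look up known hashes. Stage 2: fall back to a model score in [0, 1]."""
    if matches_known_hash(image_hash, known_hashes):
        return "known_match"
    # The classifier is only called when the hash lookup finds nothing,
    # which is the case the hash-only services cannot handle.
    if classifier_score() >= threshold:
        return "model_flagged"
    return "clear"
```

The hash lookup is cheap and only catches previously reported material, so it runs first. The classifier callback is where new, unreported content would get caught, and for video you would presumably run the same check on sampled frames.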