This is not to say that testing, evaluating and aiming to reduce hallucinations is not a good thing. It is!
The same way as measuing uptime, and investing in ways to further improve it is also great (and sensible).
Keep doing it: but don't say stuff that is non-viable in the real world

Source: x.com/GergelyOrosz/status/1843268813367722300