The solution to this is to have a separate prompt (on a tiny shitty model) run to determine first if any tools should be run at all, and if so, further drill down to what categories or individual tools should be presented to the tool-using model.
Dead simple queries either don't use the tool-capable model at all or are just outright answered by the prompt evaluator.
This is how AgentV3N solves this problem while also bringing down the cost of allowing *every* query potentially use a tool.