Most LLM benchmarks are designed with a specific target in mind, such as coding or language understanding. However, I believe the time is ripe for cross-model challenges as well. Has anyone already explored or implemented this approach?

