Are there good tools that make benchmark testing easy? I’m trying to test with MMAU, but I kinda wish it were easier than copy/pasting the GitHub repo and getting ChatGPT to build a script for me.

It’s taking a lot of back-and-forth: grabbing sample data from Hugging Face, reviewing the script’s output to make sure it’s formatted correctly, etc.

Source: x.com/yoheinakajima/status/1848058566571393105
