Training data
Can't you just train it on literally all binaries that exist?
Is there a Stack Overflow for people asking questions about binaries?
Is that required? I'd argue that directly ingesting all the software that exists avoids all the garbage that comes with human language. Maybe there's not enough data, though.