This thing: https://github.com/pemistahl/lingua-go

I'm requiring a confidence score of at least 0.9. It worked well in my tests, but I've only tested Portuguese.

I'll decrease it to 0.85.

And relax the rate limit a little bit.
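
For anyone following along, the threshold check amounts to something like the sketch below. It is only an illustration, not the actual code: `confidenceFunc` and `matchesLanguage` are hypothetical names, and the scorer is a stand-in that, with lingua-go, would presumably wrap a call such as `detector.ComputeLanguageConfidence(text, lingua.Portuguese)`.

```go
package main

// minConfidence is the threshold mentioned in the note (lowered from 0.9).
const minConfidence = 0.85

// confidenceFunc is a hypothetical stand-in for a detector call that returns
// a score in [0, 1] for one specific language. It is an assumption made for
// this sketch, not part of any library.
type confidenceFunc func(text string) float64

// matchesLanguage reports whether the score for text reaches the threshold.
// "At least 0.85" means a score of exactly 0.85 is accepted.
func matchesLanguage(text string, score confidenceFunc) bool {
	return score(text) >= minConfidence
}
```

Keeping the scorer behind a function type makes it easy to try different thresholds against canned scores before wiring in the real detector.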

Discussion

Sounds good, thanks! Looks like anglicisms and internet slang reduce the confidence score.

I'm using this library on adre.su too, but I have to say, the confidence score didn't help me at all. I tried their "light mode", implemented rate limits, and in the end gave up on checking short messages. Honestly, none of it works very well. The best results came from manual calibration, specifying the similar languages for each language in use, but language detection as a whole takes a lot of resources, and the manual setup even more so.

I don't know. After removing URLs and URIs and trying a bunch of examples in Spanish, Portuguese and Chinese, I have yet to find a single false positive or negative.

But also I'm only testing if some text matches one specific language, not trying to "detect", so maybe that helps.
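
Roughly, the URL removal step could look like the sketch below. This is an assumption about the approach, not the real code: `stripURIs` and `uriPattern` are hypothetical names, and the pattern only covers URIs of the form `scheme://...`, so other forms (for example `mailto:`) would need extra patterns.

```go
package main

import (
	"regexp"
	"strings"
)

// uriPattern matches anything that looks like scheme://rest-of-uri, up to the
// next whitespace. This is a deliberately simple assumption for the sketch.
var uriPattern = regexp.MustCompile(`\b[a-zA-Z][a-zA-Z0-9+.-]*://\S+`)

// stripURIs removes URIs from text and collapses the leftover whitespace,
// so that only natural-language words reach the language check.
func stripURIs(text string) string {
	cleaned := uriPattern.ReplaceAllString(text, " ")
	return strings.Join(strings.Fields(cleaned), " ")
}
```

The cleaned text is what would then be scored against the one language being tested, which avoids domain names and path segments dragging the score around.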