Nostr Web Client

#Ditto now displays daily streaks! To have a streak, no more than 24 hours can pass since your last post.

The streak seems to cap out at 181 days. I haven't done the math to figure out why. Maybe that's when I last recreated the Ditto database?

A rare few power users are tied at the cap.

nostr:nprofile1qydhwumn8ghj7emvv4shxmmwv96x7u3wv3jhvtmjv4kxz7gqyqewrqnkx4zsaweutf739s0cu7et29zrntqs5elw70vlm8zudr3y2sndk4n nostr:nprofile1qydhwumn8ghj7emvv4shxmmwv96x7u3wv3jhvtmjv4kxz7gqyqlhwrt96wnkf2w9edgr4cfruchvwkv26q6asdhz4qg08pm6w3djgzwr0kg
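The 24-hour rule above can be sketched roughly as follows. This is only an illustration, not Ditto's actual code: the function name `streakDays` is hypothetical, timestamps are assumed to be Nostr `created_at` values in seconds, and counting the streak as "whole days spanned by the unbroken chain, plus one" is an assumption about how days are tallied.

```typescript
const DAY = 24 * 60 * 60; // seconds, matching Nostr's created_at unit

// Hypothetical helper: length in days of the current posting streak.
// A streak is alive only if the newest post is at most 24h old, and it
// extends backwards as long as consecutive posts are at most 24h apart.
function streakDays(createdAt: number[], now: number): number {
  const sorted = createdAt.slice().sort((a, b) => b - a); // newest first
  if (sorted.length === 0 || now - sorted[0] > DAY) return 0;

  let start = sorted[0];
  for (let i = 1; i < sorted.length; i++) {
    if (start - sorted[i] > DAY) break; // gap too large, chain ends here
    start = sorted[i];
  }

  // Assumption: a chain spanning less than one day still counts as day 1.
  return Math.floor((sorted[0] - start) / DAY) + 1;
}
```

Under this sketch, a cap like the one described would appear if the oldest post available to the query were only about 181 days old, which fits the guess about the database being recreated.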

Alex Gleason 11mo ago

#Ditto now displays daily streaks! To have a streak, no more than 24 hours can pass since your last post.

Replying to

Alex Gleason

>users don't even know what's happening

Like I said, the only barrier is ignorance. I'm not saying it's good, but if people have to, they can learn.

Alex Gleason 11mo ago

On the Fediverse you can't simply learn something and then be like "let me move to a different server". I mean you kind of can, but only with permission from the admin, who is already your adversary at this point.

Replying to unknown

If a user is censored on a major app for Android, users don't even know what is happening, and this is extreme censorship. In the case of Amethyst and Iris, users don't even know exactly how it works because they are not told.

This is worse than Mastodon, because on Mastodon you know what you are getting: an echo chamber.

Nostr and Amethyst tell you it is all about freedom and your choice. That is a lie. Apps are becoming tyrannical and listening to bogus reports to censor users based on feelings and wrong-think. Users think they are fleeing censorship, but they wind up in a pool of censorship.

Alex Gleason 11mo ago

>users don't even know what's happening

Like I said, the only barrier is ignorance. I'm not saying it's good, but if people have to, they can learn.

Replying to unknown

But when Apps censor people automatically for the user by default, that's when Nostr breaks and is worse than #FediBlock

Alex Gleason 11mo ago

Strong disagree on that one. The only barrier there is human ignorance, since people can just switch to a different app and lose nothing. They don't lose their posts or followers, and can pick up right where they left off.

The barrier on the Fediverse is technological. After you join a server you can't change your mind.

Replying to

Matty-kun

I thought the whole part of Nostr was decentralization so you didn't have things like this lmao

Alex Gleason 11mo ago

People can use their keys to go elsewhere on the network.

Replying to unknown

Nostr repeating Mastodon's fuck-ups more and more

nostr:nevent1qvzqqqqqqypzp99x579946am5s9ar24zydypqyev9kqqfwu3waskcsfa8sxa7vswqyv8wumn8ghj7ur0wd6x2u3wwpkxzcm99aex2mrp0yqzqgn9znkzcxp3mcdr4ypv039j4nrtq8323n02jvhkhj6ag85cs3rvwy6f4z

Alex Gleason 11mo ago

Looks like they're trying to be the Cloudflare of Nostr.

Replying to

Hoshi

this is not Japanese text just because it contains a Katakana character: \_(シ)_/

Alex Gleason 11mo ago

You caught me. But if a "Translate" button appeared on that, it would just be funny, not annoying.

Replying to KentuckyChicken

Can you set up a Lightning wallet so I can zap you?

Alex Gleason 11mo ago

Check again

Alex Gleason 11mo ago

I fixed a problem with relay.mostr.pub which was causing new events to not be served on its relay.

nostr:nevent1qvzqqqqqqypzqj67hazxwe8rxpjy6stzjf9gdfeu29esnr9mqrds56gtp9cqdgywqydhwumn8ghj7emvv4shxmmwv96x7u3wv3jhvtmjv4kxz7gqyzjg4nynhudsnyyjgg3tfjrae3uagr93zqj6xzm2exzsltg43ww3z4vtepc

Replying to

Alex Gleason

Your NIP-05 is too slow or we're too slow to verify it

Alex Gleason 11mo ago

My bad... I had a bug in my policy script. It's fixed now. Thank you for reporting it!

Replying to

HoloKat

You tried models specifically tailored to translation?

Alex Gleason 11mo ago

Yep, this one right here: https://github.com/fabiospampinato/lande

Replying to

Kieran

It passed the "good enough" test for me, but I have really low standards

Alex Gleason 11mo ago

I used fasttext years ago on the Fediverse, and people complained about it a lot. So this time I'm using an obscure library I found digging through GitHub issues: https://github.com/fabiospampinato/lande

This one is specifically trained on small text, making it ideal for shitposts, but it still doesn't even take the character set into account.

Replying to

Kieran

I use fasttext, it's really accurate and prevents wasted DeepL calls for automatic translations

Alex Gleason 11mo ago

fasttext is one of the ones that failed for me. Too many wrongly detected languages. It's trained on Wikipedia data, not shitposts.

Alex Gleason 11mo ago

Language detection is surprisingly difficult. The neural networks get basic things wrong. They will say that Korean text is actually Chinese, even though you can obviously see with your eyes that it's not.

After multiple libraries failed this basic test, I did something brave and implemented a naive regex solution in #Ditto. It does a first pass on the text before moving it on to the neural network.

안녕하세요

For example, if ALL the characters are in Korean script, it must be Korean. Even if it's a nonsensical sequence of Korean characters, it cannot be any language other than Korean, because Korean makes exclusive use of this character set.

There are only a few languages where this is possible: Korean, Greek, and Hebrew.

Again, this is only possible if ALL characters in the text match a target language, so simply using "π" in a text does not make it Greek. So, currently this check is very narrow.
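A first pass like this could look roughly like the sketch below. It is not Ditto's actual code: `detectByScript` and the `EXCLUSIVE_SCRIPTS` table are hypothetical names, the language codes are an assumed ISO 639-1 mapping, and only whitespace is stripped here before matching.

```typescript
// Scripts that, per the reasoning above, are treated as belonging to one language.
// The ISO 639-1 codes are an assumption for illustration.
const EXCLUSIVE_SCRIPTS: Record<string, string> = {
  Hangul: "ko",
  Greek: "el",
  Hebrew: "he",
};

// Returns a language code only if EVERY non-whitespace character is in
// one of the exclusive scripts. Otherwise returns undefined, meaning
// "let the neural network decide".
function detectByScript(text: string): string | undefined {
  const compact = text.replace(/\s+/g, "");
  if (compact.length === 0) return undefined;

  for (const script of Object.keys(EXCLUSIVE_SCRIPTS)) {
    // Unicode property escapes require the "u" flag.
    const allInScript = new RegExp("^\\p{Script=" + script + "}+$", "u");
    if (allInScript.test(compact)) return EXCLUSIVE_SCRIPTS[script];
  }
  return undefined;
}
```

With this sketch, `detectByScript("안녕하세요")` yields `"ko"`, while a sentence that merely contains "π" among Latin letters yields `undefined` and falls through to the model.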

Notes about other languages:

Chinese: it's not possible to do a regex-only solution for Chinese, since Han script is also part of Japanese.

Japanese: we *can* definitively detect Japanese, as long as the text contains at least one Hiragana or Katakana character in addition to 0 or more Han characters. So at least *some* Japanese text can be unambiguously detected just by a regex.

Russian: Cyrillic text is used by a handful of languages besides Russian. BUT, if the text is entirely Cyrillic, that at least narrows down the *possible* languages it could be.
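The Japanese rule from the notes above can be sketched as two regex checks. This is a minimal illustration with a hypothetical function name, not the code running in Ditto, and it assumes whitespace has been the only thing stripped.

```typescript
// Hypothetical helper: true when the text consists only of Han, Hiragana,
// and Katakana characters AND contains at least one kana character.
// Han-only text stays ambiguous (Chinese or Japanese), so it returns false.
function isDefinitelyJapanese(text: string): boolean {
  const compact = text.replace(/\s+/g, "");
  const onlyCjk = new RegExp(
    "^[\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}]+$",
    "u",
  );
  const hasKana = new RegExp("[\\p{Script=Hiragana}\\p{Script=Katakana}]", "u");
  return onlyCjk.test(compact) && hasKana.test(compact);
}
```

One caveat: Unicode classifies the long-vowel mark "ー" as Script=Common rather than Katakana, so a fuller version would need to allow it explicitly.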

Next steps:

To optimize this, the regexes will narrow down possible languages of a text before passing it to the neural network.

For example, if a text is entirely Han, we would restrict the model to deciding only between Chinese and Japanese. If it's Cyrillic, we'd do the same thing, but with the 6 or so Cyrillic languages.
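As a rough sketch of that narrowing step: the regexes produce an allow-list, and the model's scores are filtered against it. Everything named here is hypothetical, including the exact list of Cyrillic languages, and the scores are passed in as plain `[language, probability]` pairs rather than calling any real library.

```typescript
type Score = [string, number]; // [language code, probability]

// Assumed candidate sets. The Cyrillic list is an illustrative guess
// at the "6 or so" languages, not a confirmed list.
const SCRIPT_CANDIDATES: Array<{ script: string; languages: string[] }> = [
  { script: "Han", languages: ["zh", "ja"] },
  { script: "Cyrillic", languages: ["ru", "uk", "bg", "sr", "be", "mk"] },
];

// Returns the allowed languages if the whole text is in one known script,
// or undefined when no restriction applies.
function candidateLanguages(text: string): string[] | undefined {
  const compact = text.replace(/\s+/g, "");
  for (const { script, languages } of SCRIPT_CANDIDATES) {
    const allInScript = new RegExp("^\\p{Script=" + script + "}+$", "u");
    if (allInScript.test(compact)) return languages;
  }
  return undefined;
}

// Picks the highest-scoring language, ignoring anything outside the allow-list.
function pickLanguage(scores: Score[], allowed?: string[]): string | undefined {
  const eligible = allowed
    ? scores.filter(([lang]) => allowed.indexOf(lang) !== -1)
    : scores;
  let best: Score | undefined;
  for (const score of eligible) {
    if (!best || score[1] > best[1]) best = score;
  }
  return best ? best[0] : undefined;
}
```

So if the model ranks Korean first for an all-Han text, the guardrail discards that guess and the best of Chinese or Japanese wins instead.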

We could also try to match, say, 90% of the text instead of 100%, to any specific script, to catch outliers like occasional English words used in Japanese, etc. We are already stripping things like punctuation, emojis, and URLs before passing text to the model.
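A threshold check along those lines might look like the sketch below. The helper names and the 0.9 default are illustrative, and the cleanup step is only an approximation of the stripping mentioned above: it removes URLs, punctuation, whitespace, and symbol characters, which covers most but not all emoji.

```typescript
// Approximate cleanup: drop URLs, then punctuation (P), symbols (S, which
// includes most emoji), separators (Z), and any remaining whitespace.
function stripNoise(text: string): string {
  return text
    .replace(/https?:\/\/\S+/g, "")
    .replace(new RegExp("[\\p{P}\\p{S}\\p{Z}\\s]", "gu"), "");
}

// Fraction of remaining characters that belong to any of the given scripts.
function scriptRatio(text: string, scripts: string[]): number {
  const chars = Array.from(stripNoise(text)); // split by code point
  if (chars.length === 0) return 0;
  const classes = scripts.map((s) => "\\p{Script=" + s + "}").join("");
  const inScripts = new RegExp("^[" + classes + "]$", "u");
  const matching = chars.filter((ch) => inScripts.test(ch)).length;
  return matching / chars.length;
}

// Hypothetical threshold check: "mostly" instead of "entirely" in a script.
function isMostlyScript(text: string, scripts: string[], threshold = 0.9): boolean {
  return scriptRatio(text, scripts) >= threshold;
}
```

The lower the threshold, the more stray foreign words are tolerated, but the weaker the guarantee that the script check is actually right.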

Finally, this is all so we can use a lightweight, embedded solution for language detection, instead of calling out to some proprietary API, or even a giant self-hosted solution. In that case, I believe a layered solution will always be needed. We have to do these naive checks to put "guardrails" on the model, so its guesses can't stray outside of common sense. Switching the model can improve it, but these naive checks will still be true.

Replying to

Captain's Nostr Log

nostr:npub1q3sle0kvfsehgsuexttt3ugjd8xdklxfwwkh559wxckmzddywnws6cd26p

Alex Gleason 11mo ago

Your NIP-05 is too slow or we're too slow to verify it

Replying to

Alex Gleason

I accidentally documented it wrong, it's actually "protocol:atproto"

This will show some results from when I was mirroring posts from eclipse.pub here, but I haven't figured out the best way to show everything that doesn't involve spending thousands of dollars on SSDs.

Alex Gleason 11mo ago

I can't even get it to load.

Replying to

Sprate

Is the "protocol:bluesky" parameter intended to be functional yet? Or is it still a WIP?

Alex Gleason 11mo ago

I accidentally documented it wrong, it's actually "protocol:atproto"

This will show some results from when I was mirroring posts from eclipse.pub here, but I haven't figured out the best way to show everything that doesn't involve spending thousands of dollars on SSDs.

Alex Gleason 11mo ago

#Ditto search now supports negative search tokens. That means you can use -protocol:activitypub to remove everything from the bridge, or even -language:de to remove the Germans.
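To illustrate how such tokens could be parsed, here is a minimal sketch. It is not Ditto's parser: the `parseSearch` name, the `ParsedQuery` shape, and the exact token grammar (lowercase key, colon, word-like value, optional leading minus) are all assumptions.

```typescript
interface Filter {
  key: string;
  value: string;
}

interface ParsedQuery {
  terms: string[];   // plain search words
  include: Filter[]; // key:value
  exclude: Filter[]; // -key:value
}

// Splits a query on whitespace and sorts each token into plain terms,
// positive filters, or negative filters.
function parseSearch(query: string): ParsedQuery {
  const parsed: ParsedQuery = { terms: [], include: [], exclude: [] };

  for (const token of query.split(/\s+/).filter(Boolean)) {
    const match = /^(-?)([a-z]+):([\w-]+)$/.exec(token);
    if (!match) {
      parsed.terms.push(token); // anything else, including URLs, is a plain term
      continue;
    }
    const [, minus, key, value] = match;
    (minus ? parsed.exclude : parsed.include).push({ key, value });
  }
  return parsed;
}
```

For example, `parseSearch("bitcoin -protocol:activitypub -language:de")` gives one plain term and two entries in `exclude`, which a query builder could then turn into negated conditions.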

Alex Gleason 11mo ago