Replying to Rizful.com

#asknostr among the problems that Nostr faces, the child porn problem is a very, very, very bad problem.

A VERY bad problem.

What is the current thinking among developers about how to deal with this?

Nobody likes censorship, but the only solution I can think of (SO FAR) is running an image identification service that labels dangerous stuff like this, and then broadcasts a list of (images, notes, users?) that score high on the "oh shit this is child porn" metric. Typically these systems just output a float between zero and 1, which is the score....

Is anyone working on this currently?

I have a good deal of experience running ML services like image identification at scale, so this could be something interesting to work on for the community. (I also have a lot of GPU power, and anyway, if you do it right, this actually doesn't take a ton of GPUs even for millions of images per day....)

It would seem straightforward to subscribe to all the nostr image uploaders, generate a score for each image (say, 1.0 being "definite child porn" and 0.0 being "not child porn"), and then maybe broadcast events of some kind to relays with this "opinion" about the image/media?
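
To make the idea concrete, here is a minimal sketch of what such an "opinion" event might look like if it were modeled on NIP-32 label events (kind 1985). The namespace, tag layout, and score encoding are assumptions rather than an agreed convention, and (as discussed further down the thread) publishing these labels publicly may itself be legally problematic; this only illustrates the proposal as stated.

```python
# Hypothetical sketch of the proposed "opinion" event, modeled on a NIP-32
# label event (kind 1985). Namespace, tags, and score encoding are assumptions.
import json
import time

def build_score_event(pubkey_hex: str, image_url: str, score: float) -> dict:
    """Build an unsigned Nostr event carrying a risk score for a media URL."""
    return {
        "pubkey": pubkey_hex,
        "created_at": int(time.time()),
        "kind": 1985,                        # NIP-32 label event
        "tags": [
            ["L", "content-safety"],         # assumed label namespace
            ["l", "csam-risk", "content-safety"],
            ["r", image_url],                # the media being labeled
        ],
        "content": json.dumps({"score": round(score, 3)}),
    }

# Example (id computation, signing, and relaying omitted):
event = build_score_event("ab" * 32, "https://example.com/image.jpg", 0.97)
print(json.dumps(event, indent=2))
```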

Maybe someone from the major clients like nostr:npub1yzvxlwp7wawed5vgefwfmugvumtp8c8t0etk3g8sky4n0ndvyxesnxrf8q or #coracle or nostr:npub12vkcxr0luzwp8e673v29eqjhrr7p9vqq8asav85swaepclllj09sylpugg or nostr:npub18m76awca3y37hkvuneavuw6pjj4525fw90necxmadrvjg0sdy6qsngq955 has a suggestion on how this should be done.

One way or another, this has to be done. 99.99% of normies, the first time they see child porn on #nostr ... if they see it once, they'll never come back.....

Is there an appropriate NIP to look at? nostr:npub180cvv07tjdrrgpa0j7j7tmnyl2yr6yr7l8j4s3evf6u64th6gkwsyjh6w6 ? nostr:npub1l2vyh47mk2p0qlsku7hg0vn29faehy9hy34ygaclpn66ukqp3afqutajft ? nostr:npub16c0nh3dnadzqpm76uctf5hqhe2lny344zsmpm6feee9p5rdxaa9q586nvr ?

Broadcasting public notes that identify CSAM is probably illegal, because it could be construed as "advertising" that content. I think the only option we really have long-term, at least in the US, is for someone(s) to run a service that crawls the network, matches images against microsoft's hash database product (closed source, but for good reasons, since hash databases can be reverse engineered), and reports matches to NCMEC. A bonus would be to do the same thing but analyze note text for exploitation keywords. Privately hosted and encrypted content are pretty much immune to this fortunately/unfortunately. nostr:nprofile1q9n8wumn8ghj7enfd36x2u3wdehhxarj9emkjmn99ah8qatzx96r2amr8p5rxdm4dp4kzafew3ehwwpjwd48smnywycrgepndcu8qd3nx36hguryvem8xdr5d56hsmt5xfehzemtxejxkeflvfex7ctyvdshxapaw3e82egprfmhxue69uhhyetvv9ujumn0wd68yanfv4mjucm0d5hszrnhwden5te0dehhxtnvdakz7qg3waehxw309ahx7um5wgh8w6twv5hsz9nhwden5te0wfjkccte9ekk7um5wgh8qatz9uqzpxvf2qzp87m4dkzr0yfvcv47qucdhcdlc66a9mhht8s52mprn7g98p5le2 currently checks a hash database for all images uploaded, and I believe they report matches.
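
A rough sketch of that crawl-hash-report pipeline is below. PhotoDNA's actual API is closed, so the hash check and the NCMEC submission are stand-in placeholders; the plain SHA-256 lookup shown here would not survive re-encoding the way a real perceptual hash does.

```python
# Hypothetical crawl -> hash-match -> report pipeline. check_hash_database()
# and file_ncmec_report() are placeholders for the real vendor (e.g. PhotoDNA)
# and CyberTipline integrations, which are not public.
import hashlib
import requests

KNOWN_BAD_HASHES: set[str] = set()   # assumed hash list from a vendor

def check_hash_database(image_bytes: bytes) -> bool:
    """Placeholder: a real system would use a perceptual hash, not SHA-256,
    so that re-encoded copies of the same image still match."""
    return hashlib.sha256(image_bytes).hexdigest() in KNOWN_BAD_HASHES

def file_ncmec_report(image_url: str, note_id: str) -> None:
    """Placeholder for an NCMEC CyberTipline submission."""
    print(f"REPORT: {image_url} (note {note_id})")

def process_note(note_id: str, image_urls: list[str]) -> None:
    for url in image_urls:
        image_bytes = requests.get(url, timeout=10).content
        if check_hash_database(image_bytes):
            file_ncmec_report(url, note_id)
        # content is not retained beyond this loop iteration
```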

As non-cypherpunk as this all is, I think it's the only real option we have unless Ross Ulbricht's ZKANN idea gets built. We need to demonstrate to anyone watching that we take the problem seriously and take measures to self-regulate. This is similar to the bitcoin KYC/AML argument. If we don't want financial surveillance or legal restrictions on social media, we should help law enforcement actually chase down the people who are the problem rather than presenting ourselves as the scapegoat. See iftas.org for some work being done in the fediverse on this.

Discussion

Agreed on the "broadcasting identifications is probably illegal" point. If only there were a provably safe way to do it where the exact location of the content (i.e., the URL) wasn't communicated, but clients still got a 99% certainty of being able to block it. Basically, you somehow give clients the power to test any URL and decide if it is bad, without providing the actual scores in a way that could be used directly in a search or other content-discovery exercise... Maybe this is just impossible, because what's to stop someone else from running a script to test every image, reproduce the scores, and then try to use or access the flagged ones?
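
One way to approximate this, with exactly the caveat the message raises, is to publish only salted fingerprints of flagged URLs rather than the URLs themselves: a client that already has a URL can test it, but the published list doesn't enumerate content. The sketch below is illustrative only; anyone can still recompute fingerprints for URLs they already have, so it is obfuscation, not a real guarantee.

```python
# Sketch of a fingerprint-only blocklist: the service publishes salted hashes
# of flagged URLs, never the URLs themselves. Salt handling is an assumption.
import hashlib

SALT = b"rotate-with-each-published-list"   # assumed shared, rotating salt

def url_fingerprint(url: str) -> str:
    return hashlib.sha256(SALT + url.encode("utf-8")).hexdigest()

# Published by the moderation service:
flagged = {url_fingerprint("https://example.com/flagged.jpg")}

# Checked client-side before rendering media:
def should_block(url: str) -> bool:
    return url_fingerprint(url) in flagged

print(should_block("https://example.com/flagged.jpg"))   # True
print(should_block("https://example.com/cat.jpg"))       # False
```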

Right, that's basically how Microsoft's product works. They don't release the database because you could then craft images that foil the fuzzy hashing algorithm they use.

Right. But FYI, you don't need Microsoft's service; you can roll your own with open-source models that return a confidence score between 0 and 1 -- https://huggingface.co/docs/transformers/en/tasks/image_classification They are just classification models that return a value between 0 and 1, and they're pretty fast and efficient, since Google and others have been fighting this issue for 20+ years and have developed very good and efficient models. (Which work 99.5% of the time. I think it's impossible to get to 100%.)
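
For reference, getting a 0-to-1 score out of one of these open-source classifiers is a few lines with the transformers pipeline. The checkpoint named here is the one linked later in the thread and is just an example; label names vary by model.

```python
# Minimal example: an open-source image classifier via the Hugging Face
# transformers pipeline. The checkpoint is illustrative; label names vary
# by model, but each label comes back with a score between 0 and 1.
from transformers import pipeline

classifier = pipeline("image-classification", model="Falconsai/nsfw_image_detection")

results = classifier("some_image.jpg")   # also accepts URLs or PIL images
print(results)
# e.g. [{'label': 'nsfw', 'score': 0.93}, {'label': 'normal', 'score': 0.07}]
```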

We use one of these models.

Right. That's what I thought. And it works for you, right? You've been able to measure the benefits in terms of fewer complaints or something?

Yes, it's easier and quicker to identify, and there's less we have to do. We still use PhotoDNA for their hashes, but this is the best solution so far.

It's a totally different approach, but maybe you're right: good LLMs are relatively new and maybe could be considered to supersede fuzzy hashing. But the main problem of reverse-engineering the compression algorithm (which is one way to think about LLMs) still exists. If you're thinking of working on this, I'm happy to see what I can do to help.

We tried Cloudflare’s integrated service and Microsoft’s PhotoDNA; they are OK, but they only compare against existing hashes and only support images, not videos. AI models scan it all, search existing hashes, and recognize unreported patterns.

Here's an example: https://huggingface.co/Falconsai/nsfw_image_detection ... Putting one of these (or actually multiple, and averaging the results...) behind an API endpoint is not too difficult, and I'd be happy to do it for any service which has a **way to measure the effectiveness** ... since I will not be reviewing any images manually (!), and YOU will not be reviewing any images manually (!), and I will be deleting all data a few milliseconds after it hits the model and returns a score, you must have SOME way of deciding if the service is useful. Like user complaints, or blocks, or something like that... ideally you run a big enough service that you can measure "complaints/blocks per day" and see that the number goes down when you start using the scores I provide.
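
Here is a sketch of what that endpoint could look like, assuming FastAPI, one or more classifiers averaged together, and a made-up /score route; nothing is persisted beyond the request.

```python
# Hypothetical scoring endpoint, roughly as described above: accept an image,
# run it through one or more open-source classifiers, average the "unsafe"
# scores, and return only a float. Model list, label name, and route are
# assumptions for illustration; nothing is written to disk.
import io

from fastapi import FastAPI, File, UploadFile
from PIL import Image
from transformers import pipeline

MODELS = ["Falconsai/nsfw_image_detection"]   # could be several, averaged
classifiers = [pipeline("image-classification", model=m) for m in MODELS]

app = FastAPI()

def unsafe_score(results: list[dict]) -> float:
    """Pull the score for the unsafe label out of one pipeline result."""
    return next((r["score"] for r in results if r["label"] == "nsfw"), 0.0)

@app.post("/score")
async def score(file: UploadFile = File(...)) -> dict:
    image = Image.open(io.BytesIO(await file.read())).convert("RGB")
    scores = [unsafe_score(clf(image)) for clf in classifiers]
    return {"score": sum(scores) / len(scores)}   # 0.0 = clean, 1.0 = flagged
```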

As discussed in this thread, making these scores public is potentially dangerous, but providing a service that simply scores images, especially if that service is only offered to a small number of entities who can be trusted to use it only to help them delete something... is something Microsoft has been doing for decades, so I can't see any particular risk in it.

But I only want to build this if someone can say "yes, I'll be able to measure the effectiveness somehow"... because doing this without measurement of any kind is useless, right?

That's a good approach, but it could be tricky on nostr. Maybe you could scrape NIP-56 reports? I know those still get published by some clients (which is awful, but I can't convince the devs to stop).
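
If NIP-56 reports are the measurement signal, counting them over time is straightforward: they are kind 1984 events, so a "reports per day" baseline can be pulled from any relay that stores them. The relay URL below is a placeholder.

```python
# Count NIP-56 report events (kind 1984) from a relay so "reports per day"
# can be compared before and after scoring is enabled.
import asyncio
import json

import websockets

RELAY = "wss://relay.example.com"   # placeholder relay

async def count_reports(since_unix: int) -> int:
    count = 0
    async with websockets.connect(RELAY) as ws:
        await ws.send(json.dumps(["REQ", "reports", {"kinds": [1984], "since": since_unix}]))
        async for raw in ws:
            msg = json.loads(raw)
            if msg[0] == "EVENT":
                count += 1
            elif msg[0] == "EOSE":    # relay has sent all stored events
                break
    return count

# print(asyncio.run(count_reports(since_unix=1_700_000_000)))
```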

It's not awful.

It's illegal

Awful is very different from illegal.

Ok, but legality has bearing on awfulness

You don't have to worry about definitions. These models are very smart and are happy to provide you with a float between zero and one. Then you just set a threshold on what scores you will tolerate. No need to engage further with the question.
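
Thresholding really is that simple; the only decision is the cutoff, which is a policy choice rather than anything the model gives you. A trivial illustration (the 0.85 value is made up):

```python
THRESHOLD = 0.85   # example value; tune against your false-positive tolerance

def should_reject(score: float, threshold: float = THRESHOLD) -> bool:
    return score >= threshold

print(should_reject(0.97))   # True
print(should_reject(0.12))   # False
```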

Semantics aside, I'd use that if I ran a relay. I'm not sure it makes sense to bake it into a client? Especially since it would be sending the data to a third party server

Right. Actually, baking it into a client would be the most dangerous option, too.

Yes, more often than not, what is awful is legal and what is good is illegal.

We use an AI model to recognize and report it; if it is not sure, we manually confirm and report. It’s not tolerated in any way and we will do what is needed to rid nostr of it.

Same as only getting notes from specific relays, one could only load media from specific domains that do CSAM filtering, like nostr.build.

Maybe the client app could run some local AI model and scan media from the "other" domains.
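
A client-side sketch combining the two ideas above: trust media from an allowlist of hosts that already filter uploads, and only score everything else locally. Host names and score_locally() are placeholders.

```python
# Combine a trusted-host allowlist with a local fallback scan. Host names and
# score_locally() are placeholders; the threshold is an example value.
from urllib.parse import urlparse

TRUSTED_MEDIA_HOSTS = {"nostr.build", "image.nostr.build"}   # assumed allowlist

def score_locally(url: str) -> float:
    """Placeholder for an on-device classifier (see the pipeline example above)."""
    return 0.0

def should_display(url: str, threshold: float = 0.85) -> bool:
    host = urlparse(url).hostname or ""
    if host in TRUSTED_MEDIA_HOSTS:
        return True                   # host already filters uploads
    return score_locally(url) < threshold

print(should_display("https://image.nostr.build/abc.jpg"))   # True (trusted host)
```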

ATproto folks use this apparently

https://www.thorn.org/solutions/for-platforms/

Anyone can run a relay, it's cheap:

Probably because of that cost, Bluesky says they'll mix in something AI-based from this newly launched Roost.

https://blog.mozilla.org/en/mozilla/ai/roost-launch-ai-safety-tools-nonprofit/