I just don’t know if a “domain blacklist” will
work well. I think it will be too slow, too incomplete, and ultimately ineffective. I think the only way to do this at scale is for relays to have a way to score images and videos, and to make sure they delete and don't re-broadcast any that get a bad score.
I think the only sane way to debug this might be to visit the links provided in my last message?
Right. Actually, baking it into a client would be the most dangerous approach, too.
You don't have to worry about definitions. These models are very smart and are happy to provide you with a float between zero and one. Then you just set a threshold on what scores you will tolerate. No need to engage further with the question.
Right. That's what I thought. And it works for you, right? You've been able to measure the benefits in terms of fewer complaints or something?
So are you doing this only by keyword? Or something else?
Right. But FYI, you don't need Microsoft's service; you can roll your own with open-source models that return a confidence score between 0 and 1. A lot of those models are totally open source -- https://huggingface.co/docs/transformers/en/tasks/image_classification -- they're just classification models that return a value between 0 and 1. And they're pretty fast and efficient, since Google and others have been fighting this issue for 20+ years and have developed very good, efficient models. (Which work 99.5% of the time. I think it's impossible to get to 100%.)
Here's an example: https://huggingface.co/Falconsai/nsfw_image_detection ... putting one of these (or actually multiple, and averaging the results) behind an API endpoint is not too difficult, and I'd be happy to do it for any service which has a **way to measure the effectiveness** ... since I will not be reviewing any images manually (!), and YOU will not be reviewing any images manually (!), and I will be deleting all data a few milliseconds after it hits the model and returns a score, you must have SOME way of deciding if the service is useful. Like user complaints, or blocks, or something like that... ideally you run a big enough service that you can measure "complaints/blocks per day" and see that the number goes down when you start using the scores I provide.
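To make it concrete, here's a rough sketch of the scoring side, assuming the Hugging Face `transformers` pipeline and that the Falconsai model's labels are "nsfw"/"normal" (that label naming is an assumption -- check the model card):

```python
# Rough sketch, not production code. Assumes `transformers` + `torch` are
# installed and that the model's labels are "nsfw"/"normal" (per its model card).
from transformers import pipeline

MODEL_IDS = ["Falconsai/nsfw_image_detection"]  # add more model ids to average over

classifiers = [pipeline("image-classification", model=m) for m in MODEL_IDS]

def nsfw_score(image_url: str) -> float:
    """Return an averaged 0..1 'nsfw' probability across all loaded models."""
    scores = []
    for clf in classifiers:
        results = clf(image_url)  # pipelines accept a URL, local path, or PIL image
        scores.append(next((r["score"] for r in results if r["label"] == "nsfw"), 0.0))
    return sum(scores) / len(scores)

# Then a service just picks a threshold it will tolerate:
# if nsfw_score(url) > 0.8: reject_or_delete(url)
```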
As discussed in this thread, making these scores public is potentially dangerous. But providing a service that simply scores images, especially if that service is only offered to a small number of entities who can be trusted to use it only to help them delete something... is something Microsoft has been doing for decades; I can't see any particular risk in it.
But I only want to build this if someone can say "yes, I'll be able to measure the effectiveness somehow"... because doing this without measurement of any kind is useless, right?
Broadcasting public notes that identify CSAM is probably illegal, because it could be construed as "advertising" that content. I think the only option we really have long-term, at least in the US, is for someone(s) to run a service that crawls the network, matches images against Microsoft's hash database product (closed source, but for good reasons, since hash databases can be reverse engineered), and reports matches to NCMEC. A bonus would be to do the same thing but analyze note text for exploitation keywords. Privately hosted and encrypted content is pretty much immune to this, fortunately/unfortunately. nostr:nprofile1q9n8wumn8ghj7enfd36x2u3wdehhxarj9emkjmn99ah8qatzx96r2amr8p5rxdm4dp4kzafew3ehwwpjwd48smnywycrgepndcu8qd3nx36hguryvem8xdr5d56hsmt5xfehzemtxejxkeflvfex7ctyvdshxapaw3e82egprfmhxue69uhhyetvv9ujumn0wd68yanfv4mjucm0d5hszrnhwden5te0dehhxtnvdakz7qg3waehxw309ahx7um5wgh8w6twv5hsz9nhwden5te0wfjkccte9ekk7um5wgh8qatz9uqzpxvf2qzp87m4dkzr0yfvcv47qucdhcdlc66a9mhht8s52mprn7g98p5le2 currently checks a hash database for all images uploaded, and I believe they report matches.
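PhotoDNA itself is closed, but just to show the mechanics of hash matching, here's a sketch using an open perceptual hash (the `imagehash` library) as a stand-in -- the example hash, the "known_bad" set, and the 5-bit threshold are all made up:

```python
# Illustration only: PhotoDNA is closed-source, so this uses the open
# `imagehash` library (pip install imagehash pillow) to show the mechanics.
from PIL import Image
import imagehash

# Hypothetical hash list -- in reality these come from NCMEC/industry programs,
# not something you build yourself.
known_bad = {imagehash.hex_to_hash("d879f8f8f0f0e0c0")}

def matches_hash_list(image_path: str, max_distance: int = 5) -> bool:
    h = imagehash.phash(Image.open(image_path))
    # Subtracting two ImageHash objects gives their Hamming distance;
    # a small distance means a likely match. The threshold here is arbitrary.
    return any(h - bad <= max_distance for bad in known_bad)
```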
As non-cypherpunk as this all is, I think it's the only real option we have unless Ross Ulbricht's ZKANN idea gets built. We need to demonstrate to anyone watching that we take the problem seriously and take measures to self-regulate. This is similar to the bitcoin KYC/AML argument. If we don't want financial surveillance or legal restrictions on social media, we should help law enforcement actually chase down the people who are the problem rather than presenting ourselves as the scapegoat. See iftas.org for some work being done in the fediverse on this.
Agreed that "broadcasting identifications" is probably illegal. If only there were a provably safe way to do it where the exact location of the content (i.e., the URL) wasn't communicated, but you still gave clients a 99% certainty of being able to block it. Basically, you somehow give clients the power to test any URL to decide if it's bad, but you don't provide the actual scores in a way that could be used directly for search or other content discovery... Maybe this is just impossible, because what's to stop someone else from running a script to test every image, reproduce the scores, and then try to use or access the bad ones?
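The closest dumb approximation I can think of is publishing only hashes of the flagged URLs, so nothing human-readable goes over the wire -- a sketch below -- but it doesn't stop exactly the attack described above: anyone can hash every URL they crawl and test each one.

```python
import hashlib

# Hypothetical scheme: the service publishes only SHA-256 digests of flagged
# image URLs -- never the URLs or the scores themselves.
flagged_digests: set[str] = set()  # filled from the published digest list

def should_block(url: str) -> bool:
    return hashlib.sha256(url.encode()).hexdigest() in flagged_digests
```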
I just fear it's technically impossible to tell someone to delete something, but also NOT allow them to run a search to find all the things that they should be deleting.
Sure, but by "delete it" -- this is obviously what Google and Instagram do... but Nostr is a distributed system, right? We can't actually delete it from the internet! -- but if we had a safe way to report it to other clients, that would actually exceed the requirements of the law, right? So like: "delete it, report it, safely assist other clients to delete it".
Right, well our job is to report it and find a "safe" and "fair" way to block it.
All agreed except for one possible improvement -- what if it were possible to run an automated service that could proactively look for these images and somehow publish a score in a "safe" way -- a score that could only be used to PREVENT clients from being shown the note with the bad image... but could never be used to SEARCH for such images....
I am not trolling.
I do think it would be good to have a system for identifying harmful stuff. It would be a nice workaround that would work today and I would definitely adopt it at https://njump.me/ because we keep getting reports from Cloudflare. I tried some things but they didn't work very well, so if you know how to do it I'm interested.
However the long-term solution is paid relays, community relays, relays that only give access to friends of friends of friends, that kind of stuff.
OK, so thinking about it more, in light of what nostr:npub1q3sle0kvfsehgsuexttt3ugjd8xdklxfwwkh559wxckmzddywnws6cd26p says ... 1) Obviously the spec to use would be the LABEL spec, NIP-32 -- not sure why I didn't figure that out to begin with... https://github.com/nostr-protocol/nips/blob/master/32.md 2) My original idea of "publicly publish a score for each image" is completely impossible and a terrible idea... because, of course, the bad guys could just use the service in the reverse of the way it's intended! Anyway, half of the problem -- running a service which produces scores -- is something I could absolutely do -- basically process millions of images and spit out scores for them -- but the other half... how to let clients or relays use these scores WITHOUT also giving them a "map to all the bad stuff" at the same time? I'm not smart enough currently to come up with a solution. It might involve something fancy involving cryptography or "zero knowledge proofs" or things that are generally out of my intellectual league.
Actually, now that I think about it, this "public score" thing could potentially be a stupid idea, since, what's to stop baddies from actually USING the database of "bad scores"? Yeah. I see what you are saying.
OK so it would be a "label" event, with a score: https://github.com/nostr-protocol/nips/blob/master/32.md -- great. So, question is, will any clients actually want to consume/use these labels?
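For reference, such a label might look something like this -- the "content-safety" namespace and stuffing the score into `content` are my own assumptions; NIP-32 doesn't standardize either:

```python
import json, time

# Unsigned NIP-32 label event (kind 1985); pubkey/id/sig get filled in per
# NIP-01 before publishing. Namespace and score format are assumptions.
label_event = {
    "kind": 1985,
    "created_at": int(time.time()),
    "tags": [
        ["L", "content-safety"],                      # hypothetical namespace
        ["l", "nsfw", "content-safety"],              # the label itself
        ["r", "https://cdn.example.com/abc123.jpg"],  # the media URL being labeled
        ["e", "<event-id-of-the-note>"],              # optionally, the note carrying it
    ],
    "content": json.dumps({"score": 0.97, "model": "Falconsai/nsfw_image_detection"}),
}
```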
Hm.... You don't have to download the content, and you don't have to train a model. There are a TON of good open-source models you can use for this kind of scoring... https://huggingface.co/docs/transformers/en/tasks/image_classification ... and -- don't tell me that a script operating on a server, which iterates over EVERY image published to a large number of servers, simply pulls in the image data, gives it a score, and deletes the image -- that this is somehow illegal. That's impossible... This is literally what Google Images and Bing Images and a zillion other services do all day! It's a crawler & analyzer!
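i.e., roughly this -- the image only ever exists in memory and only the score survives (same kind of classifier as the Hugging Face link above; the "nsfw" label name is an assumption from that model's card):

```python
import io

import requests
from PIL import Image
from transformers import pipeline

classifier = pipeline("image-classification", model="Falconsai/nsfw_image_detection")

def score_and_forget(url: str) -> float:
    """Fetch the image into memory, score it, keep nothing but the float."""
    resp = requests.get(url, timeout=10)
    img = Image.open(io.BytesIO(resp.content)).convert("RGB")
    results = classifier(img)
    # Nothing touches disk; the image bytes go out of scope right here.
    return next((r["score"] for r in results if r["label"] == "nsfw"), 0.0)
```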
The obvious solution is to just give up working on it and spend the next few years smoking weed and going snowboarding instead. But the actual solution I think has to be some kind of distributed scoring system.
Not sure if you are serious or just trolling the idea. But -- like each individual relay implements its own scoring system? Seems like a ton of duplicated effort.
For working with an NWC string, use nostr:npub1yxp7j36cfqws7yj0hkfu2mx25308u4zua6ud22zglxp98ayhh96s8c399s's tool: https://supertestnet.github.io/nwc_tester/ ... you'll really want to test it with that BEFORE trying it in a client.
#asknostr among the problems that Nostr faces, the child porn problem is a very, very, very bad problem.
A VERY bad problem.
What is the current thinking among developers about how to deal with this?
Nobody likes censorship, but the only solution I can think of (SO FAR) is running an image identification service that labels dangerous stuff like this, and then broadcasts a list of (images, notes, users?) that score high on the "oh shit this is child porn" metric. Typically these systems just output a float between zero and 1, which is the score....
Is anyone working on this currently?
I have a good deal of experience running ML services like image identification at scale, so this could be something interesting to work on for the community. (I also have a lot of GPU power, and anyway, if you do it right, this actually doesn't take a ton of GPUs, even for millions of images per day.)
It would seem straightforward to subscribe to all the nostr image uploaders, generate a score (say, a float with 1.0 being "definitely child porn" and 0.0 being "not child porn"), and then broadcast events of some kind to relays with this "opinion" about the image/media?
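Roughly, something like this -- the relay URL, the kind-1 filter, and the image-URL regex are all just assumptions on my part:

```python
import asyncio
import json
import re

import websockets  # pip install websockets

# Extension list is an assumption, not anything Nostr specifies.
IMAGE_URL_RE = re.compile(r"https?://\S+\.(?:png|jpe?g|gif|webp)", re.IGNORECASE)

async def watch_relay(relay_url: str, score_fn) -> None:
    async with websockets.connect(relay_url) as ws:
        # NIP-01 subscription for text notes (kind 1).
        await ws.send(json.dumps(["REQ", "img-watch", {"kinds": [1]}]))
        async for raw in ws:
            msg = json.loads(raw)
            if msg[0] != "EVENT":
                continue
            event = msg[2]
            for url in IMAGE_URL_RE.findall(event["content"]):
                score = score_fn(url)  # e.g. an averaged classifier score, 0..1
                print(event["id"], url, round(score, 3))
                # ...here is where the scoring "opinion" event would be published.

# asyncio.run(watch_relay("wss://relay.example.com", nsfw_score))  # nsfw_score: your scoring fn
```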
Maybe someone from the major clients like nostr:npub1yzvxlwp7wawed5vgefwfmugvumtp8c8t0etk3g8sky4n0ndvyxesnxrf8q or #coracle or nostr:npub12vkcxr0luzwp8e673v29eqjhrr7p9vqq8asav85swaepclllj09sylpugg or nostr:npub18m76awca3y37hkvuneavuw6pjj4525fw90necxmadrvjg0sdy6qsngq955 has a suggestion on how this should be done.
One way or another, this has to be done. 99.99% of normies, the first time they see child porn on #nostr ... if they see it once, they'll never come back.....
Is there an appropriate NIP to look at? nostr:npub180cvv07tjdrrgpa0j7j7tmnyl2yr6yr7l8j4s3evf6u64th6gkwsyjh6w6 ? nostr:npub1l2vyh47mk2p0qlsku7hg0vn29faehy9hy34ygaclpn66ukqp3afqutajft ? nostr:npub16c0nh3dnadzqpm76uctf5hqhe2lny344zsmpm6feee9p5rdxaa9q586nvr ?
I have done a ton of video dev stuff, let me know if u have questions.
nostr:npub137c5pd8gmhhe0njtsgwjgunc5xjr2vmzvglkgqs5sjeh972gqqxqjak37w nice Damus script to get zaps working! U run razko node? We run megalithic.me