yup, just need to have an AI model/query engine alongside the locally stored data. it would be viable for both local and remote services, in almost identical configuration.
Yes, I describe this in the quoted thread.
I know people have been staring at us, wondering why the one team chock full of mathematicians, computer scientists, electrical engineers, cognitive scientists, physicists, and data analysts is the only project that isn't "doing AI stuff".
We _are_ doing it, but we're spending more time thinking about how to do it _well_, and practicing different techniques, because it needs to be useful going forward, or we're wasting our time.
Me: I’m so smart, reading this fine conversation between Laeserin and David.
Also me: 
the main reason i have something to contribute is that i'm working intensively on my pet project, https://x.realy.lol, a really fast, comprehensive search engine with standard filters and full text indexes. in the process of elaborating the ideas i get from learning how these systems work, i've realised exactly how AI assistants on search engines work.
i think the "replacing humans" chatter is way overstated, because this is really just like any tool that increases productivity: it reduces the manpower needed for a given task, and all technology ultimately displaces human labor, which is the reason it gets invented. i think some people, who are misanthropic and cruel-minded, actually hate all people and fantasize about living in a world where everyone says yes to them instead of pointing out valid disagreements, or even getting so mad at the actions being taken that they take up arms. these people increasingly use soft power and psychological manipulation, but when that fails they bring out their hired guns (and in the future that will be drones too).
best to focus on how it empowers us, because we can also defend ourselves with soft power, and that's really central to the whole mission of bitcoin and nostr, in my opinion.
They will use AI to replace the humans
...unless we figure out a way to leverage AI to make humans *more* than the AI alone. So that the AI turns the human into a Superhuman, rather than obsoleting humans.
Which is why we have to think, carefully, about the best way to implement AI, rather than just vibe-coding in an LLM and calling it a day.
as i see it, the central battlefield of principle is the ability to evaluate information in the face of the endless barrage of bullshit designed to confuse and divide people
we are in the beginning phases of this battle. until recently they only had "big data" datamining for this, but now they are teaming that up with language models to improve their ability to find things, and not just to find them (since they have roped most of the world onto the social networks that feed them data) but also to fabricate data to poison the discourse
this also highlights the idea that we can potentially use these tools offensively, to poison our own data. someone posted a meme about this a while back, about deliberately jumbling and confusing their communications to obstruct AIs. this could be done more effectively with LLMs.
on the defensive side, this is why AUTH is so important, we need to be able to restrict access to our discourse so it doesn't become intelligence data for attacks on us.
like, one way to do the data poisoning would be to have an AI mangle your texts and tell the server to return the poisoned version on the publicly readable side, while only users tagged in a conversation can read the real text
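roughly, the serving logic could look like this (a minimal sketch in Go, assuming NIP-42 style auth; the Event struct and the serveEvent/isTagged helpers are made-up names for illustration, not realy's actual API):

package relay

// Event is a stripped-down stand-in for a nostr event.
type Event struct {
	ID      string
	PubKey  string
	Tags    [][]string
	Content string
}

// isTagged reports whether pk appears in a "p" tag of ev.
func isTagged(ev *Event, pk string) bool {
	for _, t := range ev.Tags {
		if len(t) >= 2 && t[0] == "p" && t[1] == pk {
			return true
		}
	}
	return false
}

// serveEvent decides which copy a reader gets: the real event for the
// author and tagged pubkeys, the AI-mangled copy for everyone else.
// authedPK is empty when the connection has not completed AUTH.
func serveEvent(original, poisoned *Event, authedPK string) *Event {
	if authedPK != "" && (authedPK == original.PubKey || isTagged(original, authedPK)) {
		return original
	}
	return poisoned
}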
Good to know that I'm not the only nostard observing this discussion. 
i'm just back on nostr after a 3-day hiatus... she got me thinking about what AI search assistants actually mean functionally... and i figured it out.
part of the reason why is that i've actually been using the AI assistant in my intellij IDE. it has a row of buttons to include context, and that lit a bulb in my brain: "ohh, the AI is using the data as part of the prompt"
haha, duh. it really is a duh, but i doubt almost anyone has even realised it. UI can change everything
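for the curious, the "lightbulb" is literally this simple (a minimal sketch, assuming plain-text search results; buildPrompt is a made-up name, not any particular assistant's API):

package assistant

import (
	"fmt"
	"strings"
)

// buildPrompt pastes retrieved documents into the prompt ahead of the
// question, which is all those "include context" buttons really do.
func buildPrompt(question string, results []string) string {
	var b strings.Builder
	b.WriteString("Answer using only the context below.\n\nContext:\n")
	for i, r := range results {
		fmt.Fprintf(&b, "[%d] %s\n", i+1, r)
	}
	fmt.Fprintf(&b, "\nQuestion: %s\n", question)
	return b.String()
}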
I have a question regarding your relay: since it's going to be an essential component of this search engine you're building, what form will it take upon completion? Will it be like a typical search engine such as Brave or DuckDuckGo, or something else?
it's just an http-accessible event database with filters + extra + full text search capability
the extra is that it lets you query by instance-internal references to events (database serial keys), lets you fetch just event IDs instead of whole events, and will have a full text search that returns event IDs as results
all of this is core database functionality, and something i've learned from my work at the fiat mine is that internal things are very often useful. and that cat, the snipey, whiny one who apparently has an ok business serving nostr clients, gave me the idea of opening up the internal references, as it's a far simpler sync mechanism than retardonegentropy. nodes are subjective, after all, and their internal references make a lot more sense as a way to request data.
that doesn't mean they'll give it all up, tho. some data is private, so there would be gaps, but that's literally internal and private. the reference is just there because it's the simplest way to enable another party to sync data: this is my zero, ok, here it is, etc. the database search can also find these events by all the other possible ways, but it's a one-stop shop to just ask for "the event you call 203", as sketched below.
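a client-side sketch of what that could look like (the /sync endpoint, its parameters and the response shape are assumptions for illustration, not realy's actual wire format):

package syncer

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// SerialEvent pairs a node's internal serial with the event it refers
// to; Event is omitted where the data is private, leaving a gap.
type SerialEvent struct {
	Serial uint64          `json:"serial"`
	Event  json.RawMessage `json:"event,omitempty"`
}

// fetchFrom asks a peer for a batch of events after a given serial,
// i.e. "give me everything past the event you call 203".
func fetchFrom(base string, after uint64, limit int) ([]SerialEvent, error) {
	resp, err := http.Get(fmt.Sprintf("%s/sync?after=%d&limit=%d", base, after, limit))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var evs []SerialEvent
	if err = json.NewDecoder(resp.Body).Decode(&evs); err != nil {
		return nil, err
	}
	return evs, nil
}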
If you search for event IDs, wouldn't it just pull up the whole event regardless?
it would, if you didn't add an extra index like the one i have, which includes id, pubkey, kind and timestamp
oh, my docs are inaccurate. i wrote a variable length integer encoding for this, but anyway:
// FullIndex is an index designed to enable sorting and filtering of results found via
// other indexes, without having to decode the event.
//
// [ prefix ][ 32 bytes full event ID ][ 8 bytes truncated hash of pubkey ][ 2 bytes kind ][ 8 bytes created_at timestamp ][ 8 bytes serial ]
with this index, you can get the id without decoding the event, you can filter out pubkeys and kinds, and sort them in ascending or descending order of timestamp.
it's the bulkiest index in the tables i designed, but it's there to avoid decoding events. i also made decoding events as fast as possible, using a streaming decoder scheme i wrote by hand, which is the fastest binary codec for nostr events there is; i don't see how anyone can make it faster or more compact. ah yes, and i added further logic so e and p tags are stored compactly as binary. the p tags needed an exception because some clients put hashtags in them in follow lists, so a p tag is 1 byte of marker plus 32 bytes for the pubkey, 33 bytes total, which is literally half the size of hex encoding.
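a sketch of that 33-byte p tag idea (the marker value and exact framing are assumptions; the real codec's details differ):

package codec

import (
	"encoding/hex"
	"errors"
)

const pTagMarker = 0x01 // hypothetical marker value

// encodePTag packs a 64-char hex pubkey into 33 bytes: marker + raw key.
func encodePTag(pubkeyHex string) ([]byte, error) {
	pk, err := hex.DecodeString(pubkeyHex)
	if err != nil || len(pk) != 32 {
		return nil, errors.New("not a 32 byte hex pubkey")
	}
	return append([]byte{pTagMarker}, pk...), nil
}

// decodePTag reverses encodePTag back to the hex form clients expect.
func decodePTag(b []byte) (string, error) {
	if len(b) != 33 || b[0] != pTagMarker {
		return "", errors.New("not a binary p tag")
	}
	return hex.EncodeToString(b[1:]), nil
}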
i also added some further indexes, pubkey/kind and pubkey/created_at, and ... well, i'm not finished defining them exactly. my goal is to enable searching for an event, without decoding it, within 2 or 3 index iterations. that's the optimum.
i've also written tests that show how big the events and indexes are: 203mb of events (my cache from my realy.mleku.dev) yields 64mb of indexes, and the actual binary store of the events is 130mb. so storing index plus events, without complex, expensive compression, works out to about the same size as the raw json of the events. i classify that as a total win.
haha lol, great example of how comments on stuff become lies. the serial is actually the first field, because it's faster to find entries in the tables that way. lol. i need to fix that.
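with the corrected field order, building the key could look something like this (a sketch assuming fixed-width fields and a 1-byte prefix; the real encoding uses the variable length integers mentioned above):

package index

import "encoding/binary"

// fullIndexKey packs the fields big-endian into one key, so a raw byte
// compare within a prefix sorts by serial, and id, pubkey hash, kind
// and created_at can be read back without decoding the event itself.
func fullIndexKey(prefix byte, serial uint64, id [32]byte, pkHash [8]byte, kind uint16, createdAt uint64) []byte {
	k := make([]byte, 0, 1+8+32+8+2+8)
	k = append(k, prefix)
	k = binary.BigEndian.AppendUint64(k, serial)
	k = append(k, id[:]...)
	k = append(k, pkHash[:]...)
	k = binary.BigEndian.AppendUint16(k, kind)
	k = binary.BigEndian.AppendUint64(k, createdAt)
	return k
}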
Do you mind if I dm you?
check my profile and use telegram i think
obviously it might not be ideal. i will revive my matrix element installation
nostr:npub1whgjzsdrjxv5csrz2q032hpwxnjp4rpulxl0nexh62vz2dzc683qh8wqu9 matrix would be more secure so: @mleku17:matrix.org