The same limitations apply… it’s still an LLM limited to a knowledge domain, acting as an approximation machine. We still need pathways for generating high-quality data for it to be trained on… if we use its “intelligence” as an excuse to stop producing that data, it will stop learning.
Yeah, I agree it should help keep duplicate questions off the platform. And it’s for sure in ChatGPT’s best interest that humans continue to produce high-quality data, ranked by quality even, for more effective training. I think the ultimate and best solution is one where both systems work together. A massive part of why ChatGPT is so good is that the curation of data on Stack Overflow is so good. Garbage in, garbage out, as the old adage goes.
Yeah, the more it has access to, the more capable it gets. Like giving it access to modify your Terraform to deploy infrastructure. I think it’s really compelling, but I also think there’s still a long way to go. Another ML engineer in a community I’m in put together a great piece on the current state of AI:
https://medium.com/@khalil.alami/the-boring-state-of-artificial-intelligence-152244075d7f
Data sharing negotiations for and by AI systems 🤣
For sure!! I’m not saying these things aren’t possible. But I don’t think they’re solved by an LLM alone.
Right! But it’s still relying on a human in this context. Maybe it breaks our current ecosystem by not storing that information in a publicly accessible place like Stack Overflow, making us dependent on the knowledge it hoards. But what you’re describing still requires missing data to be provided to it by a human.
The other thing to note… models that generate data and then train themselves on the data they generated tend to drift further and further away from meaningful content. If you had a model that wrote novels and eventually it was training only on the novels it wrote, they would all start to sound the same and converge. Part of why it is so compelling is that the data it is trained on was created by a large number of humans with different perspectives and goals. You’d have to create a lot of individual models with niche goals and perspectives to sustainably let it create indefinitely without new human content.
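To make that concrete, here’s a toy sketch (my own illustration, every name invented): a “model” that just resamples its own output each generation, with the output becoming the next training set. Diversity drops every round:

```python
# Toy sketch of model collapse: a "model" that trains on (i.e. resamples)
# its own generated output. All names here are invented for illustration.
import random

random.seed(0)
corpus = [f"idea-{i}" for i in range(1000)]  # 1000 distinct human "ideas"

for generation in range(10):
    # The model "generates" new data by drawing from what it learned,
    # and that output becomes the next generation's training set.
    corpus = [random.choice(corpus) for _ in range(len(corpus))]
    print(f"gen {generation}: {len(set(corpus))} distinct ideas left")
# The count of distinct ideas falls every generation; without fresh human
# input the corpus converges toward repeating the same few things.
```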
Depends on the language and/or support. Could be that someone figures it out through a conversation, a guess, checking the site-packages source code, or decompiling code. Or if it’s truly a closed-source project, via a client support rep. Ultimately someone posts about it somewhere, and then ChatGPT knows the new argument.
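For the “checking the source” route, here’s a quick sketch: Python will literally tell you the signature the installed version accepts (json is just a stand-in for whatever package changed):

```python
# How a human might "figure it out": ask the installed package directly.
# json is only a stand-in here for whatever library changed its API.
import inspect
import json

print(inspect.signature(json.dumps))
# Prints the arguments the installed version actually accepts,
# regardless of what older docs or Stack Overflow answers claim.
```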
Also - think about security bugs that are reported. It knows about them because they get reported on centralized platforms. Security pen testers find them. Right now, it’s simply a language model… it has no ability to execute pen testing. It relies on humans to produce content (release notes, open-source code, quality reports, or GitHub threads) to learn. It’s still very much piggybacking off of human-generated data. And not to say models won’t generate that data in the future… but it will be a series of models with different tasks that communicate with each other.
Ok, so let’s say there’s an update to a package that isn’t open source. I’m using the new version and can’t figure out why it’s breaking. It has no idea that the new version changed an argument name and the one I’m using is deprecated. Maybe it could, if the release notes are in a place it has access to and it can probabilistically relate them together. But if the information isn’t online (not conclusions from existing data points, but actually missing data points), it can’t solve that. That’s what a lot of Stack Overflow questions are. That’s why so many are so old… they get asked when the data points don’t exist in an accessible place. If you’re asking a question on Stack Overflow about data that already exists, you’re just going to be pointed to an older ticket.
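Something like this, say (package and argument names entirely made up; the stub plays the role of the closed-source client):

```python
# Hypothetical sketch: "v2" of a closed-source client renamed a keyword
# argument. This stub stands in for the real package; every name here
# is invented for illustration.

def connect(host, timeout=30):  # v2 signature: the argument is now 'timeout'
    print(f"connecting to {host} with a {timeout}s timeout")

# Code written against v1, where the argument was called 'timeout_secs':
try:
    connect(host="db.example.com", timeout_secs=30)
except TypeError as err:
    # This error is all the caller sees; if the rename was never posted
    # anywhere the model trained on, it can't connect error to cause.
    print(err)
```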
I don’t think it’s fair to say no one understands how they work. Not being able to interpret weights is not the same thing.
To give it access to do this it would need to 1. read all code for all projects and understand every implementation and version, and/or 2. have access to execute code to observe and troubleshoot. It’s an LLM… it’s just synthesizing data that exists and drawing conclusions and derivations from it. Impressive? Yes. Better than search? Yes. Has access to derived thought based on things that have happened but haven’t yet been put on the internet… no, not yet.
It can take two concepts and combine them… which is what I do when I’m searching Stack Overflow. Rarely does one post solve my problem. I read multiple and synthesize that data into a solution. To me, that’s not the same as answering net-new questions for new problems. And it is for sure a “not yet” problem, but we’re definitely not there yet.
We’ve had a lot of discussions about this. ChatGPT can’t really answer any engineering questions that haven’t previously been answered on Stack Overflow (or other sites, but I mean, it’s pretty much it). So it may pull some traffic from the lurkers who are just trying to get their questions answered. But if a question doesn’t already have an answer, someone will have to ask a human on Stack Overflow. Essentially, ChatGPT just offers better search over it, but doesn’t actually replace the recommendations for net-new problems.
Whoa, that sounds nifty 🐈 I’ll have to look into that
I’m sorry 😢 I hope it goes smoothly and recovery is swift!
Not exactly swing, but I used to do shag dancing in college. Purely for fun, socially, and I’m not very good 🥴 but it was always so much fun!
I’m super excited to see what comes out of the work you’re doing with health. Nostr as a solution for specific industries really has a lot of value. Thanks for all that you do for the community!