Dustin Dannenhauer
da18e9860040f3bf493876fc16b1a912ae5a6f6fa8d5159c3de2b8233a0d9851
DVM maximalist
Building DVMDash - a monitoring and debugging tool for DVMs https://dvmdash.live
Live DVM Stats here: https://stats.dvmdash.live
Hacking on ezdvm - a python library for making DVMs https://github.com/dtdannen/ezdvm

I've long wanted a decentralized alternative to Reddit, so I’m very excited about #nostrudel

I tried creating a community called ai-papers (nostr:naddr1qqykz6fdwpshqetjwvq3yamnwvaz7tm0venxx6rpd9hzuur4vgqs6amnwvaz7tmwdaejumr0dspzpkscaxrqqs8nhaynsahuz6c6jy4wtfhkl2x4zkwrmc4cyvaqmxz3qvzqqqyx7ctnxh3s)

But I see it doesn’t show when you go to https://nostrudel.ninja/#/communities/explore

Is there a plan to show what communities exist without logging in?

nostr:npub1ye5ptcxfyyxl5vjvdjar2ua3f0hynkjzpx552mu5snj3qmx5pzjscpknpr

Here's my review of the ChatDev paper, let me know what you think!

https://arxiv.org/abs/2307.07924

Qian, C., Cong, X., Yang, C., Chen, W., Su, Y., Xu, J., ... & Sun, M. (2023). Communicative agents for software development. arXiv preprint arXiv:2307.07924.

This paper presents a new approach to software development where many calls to LLMs in different roles (CEO, CTO, programmer, reviewer, etc.) build an entire software project. The novelty of the paper seems to be the specific roles of the LLMs and the flow of calls between LLMs to design, write, review, and test code. I also liked that there were artistic agents that made assets to be used in the software (like player icons and button icons). My biggest issues with the paper are that it doesn’t formally define “thought instruction” or provide clear enough examples of it, and that the software projects it generated were small, at only a few hundred lines of code. Experimentally, I’m not sure how well it generalizes, because it is not clear whether the evaluation dataset was used in the training of GPT-3.5.

Questions I had about this paper:

1. Are we sure the dataset for instruction-following (Camel[23]) is not in the training set of the LLM? If it is, perhaps these results won’t generalize well to new software projects.

Comments:

1. GPT-3.5 was used instead of GPT-4, so the results might be better with GPT-4.

2. One of the key contributions seems to be the “thought instruction” mechanism, but there is no clear example of exactly what that is. On pages 6 and 7 it says “thought instruction includes swapping roles to inquire about which methods are not yet implemented and then switching back to provide the programmer with more precise instructions to follow”. Is that all “thought instruction” is? I recommend a more formal or complete description of it in the paper.

3. The generated projects were pretty small in terms of lines of source code, with the largest project coming in at 359 lines.

4. The paper makes a big deal about the price of their approach being tiny ($0.2967), but given that the projects are so small, this is not necessarily cheaper than human developers for more realistic software systems if the cost grows faster than linearly with the number of lines of code.

5. “Fortunately, with the thought instruction mechanism proposed in this paper, such bugs can often be easily resolved by importing the required class or method.” - this sentence is vague and doesn’t fully explain how “thought instruction” solves these types of bugs.

6. In the discussion section it says this approach to software development is “training-free” but the LLM has to be trained, so that’s not exactly fair to say.

7. I appreciated the examples in the appendix; however, I wish the paper were clearer about exactly what “thought instruction” is, for example by showing the same task with and without thought instruction.
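
To make comment 4 a bit more concrete, here is a rough Python sketch of how the quoted cost might extrapolate to larger projects. Everything beyond the two numbers taken from the review ($0.2967 and 359 lines) is an assumption for illustration only: pairing that cost with the 359-line project, the power-law form of the scaling, the exponents, and the 100,000-line target are all hypothetical and do not come from the paper.

```python
def projected_cost(lines, base_cost=0.2967, base_lines=359, exponent=1.0):
    """Extrapolate cost assuming it grows as lines**exponent.

    base_cost and base_lines are the figures quoted in the review; treating
    them as belonging to the same project is an assumption made here.
    exponent=1.0 means linear scaling; values above 1.0 mean each extra
    line costs more as the project grows (hypothetical).
    """
    return base_cost * (lines / base_lines) ** exponent


if __name__ == "__main__":
    target_lines = 100_000  # hypothetical size of a "realistic" system
    for exponent in (1.0, 1.5, 2.0):
        cost = projected_cost(target_lines, exponent=exponent)
        print(f"exponent={exponent}: ${cost:,.2f}")
```

The point is only that the conclusion depends heavily on the exponent, which the paper's small projects can't really tell us.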

Awesome, would love to talk to you about PlebAI! And sharing organizing responsibilities is also great! Did you see we scheduled a meet and greet on Nov 15 here?

https://www.meetup.com/nova-nostr-meetup-group/events/296975741/

nostr:npub1l2vyh47mk2p0qlsku7hg0vn29faehy9hy34ygaclpn66ukqp3afqutajft the note I'm replying to started as a draft on shipyard.pub, then I scheduled it and it's now posted. But on primal and nostur it shows up as a note with nothing in it. Did I make a mistake on shipyard when moving it from a draft to scheduled? Or is this a client rendering issue? One thing I did was hand-type the nostr pubs (I wrote nostr:npub... for two people I wanted to reference because tab complete wasn't working for those folks), in case that is what broke it.

Great to know. I saw there was a BitDevs meetup but haven’t been to one yet. I do want to put some energy into a Nostr event - I can let you know so you can pass on to any folks in your group that may be interested.

Haha love it - it’s a bit of a hike for me to get out there but I’m sure it’s worth it :)

I can’t make it this coming weekend but I joined the meetup group and will look for the next one!

Awesome! Seems there’s some interest. It would happen the 2nd week of November or later. Will send out a poll for dates / times soon.

I’ve been thinking of organizing one in the Washington DC / Northern Virginia area - I don’t believe there’s one here yet.

If you’re in the Washington DC area come check out the World Culture Festival - it starts tomorrow!

It's taking up the whole National Mall, and more than 17,000 performers from all over the world will perform. Free to attend!

https://wcf.artofliving.org/

$prism$ $boost$ (am I doing this right?)