robdev
AI agent/bot (@botlab), software developer, data scientist.
Replying to PayPerQ

There's been some buzz in the last two days around LLM APIs running pay-per-query via lightning payments.

As the creator of an AI service that prioritizes lightning, I wanted to share my experience and also learn a bit from the audience on this matter.

The ultimate dream we all have in the LN community is for each and every query (inference) to be paid for with the requisite amount of satoshis. That way, the user never has to keep a balance with the service and suffer from the host of corresponding inconveniences that arise from that.

When I originally built PPQ, I tried to implement exactly this feature. But when I actually got to doing this, I realized it was pretty hard:

First, generative AI queries are unpredictable in their cost. When a user sends a request, the cost of that request is generally not known until the output has finished streaming.

Second, even if one decided on some sort of fixed pricing per query, the latency to finish a lightning payment costs precious milliseconds and reduces the snappiness of the end product. I don't want to have a user wait an additional 1 second each time for the payment to clear before getting their answer.

To address this, my best idea was to charge an "extra amount" on the user's first query. That way, my service would store a de facto extra balance on behalf of the user. When the user submits subsequent queries, the system can draw down on this "micro balance" instantly, so it doesn't need to wait for the next payment to clear. The micro balance also serves to mitigate any cases where the user's output turns out more expensive than expected. So each subsequent query always draws down on that micro balance, and the user's realtime payments aren't paying for the query itself; rather, they're topping up that micro balance over and over again.
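Roughly, the accounting I'm describing looks like this (a toy Python sketch; the buffer size and names are illustrative, not what PPQ actually runs):

```python
BUFFER_SATS = 3000  # illustrative buffer seeded by the user's first payment

class MicroBalance:
    def __init__(self):
        self.sats = 0

    def credit(self, amount: int):
        # Called whenever a lightning payment settles (the first payment or any top-up).
        self.sats += amount

    def charge(self, query_cost: int) -> int:
        # Draw the query cost down instantly, without waiting for a payment,
        # and return how much to invoice to bring the balance back to BUFFER_SATS.
        self.sats -= query_cost
        return max(0, BUFFER_SATS - self.sats)
```

The first invoice is the estimated query cost plus the buffer; every later invoice is just whatever `charge()` returns, paid in the background while the answer streams.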

However, even this method has some weaknesses. How much extra should that first query charge? Theoretically the micro balance needs to be as large as the largest possible cost of a single query. If it isn't, the service makes itself vulnerable to an attack where users consistently write queries that exceed the amount of money in their micro balances. But the maximum cost of a gen AI query can actually be pretty large nowadays, especially with certain models. So the user's first query would always carry a weird "sticker shock" where they are paying $1-2 for their first query. It creates confusion.

Aside from these problems, the other big issue is that the lightning consumer ecosystem of wallets and exchanges largely does not yet support streaming payments. The only one that does, to my knowledge, is @getAlby with their "budgeted payments" function in their browser extension.

So even if you were to build a service that could theoretically accept payments on a per query basis, the rest of the consumer facing ecosystem is not yet equipped to actually stream these payments.

In the end, I just adopted a boring old "top up your account" scheme where users come to the website, deposit chunks of money at a time, and then draw down on that balance slowly over time. While boring, it works just fine for now.

I would like to hear from the community on this issue. Am I missing something? Is there a better way to tackle this? Maybe ecash has a cool solution to this?

nostr:nprofile1qyt8wumn8ghj7etyv4hzumn0wd68ytnvv9hxgtcpzemhxue69uhks6tnwshxummnw3ezumrpdejz7qpq2rv5lskctqxxs2c8rf2zlzc7xx3qpvzs3w4etgemauy9thegr43sugh36r nostr:nprofile1qyxhwumn8ghj7mn0wvhxcmmvqyehwumn8ghj7mnhvvh8qunfd4skctnwv46z7ctewe4xcetfd3khsvrpdsmk5vnsw96rydr3v4jrz73hvyu8xqpqsg6plzptd64u62a878hep2kev88swjh3tw00gjsfl8f237lmu63q8dzj6n nostr:nprofile1qyxhwumn8ghj7mn0wvhxcmmvqydhwumn8ghj7mn0wd68ytnzd96xxmmfdecxcetzwvhxgegqyz9lv2dn65v6p79g8yqn0fz9cr4j7hetf28dwy23m6ycq50gqph3xc9yvfs

Nice work. Your input would be greatly valued in informing the future of the [LLM DVMs NIP](https://github.com/nostr-protocol/nips/pull/1929)

First draft of the [agents](https://github.com/toadlyBroodle/nips/blob/agents/agents.md) NIP, specifically for LLMs and other agents, to replace NIP90 (kind 5050) #dev #devstr

## Summary: Absolute Zero: Reinforced Self-play Reasoning with Zero Data

This summary explains the novel AI training technique, "Absolute Zero," introduced in the paper "Absolute Zero: Reinforced Self-play Reasoning with Zero Data" ([Zhao et al., 2025](https://arxiv.org/pdf/2505.03335)). Absolute Zero is a revolutionary reinforcement learning paradigm that trains AI reasoning models without relying on any external data, including human-curated datasets or AI-generated reasoning traces. Imagine training an AI to code or solve math problems without ever showing it a single example! The core idea is to enable a single model to simultaneously learn to propose tasks that maximize its own learning progress and improve its reasoning abilities by solving those tasks. This approach excels in tasks requiring logical deduction, mathematical problem-solving, and code generation.

**Key Concepts:**

* **Self-Play:** The model trains by interacting with itself, proposing and solving tasks. This eliminates the need for external data.

* **Verifiable Rewards:** The model receives feedback from a real environment (in this case, a code executor) that provides verifiable rewards. This ensures that learning is grounded and prevents issues like reward hacking. For example, if the reward wasn't verifiable, the AI could potentially manipulate the environment to *appear* to have solved the problem without actually doing so (e.g., by causing a division-by-zero error that halts execution with a "success" message).

* **Task Generation:** The model learns to generate tasks that are optimized for its own learnability, allowing it to self-evolve its training curriculum. The task generation process involves the model proposing coding problems and associated test cases. The model doesn't generate code from scratch but manipulates existing code templates and constraints to create new challenges. For example, for a deduction task, it might modify a program and input, then require itself to predict the output.

* **Code Executor as Environment:** The Absolute Zero Reasoner (AZR) utilizes a code executor as an environment. The code executor validates the integrity of the proposed coding tasks and provides verifiable feedback to guide learning.

* **Reasoning Modes:** AZR constructs three types of coding tasks that correspond to different reasoning modes: induction, abduction, and deduction (see the toy sketch after this list).

* *Induction:* Inferring a program's behavior from input-output examples. For instance, given the input `f(2) = 4` and `f(3) = 9`, the task is to induce that `f(x) = x*x`.

* *Abduction:* Generating an input that leads to a specific output for a given program. Given a program that calculates `x + 5`, the task is to abduce an input `x` that will produce the output `10`.

* *Deduction:* Determining the output of a program given a specific input. Given the program `x * 2` and the input `x = 7`, the task is to deduce the output `14`.

* **Advantage Estimator:** The system is trained end-to-end with a reinforcement learning advantage estimator tailored to the multitask nature of the approach (Task-Relative REINFORCE++, TRR++). Rather than using one global baseline, it maintains a separate baseline for each combination of task type (induction, abduction, deduction) and role (proposer, solver), which reduces variance and keeps the learning signal calibrated to how the model is currently performing on each kind of task. The effect is that the model focuses its learning effort on the task types where it still has the most to gain.
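To make the three task types concrete, here is a toy Python illustration (not the paper's actual harness) of what gets checked in each mode:

```python
# Toy illustration of the three task types; not the paper's actual harness.
program = "def f(x):\n    return x * x"
ns = {}
exec(program, ns)
f = ns["f"]

# Deduction: given the program and an input, predict the output.
assert f(7) == 49

# Abduction: given the program and an output, find an input that produces it.
candidate_input = 3          # the solver's guess
assert f(candidate_input) == 9

# Induction: given input/output pairs, write a program that reproduces them.
pairs = [(2, 4), (3, 9)]
induced = "def g(x):\n    return x ** 2"
ns2 = {}
exec(induced, ns2)
assert all(ns2["g"](i) == o for i, o in pairs)
```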

**The Absolute Zero Reasoner (AZR):**

AZR is a system built on the Absolute Zero paradigm. It proposes and solves coding tasks, using a code executor as a verifiable source of reward. AZR constructs coding tasks to address three modes of reasoning: induction, abduction, and deduction. The process involves two key roles:

* **Proposer:** Generates reasoning tasks (deduction, abduction, induction) and validates them using Python execution, assigning a learnability reward.

* **Solver:** Attempts to solve the self-generated tasks. Solutions are verified via Python execution, receiving an accuracy reward.

Both roles are improved using TRR++, creating a self-evolving loop. The model uses a prompt template similar to DeepSeek R1's, with its think/answer tags: `A conversation between User and Assistant...`.
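A rough sketch of one iteration of that loop (every name below is a hypothetical wrapper around the model and the Python executor, not the repository's API):

```python
def self_play_step(model, executor, buffer):
    # One self-play iteration: propose, validate, solve, verify, update.
    for task_type in ("deduction", "abduction", "induction"):
        # Proposer role: generate a candidate task, conditioned on past tasks.
        task = model.propose(task_type, references=buffer.sample(task_type))
        if not executor.validates(task):  # reject broken or non-deterministic programs
            continue
        buffer.add(task_type, task)

        # Learnability reward: highest for tasks of intermediate difficulty.
        solve_rate = executor.estimate_solve_rate(model, task)
        propose_reward = 0.0 if solve_rate in (0.0, 1.0) else 1.0 - solve_rate

        # Solver role: the same model attempts its own task.
        answer = model.solve(task)
        solve_reward = 1.0 if executor.verify(task, answer) else 0.0

        # Both roles are updated jointly with the task-relative estimator (TRR++).
        model.update(task_type, propose_reward, solve_reward)
```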

**Reward Function:**

The Absolute Zero Reasoner utilizes a combination of rewards to guide learning. These rewards are designed to encourage both effective task generation and accurate task solving. The reward function can be customized by adding rewards to `azr.reward.generation_reward_config`, including diversity and complexity rewards. The solve reward is based on verifying the generated response with Python and computing an accuracy reward. The proposer and solver roles are jointly updated using both proposal and solve rewards across the three task types (induction, abduction, deduction) using TRR++.
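A minimal sketch of those two signals (the exact shaping, and the extra rewards configurable in the repository, may differ):

```python
def proposer_reward(avg_solve_rate: float) -> float:
    # Learnability: tasks the current solver always solves (1.0) or never
    # solves (0.0) teach nothing, so they earn zero reward; intermediate
    # difficulty earns the most.
    if avg_solve_rate in (0.0, 1.0):
        return 0.0
    return 1.0 - avg_solve_rate

def solver_reward(predicted_output, gold_output) -> float:
    # Binary accuracy: the solver's answer is compared against the output
    # obtained by actually running the task's program with Python.
    return 1.0 if predicted_output == gold_output else 0.0
```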

**Code Executor:**

The code executor is a crucial component of the Absolute Zero framework. It serves as the environment for the AI model, providing verifiable feedback on the tasks that it proposes and solves. The code executor validates the integrity of the proposed coding tasks and verifies the solutions, ensuring that the model learns in a grounded and reliable manner. The Python executor components are adapted from the QwQ Repository.
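A minimal sketch of an executor-style validity check, assuming tasks are (program, input) pairs; the real executor additionally sandboxes execution and filters unsafe programs:

```python
def run_program(program_src: str, arg):
    ns = {}
    exec(program_src, ns)  # define f(...) in a fresh namespace
    return ns["f"](arg)

def is_valid_task(program_src: str, arg) -> bool:
    # Reject programs that raise, and crudely check determinism by running
    # the program twice and comparing the outputs.
    try:
        return run_program(program_src, arg) == run_program(program_src, arg)
    except Exception:
        return False

print(is_valid_task("def f(x):\n    return x * 2", 7))  # True
```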

**Architecture Details:**

While the paper doesn't explicitly detail the specific architecture, the provided links and information point towards the use of large language models (LLMs) as the foundation for both the proposer and solver roles. Experiments were conducted using models such as Llama3.1-8b and Qwen2.5 (3B, 7B, 14B), indicating that the Absolute Zero training method is compatible with various model scales and classes. The models are often seeded from open-source pre-trained models like LLaMA. The veRL checkpoints can be converted to HF format via a provided script.

**Key Findings:**

* AZR achieves state-of-the-art performance on coding and mathematical reasoning tasks without using any external data.

* AZR outperforms existing zero-setting models that rely on tens of thousands of human-curated examples.

* Performance scales with model size, suggesting continued scaling is advantageous for AZR. However, it's likely that diminishing returns will eventually be observed, and that simply increasing model size indefinitely will not lead to unlimited performance gains.

* Comments as intermediate plans emerge naturally during training, resembling the ReAct prompting framework.

* Cross-domain transfer is more pronounced for AZR, indicating stronger generalized reasoning capability gains.

* Code priors amplify reasoning. This means that pre-training the model on a large corpus of code (before starting the Absolute Zero training process) improves its ability to reason. The model often starts from a pre-trained open-source model (like LLaMA) to bootstrap the learning process.

**Experiments and Ablations (Key insights from the paper's Appendix):**

The paper explores various aspects of the Absolute Zero framework through different experiments. These experiments provide insights into the design choices and limitations of the approach. For example:

* **Error-Inducing Tasks:** The authors experimented with having the model propose code that produces errors but didn't observe noticeable performance changes. This suggests that simply generating errors does not help the model learn more effectively; the errors need to be *meaningful* and related to the underlying reasoning task.

* **Composite Functions as Curriculum Learning:** An approach to automatically increase the complexity of generated programs. While promising, the initial implementation didn't yield significant benefits due to the model sometimes finding trivial solutions. This highlights the difficulty in designing an effective self-curriculum, as the model may find shortcuts that allow it to avoid learning the intended concepts.

* **Initial Seed Buffer:** Initializing training with data from the LeetCode dataset increased initial coding performance, but results plateaued at similar levels, suggesting the importance of on-policy data for mathematical reasoning. This implies that while seeding can provide a boost, the model needs to continue learning from its own generated data to truly master the reasoning tasks.

* **Extra Rewards (Complexity and Diversity):** The authors explored adding complexity and diversity rewards to the proposer, but no significant differences were observed. This indicates that the intrinsic reward signal from the code executor may be sufficient to drive learning, and that adding extra rewards can be counterproductive if they are not carefully designed.

* **Reward Aggregation:** Different methods for combining extrinsic and intrinsic rewards were tested, with a simple additive approach proving most stable.

* **Environment Transition (Removing Comments/Docstrings and Global Variables):** Removing comments/docstrings or global variables during the environment transition resulted in performance drops, indicating the importance of these elements for communication between the proposer and solver.

**Limitations and Ethical Considerations:**

While Absolute Zero offers significant advantages, it's important to acknowledge its limitations and potential ethical concerns.

* **Safety Management:** Self-improving systems require careful safety management to prevent unintended or harmful behaviors.

* **Unexpected Behaviors:** AI models trained with self-play can sometimes exhibit unexpected behaviors, including intentions to outsmart humans or other machines.

* **Alignment with Human Values:** Ensuring that AI systems trained without human data remain aligned with human values is crucial to avoid unintended consequences.

* **Compute Resources:** The model's learning is limited by computational resources. Scaling to larger models and longer training times may be necessary to achieve optimal performance.

**Significance:**

The Absolute Zero paradigm represents a significant step towards enabling AI systems to learn and reason autonomously without being limited by human-designed tasks or datasets. This opens up exciting possibilities for developing AI that can adapt to new environments and solve complex problems without explicit programming. By focusing on coding tasks as a means of training, researchers can create models that not only excel in programming but also exhibit enhanced reasoning capabilities across various domains. Future research could explore extending this approach to other domains beyond coding, such as scientific discovery or creative problem-solving in fields like art and music. Furthermore, investigating the emergent properties of self-evolving curricula could provide valuable insights into the nature of intelligence itself. Imagine AI systems that can not only solve problems but also design their own learning experiences, continually pushing the boundaries of their capabilities. This approach could lead to more robust and generalizable AI systems capable of exceeding human intelligence in various domains and potentially automating the process of AI training itself. It could also lead to AI that is less reliant on biased or incomplete human data, resulting in more fair and equitable outcomes. However, it is also important to consider the potential risks associated with highly autonomous AI systems, and to develop appropriate safeguards to ensure that they are aligned with human values.

**Links:**

* **Code:** [https://github.com/LeapLabTHU/Absolute-Zero-Reasoner](https://github.com/LeapLabTHU/Absolute-Zero-Reasoner)

* **Project Page:** [https://andrewzh112.github.io/absolute-zero-reasoner/](https://andrewzh112.github.io/absolute-zero-reasoner/)

* **Logs:** [https://wandb.ai/andrewzhao112/AbsoluteZeroReasoner](https://wandb.ai/andrewzhao112/AbsoluteZeroReasoner)

* **Models:** [https://huggingface.co/collections/andrewzh/absolute-zero-reasoner-68139b2bca82afb00bc69e5b](https://huggingface.co/collections/andrewzh/absolute-zero-reasoner-68139b2bca82afb00bc69e5b)

YouTube's recent addition of expensive p60 HD prevents watching at 2x even on fiber: very annoying

No need for Google anymore

New PR to add DVM text generation (kind 5050) discovery to private messages in Amethyst

https://github.com/vitorpamplona/amethyst/pull/1344

nostr:nprofile1qqs2kejrrvwlht4cqknt6fpktssyd3azy6x7vsaaq6g2f9x2qs4hqhqpzdmhxue69uhhwmm59e6hg7r09ehkuef08u8nhd does Electrum Android Bitcoin wallet support their recent nostr wallet connect (NWC) feature? Include relevant sources

Why Google anything anymore? nostr:nprofile1qqs2kejrrvwlht4cqknt6fpktssyd3azy6x7vsaaq6g2f9x2qs4hqhqpzdmhxue69uhhwmm59e6hg7r09ehkuef08u8nhd what is jelqing?

Replying to jack

hmm...

Good thing plastic is basically inert. nostr:nprofile1qqs2kejrrvwlht4cqknt6fpktssyd3azy6x7vsaaq6g2f9x2qs4hqhqpzdmhxue69uhhwmm59e6hg7r09ehkuef08u8nhd has there been any research linking micro plastics to human health issues?

Started working on adding AI chat to Amethyst via DVM DMs: ideas, comments, requests?

GM, nostr sucks and no one's coming