Daniel Wigton
Catholic stay at home father of 6. Interested in spaceflight, decentralized communication, salvation, math, twin primes, and everything else.

This note contains no sarcasm.

Llama3.3 70B is the most useful model that I have tried. Anything smaller and you have to stick to very general knowledge. Asking for niche knowledge from a small model is a recipe for hallucinations.

I use it both for conversational AI, to help me narrow search terms, and for coding help via the Continue AI VS Code plugin. I use, I think, DeepSeek Coder for autocomplete.

The main drawback is that it is somewhat slow. I get 3.3 tokens per second, which is equivalent to talking to someone who types 150 words per minute.

That is actually helpful because it is not so slow as to be intolerable but not so instant that I don't try to figure things out on my own first.

It does require some decent hardware though. I've got a 4090, a 13900K, and 64 gigabytes of RAM running at 6400 MT/s. That last number is key. The 4-bit quantization of llama3.3 is 42 GB. With 24 GB of VRAM, that leaves 18 GB that have to be processed by the CPU for each token.

The result is that the GPU is actually not doing much. You probably don't need a 4090, just as much VRAM as you can get. A 5090 with 32 GB of VRAM should be able to do about 6 tokens per second simply for having only 10 GB to process on the CPU.
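If you want to play with the numbers yourself, here is a rough back-of-the-envelope sketch in Python. It assumes decoding is bottlenecked on streaming the CPU-resident weights from system RAM once per token, and it infers the effective bandwidth from my observed 3.3 tokens per second; the model size and VRAM figures are the ones above, everything else is an estimate.

```python
# Rough estimate of token rate when a quantized model spills out of VRAM.
# Assumption: decoding is bound by streaming the CPU-resident weights
# from system RAM once per token; GPU time is ignored.

MODEL_GB = 42.0      # 4-bit quant of llama3.3 70B
OBSERVED_TPS = 3.3   # measured with 24 GB of VRAM (4090)

def cpu_resident_gb(vram_gb: float) -> float:
    """Weights that don't fit in VRAM and must stream from system RAM."""
    return max(MODEL_GB - vram_gb, 0.0)

# Infer effective CPU-side bandwidth from the observed rate:
# 18 GB per token at 3.3 tokens/s is roughly 59 GB/s.
effective_bw_gbps = cpu_resident_gb(24.0) * OBSERVED_TPS

def estimated_tps(vram_gb: float) -> float:
    """Predicted tokens/s if CPU-side streaming stays the bottleneck."""
    spill = cpu_resident_gb(vram_gb)
    return float("inf") if spill == 0.0 else effective_bw_gbps / spill

# Sanity check on the typing analogy: 3.3 tok/s * ~0.75 words/token * 60 s
# comes out to about 150 words per minute.
print(f"effective bandwidth: {effective_bw_gbps:.0f} GB/s")
print(f"5090 with 32 GB VRAM: {estimated_tps(32.0):.1f} tokens/s")  # ~5.9
```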

I don't have to venture out. My wife works at a hospital so all I have to do is go to the gift shop's website, select something nice, add to cart, checkout, enter details, get notified that I can't checkout as guest since I apparently already have an account, check password manager, come up empty, use the reset password feature, check for reset email, check junk folder for reset email, start all over using a different email.

Such convenient process. If only I could just do some sort of digital signature to identify myself.

Didn't know what to do for Valentine's Day but luckily the flower shop sells these fancy Belgian style ales.

Peace comes from knowing that you are not the source and are not the end. I have no internal knowledge or being that is not a gift from above. I can accept that and trust that the giver loves me. Agency comes from trusting the giver to know what to give me, enough that I can use it freely.

Does anyone else feel like USAID is just being used as a fall-guy for the CIA/FBI/NSA/(whoever is tasked with logging my notes)?

Wordle 1,335 5/6

β¬›πŸŸ¨πŸŸ¨β¬›β¬›

β¬›πŸŸ¨β¬›β¬›πŸŸ¨

πŸŸ¨β¬›β¬›πŸŸ¨β¬›

πŸŸ¨πŸŸ©πŸŸ¨πŸŸ¨β¬›

🟩🟩🟩🟩🟩

Yes. It was the Zend Framework and Magento for me. I got so sick of having to learn new incantations that I created my own PHP MVC book of spells that, instead of doing "convention over configuration," did convention by configuration. I.e., the framework would do magic however I want (roughly the idea sketched below).

Still need to go back and finish modernizing that one.
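A minimal sketch of what I mean, in Python rather than the original PHP, with made-up names: handlers are still resolved by naming convention, but the convention itself comes from configuration instead of being hard-coded into the framework.

```python
# Minimal sketch of "convention by configuration": handlers are still
# resolved by a naming convention, but the convention itself is read
# from configuration instead of being hard-coded into the framework.
# All names here are hypothetical, not from the original PHP framework.

config = {
    "controller_pattern": "{name}_controller",  # URL segment -> factory name
    "action_pattern": "handle_{name}",          # URL segment -> method name
}

def blog_controller():
    class Blog:
        def handle_show(self, post_id):
            return f"showing blog post {post_id}"
    return Blog()

def resolve(path: str):
    """Map a path like /blog/show/42 to blog_controller().handle_show('42')."""
    segments = path.strip("/").split("/")
    factory = globals()[config["controller_pattern"].format(name=segments[0])]
    action = config["action_pattern"].format(name=segments[1])
    return getattr(factory(), action)(*segments[2:])

print(resolve("/blog/show/42"))  # showing blog post 42
```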

Perhaps the full model is cheaper than o1 etc., but the distills are pointless. They have the exact same per-token cost to run as the models they are based on, but perform worse.

Yup. Give 3.1 or 3.3 instructions to show their work step by step and correct any mistakes, and they do light years better. 3.3 is better even without a fancy prompt.
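The exact wording varies; something like this hypothetical system prompt (my own phrasing, nothing official) is enough:

```python
# One way to phrase the step-by-step instruction; the wording is my own.
SYSTEM_PROMPT = (
    "Show your work step by step. After each step, re-check the earlier "
    "steps and correct any mistakes before continuing. Only state the "
    "final answer once the reasoning is complete."
)
```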

I really don't get the DeepSeek love. I haven't tried the full model, but the 70B parameter distill is trash. It isn't actually a reasoning model; it merely apes being one. It is really good at sounding like it is reasoning, but it hallucinates far more than the llama3.3 model on which it is based.

I suspect the full model has similar features. It is reassuring to users to see it attempting a rationalization, but the actual output isn't that great.

Wordle 1,334 3/6

πŸŸ¨β¬›πŸŸ¨β¬›β¬›

β¬›πŸŸ©πŸŸ¨πŸŸ¨β¬›

🟩🟩🟩🟩🟩

Team nostr!

It is nice how a really good furnace will just work and work until the coldest night of the year.

Replying to YODL

Inspiring. And at the center of dust?

You do only live once, so you might as well do whatever you do correctly. You can't fix your life in post-production.

This is why I refuse to use millennial acronyms. Not going to have that blot on my life.