There has never been a better time to get your shit together.
Amazing that States have lasted as long as they have, honestly.
And if you gaze long enough into /dev/null, /dev/null will gaze back into you.
There’s no place like ~/
There’s a command-line tool called websocat that you can use to pipe WebSocket output to standard output. I’ve used it to slurp some content, but haven’t used it to grab everything.
Problem is that most relays limit the number of events they return per request, so you have to make multiple requests to get everything, for example by paginating with timestamp ranges.
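For anyone who wants to script the pagination, here’s a rough sketch of the same idea in Python rather than websocat. It assumes the third-party websockets package, a relay of my choosing (wss://relay.damus.io, just as a placeholder), and the standard since/until filter fields; adjust the window size and kinds to taste.

import asyncio
import json
import time

import websockets  # third-party: pip install websockets

RELAY = "wss://relay.damus.io"   # placeholder; any Nostr relay URL
WINDOW = 24 * 60 * 60            # walk backwards one day at a time

async def fetch_window(ws, since, until):
    # Request every kind-1 event in [since, until] and collect until EOSE.
    sub_id = f"page-{since}"
    await ws.send(json.dumps(["REQ", sub_id, {"kinds": [1], "since": since, "until": until}]))
    events = []
    while True:
        msg = json.loads(await ws.recv())
        if msg[0] == "EVENT" and msg[1] == sub_id:
            events.append(msg[2])
        elif msg[0] == "EOSE" and msg[1] == sub_id:
            await ws.send(json.dumps(["CLOSE", sub_id]))
            return events

async def main():
    until = int(time.time())
    async with websockets.connect(RELAY) as ws:
        for _ in range(30):          # last 30 days; keep looping to go further back
            since = until - WINDOW
            events = await fetch_window(ws, since, until)
            print(f"{since}..{until}: {len(events)} events")
            until = since

asyncio.run(main())

Caveat: if a relay hits its per-request cap even inside a one-day window, you’d have to shrink the window further; the sketch doesn’t handle that.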
Levels of fiat currency understanding:
0. Dollar is backed by gold
1. Nixon broke the peg. Dollar is fractional reserve.
2. Nope. As of May 2020, no fraction is required or enforced.
Levels of securities understanding:
0. You own a share.
1. No, the broker owes you a share.
2. No, the broker owes you a “security entitlement”—a pro-rata fraction of remaining assets AFTER secured creditors are made whole.
In both cases, it’s worse than almost everyone realizes. Even the people who know something.
I haven’t heard of one, but sounds like a great idea! Where would you store the block and UTXO data?
#asknostr nostr:note1ywtjs5vuyccplz9vy8rma3vutlwkyel0jt2t8djh7j332vlk37zs4frx7c
I think for that you have to do training. If you’re not specifically running a training operation, then the results are limited by the model’s context window.
The context window is the number of tokens the model can keep in mind and work on at a time. Current-generation models have context windows of about 10k tokens, meaning that anything you or it said further back than that is lost.
Note that some words are single tokens, while others require multiple tokens. Punctuation and whitespace take up tokens as well.
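If you want to see that concretely, here’s a tiny sketch using the tiktoken package (my assumption; local llama/mistral models ship their own tokenizers, but the behavior is the same in spirit).

import tiktoken  # third-party: pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["cat", "tokenization", "Hello, world!", "   "]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r} -> {len(ids)} token(s): {pieces}")

# Typically a short common word is one token, a longer word splits into
# several pieces, and punctuation/whitespace strings cost tokens of their own.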
If your graphics card has the capability and sufficient vRAM to fit the model, I believe GPT4ALL and ollama will detect that and use it automatically.
My gaming laptop has an NVidia RTX 2090 with 6GB vRAM. GPT4ALL uses it automatically when the model is small enough to fit, and otherwise runs on CPU, in my experience.
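If you want to check what you have to work with before pulling a model, here’s a small sketch that shells out to nvidia-smi. It assumes an Nvidia card with the driver tools on your PATH; the query fields are standard, but double-check on your own setup.

import subprocess

# Ask the driver for each GPU's name and memory headroom.
out = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,memory.total,memory.free", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout.strip()

for line in out.splitlines():
    name, total, free = [field.strip() for field in line.split(",")]
    print(f"{name}: {free} free of {total}")

# Rough rule: a model only runs fully on GPU when its file size (plus some
# headroom for context) fits in the free vRAM reported here; otherwise the
# launchers fall back to CPU (ollama can also offload just part of the model).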
Haha maybe! How did you make it?
For context, my usage was around brainstorming ideas for a sci-fi novel I intend to write. I tried llama3 briefly, then switched to dolphin-mixtral, a mixtral-based model that attempts to remove alignment and increase compliance.
I found that both had similar knowledge. For example, both were familiar with authors I wanted to reference like Robert Zubrin and Kevin Kelly.
Both seemed to think that climate change was a big issue that humanity needed to solve—even after I told them to ignore climate concerns for the purpose of the brainstorm.
Where they differed, as far as I could tell, was in bias around character design and in tone. Llama kept suggesting characters who were predominantly female doctors, mostly Indian/East Asian. Dolphin seemed more willing to suggest male characters.
Regarding tone, the prose llama3 produced tended to be chipper and lighthearted, even when I prompted it to be dark and cerebral. Dolphin-mixtral was more willing to offer darker prose.
These are my general impressions so far.
If you’re thinking about running an LLM at home for the first time, here are my top 4 tips:
1. Try GPT4ALL and/or ollama. These are launchers that help you download and interact with models. GPT4ALL is a GUI, while ollama is a command-line program. (There’s a minimal Python sketch of calling ollama right after these tips.)
2. Current models come in roughly two sizes: 7B and 22B parameters. These are ~4GB and ~40GB respectively, but they can be even bigger. If your GPU has computational capability AND sufficient vRAM, then the models can be run on GPU. If not, they’ll run on CPU, but more slowly. Try a 4GB model to start.
3. Although there are a relatively small number of popular architectures (llama, mistral, etc.), there are lots of variants of models to choose from. Hugging Face (terrible name) is the site to browse for models.
4. “Alignment” is the new word for bias (particularly the philosophical/political kind). A model tweaked to be maximally compliant and unbiased is called “unaligned”. The big mainstream models are “aligned” with the companies that produced them. Find unaligned models if you want less biased results. (I’ve been happy with the Dolphin line of models by Cognitive Computations).
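If you go the ollama route, here’s the minimal Python sketch I mentioned in tip 1. It assumes ollama’s default local port (11434), the dolphin-mixtral tag from tip 4 already pulled, and the third-party requests package; treat it as a sketch under those assumptions, not gospel.

import requests  # third-party: pip install requests

# ollama serves a local HTTP API; /api/generate does a one-shot completion.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "dolphin-mixtral",   # or "llama3", or any tag you've pulled
        "prompt": "Suggest three dark, cerebral sci-fi premises.",
        "stream": False,              # one JSON response instead of a token stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])

If you’d rather not script anything, running ollama run dolphin-mixtral in a terminal gets you an interactive session instead.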
Good luck!
Do you have a GPU with enough vRAM to fit it? Or are you just running on CPU?
Also, if you’re in that size range already, consider the Dolphin line of models from Cognitive Computations. They’re trained to be unaligned and maximally compliant with user requests.
nostr:npub1v9qy0ry6uyh36z65pe790qrxfye84ydsgzc877armmwr2l9tpkjsdx9q3h can you use several GPUs for ollama?
I’ve heard that NVidia cards have some kind of link capability where you can use two together. I don’t know much about it, I’m relatively new to this space.

