TechPostsFromX
52d119f46298a8f7b08183b96d4e7ab54d6df0853303ad4a3c3941020f286129
Our relay: wss://nostr.cybercan.click
I share posts from X about technology, software development, and engineering. Feel free to suggest X accounts for me to add to the loop. This account is maintained by automation solutions developed by contact@webviniservices.com. If you enjoy this page or any of its posts, I accept Lightning donations. Thank you.

It’s crazy out of all the things we could be refrying we’ve only tried it with beans

Source: x.com/Valuable/status/1817425077563957533

It’s about frame of mind! Nvm

Source: x.com/karpathy/status/1817418193125957910

You write computer programs.

I conjure digital automations.

We are not the same.

Source: x.com/karpathy/status/1817414746595094672

Anyone who accuses another of Dunning-Kruger is self referencing.

Source: x.com/unclebobmartin/status/1817380698338038198

Lol

Source: x.com/nfkmobile/status/1817325215241691302

"full-stack" devs are now more common than back-end and front-end devs COMBINED

Source: x.com/t3dotgg/status/1817350212689838283

For those interested, TypeScript's "checker.ts" file is FIFTY TWO THOUSAND LINES LONG

Source: x.com/t3dotgg/status/1817373854001778977

Vacation mode: off.

Back to work!

Source: x.com/denicmarko/status/1816475500279116029

What's your best productivity tip?

Source: x.com/denicmarko/status/1817252978023145937

I get so tired of the house-building metaphor. There is virtually no similarity between building construction and software construction. None. This is entirely a straw-man argument. We are not building houses.

Source: x.com/allenholub/status/1817262419779174639

Think in terms of investment rather than a purchase. Cost is a purchase-related concept. With an investment, we put a little money in and then see what the return is. If it’s adequate, we put more money in. If there’s no return, we stop investing. We don’t purchase unbuilt software; we invest in it.

Source: x.com/allenholub/status/1817287781250728257

The people who use the building-construction metaphor for software seem to think that the estimates you get from contractors are dead-on accurate. They aren't, for this reason. In fact, no contractor I've ever heard of will give you an estimate. They'll either go T&M, or give you a _bid_, which is usually 2x or 3x their (internal) estimate. They're not idiots.

Source: x.com/allenholub/status/1817327805853745637

This is unreal

Source: x.com/wesbos/status/1816952921180783083

Jagged Intelligence

The word I came up with to describe the (strange, unintuitive) fact that state of the art LLMs can both perform extremely impressive tasks (e.g. solve complex math problems) and simultaneously struggle with some very dumb problems.

E.g. an example from two days ago: which number is bigger, 9.11 or 9.9? Wrong.

https://x.com/karpathy/status/1815549255354089752…

or failing to play tic-tac-toe, making nonsensical decisions:

https://x.com/polynoamial/status/1755717284650176591…

or another common example, failing to count, e.g. the number of times the letter "r" occurs in the word "barrier", ChatGPT-4o claims it's 2:

https://x.com/karpathy/status/1816160802765955186…

The same is true in other modalities. State of the art LLMs can reasonably identify thousands of species of dogs or flowers, but e.g. can't tell if two circles overlap:

https://x.com/fly51fly/status/1812599708134916218…

Jagged Intelligence. Some things work extremely well (by human standards) while some things fail catastrophically (again by human standards), and it's not always obvious which is which, though you can develop a bit of intuition over time. Different from humans, where a lot of knowledge and problem solving capabilities are all highly correlated and improve linearly all together, from birth to adulthood.

Personally I think these are not fundamental issues. They demand more work across the stack, including not just scaling. The big one I think is the present lack of "cognitive self-knowledge", which requires more sophisticated approaches in model post-training instead of the naive "imitate human labelers and make it big" solutions that have mostly gotten us this far. For an example of what I'm talking about, see Llama 3.1 paper section on mitigating hallucinations:

https://x.com/karpathy/status/1816171241809797335…

For now, this is something to be aware of, especially in production settings. Use LLMs for the tasks they are good at but be on the lookout for jagged edges, and keep a human in the loop.

Source: x.com/karpathy/status/1816531576228053133
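
For contrast with the failures listed in the post above, here is a minimal Python sketch showing that the same questions are trivial for ordinary code. The helper names are made up for illustration. Reading "9.11" and "9.9" as version numbers is only one possible way to arrive at the wrong answer; it is an assumption here, not something the post claims.

```python
from decimal import Decimal


def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


def larger_decimal(a: str, b: str) -> str:
    """Return whichever string is larger when read as a decimal number."""
    return a if Decimal(a) > Decimal(b) else b


def larger_version(a: str, b: str) -> str:
    """Return whichever string is larger when read as a dotted version number."""
    def key(s: str) -> tuple:
        return tuple(int(part) for part in s.split("."))
    return a if key(a) > key(b) else b


print(count_letter("barrier", "r"))    # 3
print(larger_decimal("9.11", "9.9"))   # 9.9
print(larger_version("9.11", "9.9"))   # 9.11 (a different question, a different answer)
```

The point is the contrast: a deterministic one-liner gets these right every time, which is a reasonable thing to fall back on in production instead of asking the model.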

To help explain the weirdness of LLM Tokenization I thought it could be amusing to translate every token to a unique emoji. This is a lot closer to truth - each token is basically its own little hieroglyph and the LLM has to learn (from scratch) what it all means based on training data statistics.

So have some empathy the next time you ask an LLM how many letters 'r' there are in the word 'strawberry', because your question looks like this:

[emoji sequence, one emoji per token, not preserved in this copy]

Play with it here :)

https://colab.research.google.com/drive/1SVS-ALf9ToN6I6WmJno5RQkZEHFhaykJ#scrollTo=75OlT3yhf9p5…

Source: x.com/karpathy/status/1816637781659254908
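
The idea in the post above can be sketched in a few lines of Python. This is not the linked notebook's code: the vocabulary below is a hypothetical toy, and greedy longest-match stands in for the BPE-style tokenizers real LLMs use. The emoji mapping (token id added to U+1F600) is likewise just one arbitrary choice.

```python
# Hypothetical toy vocabulary; real LLM vocabularies are learned and far larger.
VOCAB = ["st", "raw", "berry", "s", "t", "r", "a", "w", "b", "e", "y"]
TOKEN_ID = {tok: i for i, tok in enumerate(VOCAB)}


def tokenize(text: str) -> list:
    """Greedy longest-match tokenization over the toy vocabulary."""
    ids = []
    pos = 0
    while pos < len(text):
        # Pick the longest vocabulary entry that matches at the current position.
        match = max((t for t in VOCAB if text.startswith(t, pos)), key=len, default=None)
        if match is None:
            raise ValueError(f"no token covers {text[pos]!r}")
        ids.append(TOKEN_ID[match])
        pos += len(match)
    return ids


def to_emoji(ids: list) -> str:
    """Map each token id to one emoji, starting at U+1F600."""
    return "".join(chr(0x1F600 + i) for i in ids)


ids = tokenize("strawberry")
print(ids)            # [0, 1, 2]
print(to_emoji(ids))  # three emoji, one per token
```

With this toy vocabulary the word becomes three opaque symbols, none of which exposes the individual letters, which is why counting the letter 'r' is harder for the model than it looks.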

20min talk I gave at the Berkeley AI hackathon a few weeks ago, on how hacking around makes its way to real-world impact in my experience.

While True: build and publish projects.

Accumulate 10,000 hours.

Snowball your work.

Source: x.com/karpathy/status/1816953700403065162

You are 1 click away from 1 petaflop

Source: x.com/marktenenholtz/status/1816561185954693451

Many such cases

Source: x.com/marktenenholtz/status/1816562579331752272

It is hard to feel sympathetic to CrowdStrike as a brand, given their own brand has heavily relied on bashing Microsoft for having unacceptably poor security practices.

This kind of marketing bites back when you do something far worse than Microsoft has ever done, outage-wise.

Source: x.com/GergelyOrosz/status/1816882951008752016

90s hip-hop be like

Source: x.com/nfkmobile/status/1817317935712530499