Imagine an empty room with 1000 coins on the floor. The room sits above a subway line, and when trains go by they jostle the coins and flip some over.
The room is locked and you have the only key to this room.
You come into this room for the first time on Tuesday and you count 300 heads. On Wednesday you count 400 heads. What do you believe the count was on Monday and why?
And what does this have to do with the universe at large?

Source: x.com/unclebobmartin/status/1824068375640678769
It's been sobering to talk with engineers who use these tools, day-in, day-out.
I sense that AI tooling is not yet quite where engineers want it to be (nor where some vendors claim their tools are), but it’s headed in the right direction of making us a lot more productive.

Source: x.com/GergelyOrosz/status/1824014434261418185
Uuuu maybe it’s time to bring my viking followers to X … haven’t touched fb since 2020 when their stupid ai messed up my ads account

Source: x.com/nfkmobile/status/1824021380913942893
The latest Windows exploit is so bad that they got the social team out here trying to make people update lmao

Source: x.com/t3dotgg/status/1824011521346982200
Neo also made a good list.
Check it out!

Source: x.com/denicmarko/status/1823826901301125413
“[Agile] does not attempt to forecast, it attempts to be as productive as possible while not knowing the future. It focuses on resilience as a risk management strategy, not anticipation.”
Chris Morris (@the_chrismo by way of @tottinge)

Source: x.com/allenholub/status/1823957536141349284
The advertisers left on X are… scam projects impersonating the owner of the site.
And X lets this fly: not even vetting obvious crypto scams.
This is an actual ad (and a scam) that is allowed to run here, while X is busy suing advertisers to come back.

Source: x.com/GergelyOrosz/status/1823955061266948249

What if fetch was, like, better? I made a video discussing the possibility and the utility of libraries like ky (and why they might not be worth it)

Source: x.com/t3dotgg/status/1823961237874729068
I talk about a lot of things (e.g. working without estimates, not using up-front "requirements," the user's story, &c.) that can be challenging. I've a class coming up in three weeks [Practical Agility: From Stories to Code -- https://holub.com/classes] that covers all these topics, and more. Check it out!

Source: x.com/allenholub/status/1823780904457920627
There are a few words/phrases that describe pretty-good things but immediately set off a warning klaxon in my mind:
innovative
disruptive
here's a great idea!
the customers will love this
team player
Steve Jobs would have...
Care to add to the list?

Source: x.com/allenholub/status/1823788075161718942
ooo well ... lol Grok being GROK rofl :D WHO ... GFY lol

Source: x.com/nfkmobile/status/1823896766963405096
If this happened at an Apple event they would have murdered the engineers responsible

Source: x.com/t3dotgg/status/1823938397469335971
I'll start:
@natmiletic
@Shefali__J
@RitikaAgrawal08
@webdevluc
@RaulJuncoV
@alexxubyte
@milan_milanovic
@csaba_kissi
@TreciaKS
and many more.
Give them a follow!

Source: x.com/denicmarko/status/1823677936962003418
Share your favorite content creators.

Source: x.com/denicmarko/status/1823675457016893861
For those of you who want a more in-depth look at how I work, I've scheduled a class. This class covers pretty much the whole process of creating software with agility, using the user's story as a driver. Hope to see you there!

Source: x.com/allenholub/status/1823185179013497267
Love curry

Source: x.com/wesbos/status/1823172515910631778
Attempting to migrate from Styled Components → Panda CSS, but keeping the styled.div`` API so I can have server component support

Source: x.com/wesbos/status/1823356244633047398
SQL injection-like attack on LLMs with special tokens
The decision by LLM tokenizers to parse special tokens in the input string (<|endoftext|>, etc.), while convenient-looking, leads to footguns at best and LLM security vulnerabilities at worst, equivalent to SQL injection attacks.
!!! User input strings are untrusted data !!!
In SQL injection you can pwn bad code with e.g. the DROP TABLE attack. In LLMs we'll get the same issue, where bad code (very easy to get wrong with current Tokenizer APIs and their defaults) will parse the input string's special-token descriptors as actual special tokens, mess up the input representations, and drive the LLM out of the distribution of its chat templates.
Example with the current huggingface Llama 3 tokenizer defaults:
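The original post shows this as a screenshot; a minimal Python sketch of the same default behavior (the checkpoint name is an assumption, and the ids follow the post's description):

    from transformers import AutoTokenizer

    # Hugging Face Llama 3 tokenizer with default settings
    tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

    # A user-supplied string that merely *mentions* a special token
    ids = tok.encode("<|end_of_text|>")
    print(ids)  # [128000, 128001] -- BOS silently prepended, and the
                # string parsed into the real <|end_of_text|> special token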
Two unintuitive things are happening at the same time:
1. The <|begin_of_text|> token (128000) was added to the front of the sequence.
2. The <|end_of_text|> token (128001) was parsed out of our string and the special token was inserted. Our text (which could have come from a user) is now possibly messing with the token protocol and taking the LLM out of distribution with undefined outcomes.
I recommend always tokenizing with two additional flags: disable (1) with add_special_tokens=False and (2) with split_special_tokens=True, and add the special tokens yourself in code. Both of these options are, I think, a bit confusingly named. For chat models, I think you can also use the Chat Templates API's apply_chat_template.
With this we get something that looks more correct, and we see that <|end_of_text|> is now treated as any other string sequence, and is broken up by the underlying BPE tokenizer as any other string would be:
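Reconstructing that second screenshot as code, continuing the sketch above (the exact BPE split of the lookalike string will vary):

    ids = tok.encode(
        "<|end_of_text|>",
        add_special_tokens=False,   # (1) don't silently prepend <|begin_of_text|>
        split_special_tokens=True,  # (2) treat special-token lookalikes as plain text
    )
    print(ids)  # ordinary BPE pieces; the special id 128001 no longer appears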
TLDR: imo, calls to encode/decode should never handle special tokens by parsing strings; I would deprecate this functionality entirely and forever. Special tokens should only be added explicitly and programmatically by separate code paths. In tiktoken, e.g., always use encode_ordinary. In huggingface, be safer with the flags above. At the very least, be aware of the issue, and always visualize your tokens and test your code. I feel like this stuff is so subtle and poorly documented that I'd expect somewhere around 50% of the code out there to have bugs related to this issue right now.
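For tiktoken, the contrast looks roughly like this (ids are for the cl100k_base encoding):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    # encode_ordinary never parses special tokens out of the input string:
    print(enc.encode_ordinary("<|endoftext|>"))  # plain BPE pieces

    # encode() refuses special-token strings by default (raises ValueError);
    # it only emits the real special id if you explicitly allow it:
    print(enc.encode("<|endoftext|>", allowed_special={"<|endoftext|>"}))  # [100257]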
Even ChatGPT does something weird here. At best it just deletes the tokens; at worst it confuses the LLM in an undefined way. I don't really know what happens under the hood, but ChatGPT can't repeat the string "<|endoftext|>" back to me:
Be careful out there.

Source: x.com/karpathy/status/1823418177197646104
Solution writeup is now public:

Source: x.com/marktenenholtz/status/1823400293284917758