noStrudel should be adding nostr: prefixes to bech32 nostr tokens, but there may be bugs in the stable version

I'm in the process of writing tests for this code in the next version of the app to ensure it happens 100% of the time. Also, the next version will only render embeds if they have the nostr: prefix, so it should be more obvious when something is broken
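For illustration only (this is not noStrudel's actual code), a minimal Go sketch of what prefixing bare bech32 tokens with nostr: could look like; the nostrEntity regex and ensurePrefix helper are assumed names:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Common NIP-19 bech32 entities; the character class after the "1" separator is the
// bech32 alphabet. The optional "nostr:" group lets us detect already-prefixed tokens.
var nostrEntity = regexp.MustCompile(`(nostr:)?\b(npub|nsec|note|nprofile|nevent|naddr)1[qpzry9x8gf2tvdw0s3jn54khce6mua7l]+`)

// ensurePrefix adds the nostr: prefix to any bare bech32 entity in the text.
func ensurePrefix(text string) string {
	return nostrEntity.ReplaceAllStringFunc(text, func(m string) string {
		if strings.HasPrefix(m, "nostr:") {
			return m // already prefixed, leave it alone
		}
		return "nostr:" + m
	})
}

func main() {
	fmt.Println(ensurePrefix("cc npub1ye5ptcxfyyxl5vjvdjar2ua3f0hynkjzpx552mu5snj3qmx5pzjscpknpr"))
}
```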

Discussion

there is also a problem with the recogniser/regexp for URLs: it doesn't allow braces (), but these are compliant and correct

might be a good idea to review any regex patterns you have got, to see if there aren't some other creepy crawlies in there

URLs can't be recognized by regex, it's an impossible task. Just do simpler URLs: let's get rid of all the characters, only visible ASCII symbols allowed, and that's it.

i'm not talking about recognising blabla.com, i'm talking about a correctly formed URL. it's very easy to write a regex for them, but i don't have to, because the stdlib in golang already has a full URL parsing library in net/url

i think you haven't read the RFC for it, because the accepted characters in a URL are very clear, as is the structure: proto://someth-ing.some_thing.sometld/path/to/thing?something, plus the occasional variant with a # based fragment identifier. it's pretty clear and simple, and about the only character in standard printable ascii not permitted in parameter keys and values is mostly just the & separator
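as a quick sketch of that point, here's net/url splitting scheme, host, path, query and fragment, parentheses included; the wikipedia-style URL is just an example input:

```go
package main

import (
	"fmt"
	"net/url"
)

func main() {
	// a wikipedia-style URL with unescaped parentheses in the path
	raw := "https://en.wikipedia.org/wiki/Go_(programming_language)?action=view#History"

	u, err := url.Parse(raw)
	if err != nil {
		fmt.Println("not a URL:", err)
		return
	}
	// net/url treats the parentheses as ordinary path characters
	fmt.Println(u.Scheme)   // https
	fmt.Println(u.Host)     // en.wikipedia.org
	fmt.Println(u.Path)     // /wiki/Go_(programming_language)
	fmt.Println(u.RawQuery) // action=view
	fmt.Println(u.Fragment) // History
}
```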

i'm quite sure there is a javascratch library for doing this correctly according to every possible nonsense that might show up in an RFC compliant implementation

the most efficient solution would be a state machine parser, but they are a little bit of work to get right... i basically wrote such a thing for json last year. part of the trick to why it saves time is that it assumes the input is minified, plus a couple of other things; there is a whole swathe of invalid constructions it would let past, but miraculously nobody bothers to mangle them anyhow (i don't even handle whitespace except to skip it until the next expected type of token)
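roughly the shape of that skip-until-the-next-expected-token trick, as a toy sketch rather than the actual parser mentioned above:

```go
package main

import "fmt"

// skipWhitespace advances past ASCII whitespace, the only thing the
// minified-input assumption leaves to handle between tokens.
func skipWhitespace(b []byte, i int) int {
	for i < len(b) && (b[i] == ' ' || b[i] == '\t' || b[i] == '\n' || b[i] == '\r') {
		i++
	}
	return i
}

// nextTokenKind peeks at the next significant byte and names the JSON token it starts.
func nextTokenKind(b []byte, i int) string {
	i = skipWhitespace(b, i)
	if i >= len(b) {
		return "eof"
	}
	switch b[i] {
	case '{':
		return "object"
	case '[':
		return "array"
	case '"':
		return "string"
	case 't', 'f':
		return "bool"
	case 'n':
		return "null"
	default:
		return "number"
	}
}

func main() {
	doc := []byte(`{"kind":1,"content":"hello"}`)
	fmt.Println(nextTokenKind(doc, 0)) // object
	fmt.Println(nextTokenKind(doc, 1)) // string
}
```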

also, you are wrong, wrong wrong wrong... here are the regexes i wrote to catch filesystem references, both absolute and relative:

this is the relative one; it assumes no ./ or anything, just the first filename to start:

^((([a-zA-Z@0-9-_.]+/)+([a-zA-Z@0-9-_.]+)):([0-9]+))

this is one that expects a space before a relative path; i had to handle both because one is for the start-of-line case and one is for the middle of a typical stack trace (iirc)

[ ]((([a-zA-Z@0-9-_.]+/)+([a-zA-Z@0-9-_.]+)):([0-9]+))

and this is an absolute filepath:

([/](([a-zA-Z@0-9-_.]+/)+([a-zA-Z@0-9-_.]+)):([0-9]+))

as you can see it starts with a / - and yes, i don't have one for the explicit ./ relative form because nobody writes that, except that i have to with certain go tool commands like run and build and install
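for illustration, the absolute-path pattern compiles and captures as-is in Go; the stack-trace-ish line in main is made up:

```go
package main

import (
	"fmt"
	"regexp"
)

// the absolute-filepath pattern quoted above, compiled unchanged
var absRef = regexp.MustCompile(`([/](([a-zA-Z@0-9-_.]+/)+([a-zA-Z@0-9-_.]+)):([0-9]+))`)

func main() {
	// an invented stack-trace-style line, just to show the captures
	line := "panic recovered at /home/user/src/realy/main.go:42 in handler"
	if m := absRef.FindStringSubmatch(line); m != nil {
		fmt.Println("path:", "/"+m[2]) // /home/user/src/realy/main.go
		fmt.Println("line:", m[5])     // 42
	}
}
```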

You just proved me right in your last paragraph. It works because the standard is not really a standard. The actual standard is simpler than what is described in the RFC, and it should be even simpler. No one really needs braces in URLs, so let's just stop using them.

"lets just stop" who? using them, oh you mean jimmy wales and his "free" propaganda rag wikipedia? good luck with that

also, because your response is so hyperbolic, i'm gonna assume you are not taking this seriously

I'll do my part by breaking his URLs in my software if I remember that.

this is exactly why nostr:npub1ye5ptcxfyyxl5vjvdjar2ua3f0hynkjzpx552mu5snj3qmx5pzjscpknpr made applesauce, but i would also mention that @hodlbod has written correct URL handling that doesn't choke on jimmy's crung URLs (and not only jimmy's, you will find there are other idiots who think that being allowed to use braces by the RFC means they can use braces)

jfc, reading this... is this some kind of humor?

and i was amazed to learn that he also follows this idiot idea of not recognising brackets in a space-bounded URL, or intelligently separating the brackets from a matching URL that starts with, you know, a fucking sentinel: `http`. really

guys, i dunno what to say about your abilities to reason at this point, i'm just looking at this and i'm baffled how this byzantine short circuit happens in your brains

This is intentional. Plain () are not valid in URLs, and I've found that there are far fewer URLs with brackets in them than there are people wrapping URLs in brackets

for example: I really like this site (https://nostrapps.com/)

Wikipedia and some other news sites are the only ones I've seen so far that consistently use brackets in their URLs without escaping them

and nobody uses wikipedia... only the second or third result in almost every websearch ever

what are you trying to say? that users without this knowledge should not be able to read links that others post, assuming the client developer cares about this?

I'm saying there are fewer broken links if I exclude () than if I include them

that's because you aren't using a primary heuristic to scan the text for `http`

it's really really REALLY simple if you just think about it

the regexp is not so much for finding as for validating

the brackets at the end of a url before the next space can easily be ignored as a typical human fuck-up
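something like this, as a rough sketch of that heuristic (extractURLs and the exact trimming rule are just one way to write it):

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

func isSpace(c byte) bool {
	return c == ' ' || c == '\t' || c == '\n' || c == '\r'
}

// extractURLs scans for the "http" sentinel, takes everything up to the next
// whitespace as a candidate, trims any unbalanced trailing ')', and only then
// uses net/url as the validator rather than the finder.
func extractURLs(text string) []string {
	var out []string
	for i := 0; i < len(text); {
		j := strings.Index(text[i:], "http")
		if j < 0 {
			break
		}
		start := i + j
		end := start
		for end < len(text) && !isSpace(text[end]) {
			end++
		}
		cand := text[start:end]
		// a closing paren with no matching open paren inside the candidate belongs
		// to the surrounding prose, e.g. "(https://nostrapps.com/)"
		for strings.HasSuffix(cand, ")") &&
			strings.Count(cand, ")") > strings.Count(cand, "(") {
			cand = strings.TrimSuffix(cand, ")")
		}
		if u, err := url.Parse(cand); err == nil && u.Host != "" &&
			(u.Scheme == "http" || u.Scheme == "https") {
			out = append(out, cand)
		}
		i = end
	}
	return out
}

func main() {
	fmt.Println(extractURLs("I really like this site (https://nostrapps.com/) and " +
		"https://en.wikipedia.org/wiki/Go_(programming_language) too"))
}
```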

idk what to say guys

seriously, the rest of the internet figured this out 10 years ago

just as a test to see if #coracle does this correctly

(https://realy.lol)

nostr:nprofile1qyvhwumn8ghj76rzwghxxmmjv93kcefwwdhkx6tpdshsz9thwden5te0wfjkccte9ejxzmt4wvhxjme0qy88wumn8ghj7mn0wvhxcmmv9uq3qamnwvaz7tm99ehx7uewd3hkctcppemhxue69uhkummn9ekx7mp0qqsf03c2gsmx5ef4c9zmxvlew04gdh7u94afnknp33qvv3c94kvwxgs9w8d3z a little tip with the URL recognition: if the http has ( before it, you can probably ignore the last )

it is a VERY rare situation that someone places a URL inside brackets and remembers to close them without adding a space in between

so rare that you can basically assume that if a URL has )) at the end, the second one is probably invalid
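a tiny sketch of those two rules, assuming the candidate has already been cut at the surrounding whitespace; trimTip is a hypothetical helper, not anyone's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// trimTip applies the two rules from the tip above to a whitespace-bounded candidate:
//   - if the candidate ends in "))", the second ')' is almost certainly not part of it
//   - if the character right before the URL was '(', the trailing ')' belongs to the prose
func trimTip(before byte, cand string) string {
	if strings.HasSuffix(cand, "))") {
		cand = strings.TrimSuffix(cand, ")")
	}
	if before == '(' && strings.HasSuffix(cand, ")") {
		cand = strings.TrimSuffix(cand, ")")
	}
	return cand
}

func main() {
	fmt.Println(trimTip('(', "https://realy.lol)"))
	fmt.Println(trimTip(' ', "https://en.wikipedia.org/wiki/Go_(programming_language))"))
}
```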

Your proof that it is rare that users forget spaces when using brackets is the same as my proof that there are more users that use brackets than URLs that have them

In other words it's all pretty much subjective

no, it's really not, because there is no whitespace inside URLs; URLs are almost always bounded by whitespace

thus, when you find whitespace, it marks the boundaries around the matched URL, and you know that a second... fucking... close brace... is actually not part of the URL! idk how to say it, this couldn't be more obvious

it's a small piece of logic. you have been spoiled into never learning how to actually write a lexical analyser, and this reminds me that the majority of devs in the world couldn't write a lexical analyser if their life depended on it

i learned how to do it, working from a text that was part of first year computer science when i was 15 years old

you guys both just utterly lost my respect tonight

also, you are saying this to someone who wrote a fully effective json parser from scratch for the entire nostr protocol spec, one that is fully compliant and always works, except mostly for a few wrongly escaped json strings from old events