From reading that article, Lexicons just sound like the various note kinds, which are defined in the NIPs.
For instance, a kind 1 post is just a short text note, and most clients can display kind 1 notes fine. They may make their own choices about how they render certain text within a kind 1, though. An image URL contained in a kind 1 is often rendered as the image itself, and the same goes for video URLs. But a client could just as well render the URL as text and make it clickable to open the media in a separate tab.
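To make that rendering choice concrete, here is a rough sketch of how a client might split kind 1 content into text, image, video, and plain link segments. The function names, the extension lists, and the URL regex are all my own assumptions for illustration, not anything the NIPs prescribe, and real clients handle this in their own ways.

```python
import re

# Assumed extension lists; every client picks its own.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif", ".webp")
VIDEO_EXTS = (".mp4", ".webm", ".mov")

# Deliberately naive URL matcher: it will also swallow trailing punctuation.
URL_RE = re.compile(r"https?://\S+")


def classify_url(url):
    """Guess how to render a URL from its file extension (hypothetical helper)."""
    # Ignore any query string or fragment before checking the extension.
    path = url.split("?", 1)[0].split("#", 1)[0].lower()
    if path.endswith(IMAGE_EXTS):
        return "image"
    if path.endswith(VIDEO_EXTS):
        return "video"
    return "link"


def render_segments(content):
    """Split kind 1 content into (segment_type, value) pairs, in order."""
    segments = []
    pos = 0
    for match in URL_RE.finditer(content):
        if match.start() > pos:
            segments.append(("text", content[pos:match.start()]))
        segments.append((classify_url(match.group()), match.group()))
        pos = match.end()
    if pos < len(content):
        segments.append(("text", content[pos:]))
    return segments
```

A client that prefers not to embed media could simply treat the "image" and "video" segments the same as "link" and render them as clickable text.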
Meanwhile, there are also dedicated media kinds, such as kind 20 for images, kind 21 for videos, and kind 22 for short-form portrait videos.
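As a loose illustration, a client could pick how to handle an event by looking at its kind number. The table and function below are hypothetical and only cover the kinds mentioned here; the one thing taken from the protocol is that every event carries an integer "kind" field.

```python
# Hypothetical lookup table covering only the kinds mentioned in this post.
KIND_LABELS = {
    1: "short text note",
    20: "image",
    21: "video",
    22: "short-form portrait video",
}


def describe_event(event):
    """Return a label for an event dict based on its "kind" field."""
    kind = event.get("kind")
    # Kinds the client does not know are reported rather than rendered.
    return KIND_LABELS.get(kind, f"unsupported kind {kind}")
```

A client that doesn't support a given kind has to decide for itself whether to hide the event or show some fallback.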
You can find the definitions of all of these kinds and what they can, should, and must contain in the NIPs repo here:
https://github.com/nostr-protocol/nips
Most of them are pretty well defined. You will find that some clients don't always follow these definitions, though...