https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem
In case someone ever mentions the word "Nyquist" to you, this is what it refers to.
But I personally disagree with the premise that merely doubling the sample rate is enough to render the quality adequate.
That only holds if you also filter the output (low pass) to eliminate the frequencies above the target.
Otherwise those with sharp ears can hear the tinkle of quantization errors in the 4000Hz+ range.
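To make the filtering point concrete, here is a minimal sketch of where an unfiltered tone ends up after sampling. It assumes an ideal sampler and a pure sine tone; the function name `alias_frequency` is made up for illustration and does not come from any library.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after sampling with no filter."""
    # Fold the tone into the band [0, sample_rate / 2].
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

# Tones at or below half the sample rate come through unchanged;
# tones above it fold back down into the audible band.
for tone in (10_000, 22_050, 30_000, 40_000):
    print(tone, "Hz ->", alias_frequency(tone, 44_100), "Hz")
```

Anything above half the sample rate that is not filtered out lands somewhere lower in the spectrum, which is why the filter matters at all.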
The 128 kbps, 16-bit, 44.1 kHz MP3 of the olden days supposedly captures up to 22.05 kHz per Nyquist.
But honestly, this encoding at best renders maybe up to 10-12 kHz and fails utterly at the fine details of cymbals, wind noises, and other sounds with a distinctive high-frequency component. I'd say that 128 kbps MP3 is really what I'd target for single-channel voice samples, and that also means I think that Nyquist was half deaf or forgot that he put band-pass filters in to mask the aliasing noise.
So that means 48 kHz gives you 12 kHz of top-end precision at a 4x ratio, and to really record music properly it should be at 96 kHz, which gives you accurate reproduction up to 24 kHz, beyond human hearing.
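The arithmetic behind these numbers is just a division. The sketch below compares the textbook 2x ratio with the 4x ratio argued for here. The helper `top_frequency_hz` and its `samples_per_cycle` parameter are hypothetical names, and the 4x figure is this post's rule of thumb, not part of the sampling theorem.

```python
def top_frequency_hz(sample_rate_hz, samples_per_cycle=2):
    """Highest frequency covered if each cycle needs `samples_per_cycle` samples."""
    return sample_rate_hz / samples_per_cycle

for rate in (44_100, 48_000, 96_000):
    nyquist = top_frequency_hz(rate)          # textbook limit: 2 samples per cycle
    conservative = top_frequency_hz(rate, 4)  # this post's rule: 4 samples per cycle
    print(f"{rate} Hz: Nyquist {nyquist:.0f} Hz, 4x rule {conservative:.0f} Hz")
```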
The cheap solution to this is to add random jitter (dither) to the samples before outputting them to the D/A converter.
This noise hides at least half of the apparent high-frequency quantization/aliasing artifacts above the middle of the frequency range being reproduced.
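Here is a rough sketch of what adding that noise before quantizing does. It assumes triangular (TPDF) dither of about one quantization step, which is one common choice and not necessarily what any particular converter uses. The `quantize` helper and the test tone are made up for illustration.

```python
import math
import random

def quantize(sample, bits, rng=None):
    """Round a sample in [-1.0, 1.0] to a signed integer of the given bit depth.

    If rng is given, add TPDF dither (about +/- 1 step) before rounding.
    """
    full_scale = 2 ** (bits - 1) - 1
    scaled = sample * full_scale
    if rng is not None:
        # Difference of two uniform values gives triangular noise in (-1, 1) steps.
        scaled += rng.random() - rng.random()
    return max(-full_scale, min(full_scale, round(scaled)))

# Hypothetical test signal: a 1 kHz tone at 48 kHz whose peak is
# only 0.4 of one 16-bit step, so plain rounding erases it entirely.
rng = random.Random(0)
full_scale = 2 ** 15 - 1
signal = [0.4 / full_scale * math.sin(2 * math.pi * 1000 * n / 48000)
          for n in range(4800)]

plain = [quantize(s, 16) for s in signal]
dithered = [quantize(s, 16, rng) for s in signal]

# How strongly each output still follows the original tone.
plain_corr = sum(p * s for p, s in zip(plain, signal))
dithered_corr = sum(d * s for d, s in zip(dithered, signal))
print("plain output is all zeros:", all(p == 0 for p in plain))
print("dithered output follows the tone:", dithered_corr > 0)
```

The trade is a little broadband hiss in exchange for errors that no longer track the signal, which is the masking effect described above.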
But honestly, gimme the 96 kHz per channel sample rate and the 24-bit D/A converter, AND dither it as well so it doesn't irritate your dog.
People love the 'warm' sound of analog recording, especially vinyl, but honestly that is quite simply just the fact that the medium can capture more precision than 48 kHz, and it is naturally dithered by the entropy of ambient vibrations.